PDF from HTML

In Clojure and Common Lisp, using Python

I've used weasyprint[1] in Python for a while. I love that library because it lets me typeset my PDF using simple HTML and CSS. Combining this with Django gives me a super powerful way to generate the PDFs I need as part of my application. The library is written in Python and doesn't seem to have any bindings to other languages. But that has never stopped any of us now, has it?

I recently started writing all my side projects in Lisp: some in Common Lisp and many others in Clojure. And I hit a requirement where I had to generate a PDF as part of the app. There are excellent PDF libraries in each of these environments: clj-pdf[2] and cl-pdf[3], amongst others. But with these, I have to write quite a bit of code to lay out the content the way I want. Given my laziness, and the fact that I don't have much motivation to hand-craft a PDF layout in a side project, I wanted a quick way out. I wanted to use weasyprint for the job, but I was not going to build a micro-service just for this. The following sections show how I used the library from inside Clojure and Common Lisp successfully :-)

How I use Weasyprint in Python

My code for Weasyprint is pretty much straight from their documentation. I prepare my HTML using a templating framework (usually Django) and then call weasyprint to generate the PDF as bytes.

from weasyprint import HTML

def generate_pdf(html_string):
  return HTML(string=html_string).write_pdf()
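
To make the "template first, then PDF" flow concrete, here is a minimal sketch. It assumes the standard library's string.Template as a stand-in for Django templates, and the names INVOICE_TEMPLATE, render_html and write_pdf_file are hypothetical, made up for this illustration. Only HTML(string=...).write_pdf() comes from the snippet above.

```python
from string import Template

# Hypothetical template; in a Django app this would be a regular Django template.
INVOICE_TEMPLATE = Template(
    "<html><body><h1>Invoice $number</h1><p>Total: $total</p></body></html>"
)


def render_html(number, total):
    """Fill in the template; stands in for Django's template rendering."""
    return INVOICE_TEMPLATE.substitute(number=number, total=total)


def write_pdf_file(html_string, path):
    """Render html_string with weasyprint and save the resulting bytes to path."""
    # Imported lazily so the templating part works without weasyprint installed.
    from weasyprint import HTML

    pdf_bytes = HTML(string=html_string).write_pdf()
    with open(path, "wb") as f:  # binary mode, since write_pdf() returns bytes
        f.write(pdf_bytes)
    return len(pdf_bytes)
```

Calling write_pdf_file(render_html(42, "10.00"), "invoice.pdf") should leave a one-page PDF on disk, provided weasyprint is installed for the interpreter that runs it.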

Using Weasyprint in Clojure

There is an excellent library in Clojureland that bridges Python and Clojure. Their motivation is perhaps to leverage the wealth of data science/machine-learning libraries in Python while managing the data itself inside Clojure. A big shoutout to the excellent folks who built libpython-clj[4] :-)

I came across this when I was trying to build machine learning code using Clojure. While I didn't end up needing it for my machine learning explorations, I knew exactly where I did need it. So I spent some time trying to understand how the library works. After a little effort, I managed to set it up and get weasyprint working inside my Clojure app. Here is the snippet I used.

(ns html2pdf
  (:require [clojure.java.io :as io]
            [libpython-clj2.python :as py]
            [libpython-clj2.require :refer [require-python]]))

;; Makes the Python module available under the wp alias.
(require-python '[weasyprint :as wp])

(defn html2pdf
  "Renders html-string to a PDF in Python and copies the result to the JVM."
  [html-string]
  (py/->jvm (py/py.. (wp/HTML :string html-string)
                     (write_pdf))))

(defn generate-test-pdf [html-string]
  (with-open [w (io/output-stream "test.pdf" :append false)]
    (let [pdf-bytes (byte-array (html2pdf html-string))]
      (.write w pdf-bytes))))

The only downside I had was this: I am used to running Python through poetry[5], and I had trouble activating the poetry environment inside Emacs so that the correct Python interpreter would be picked up. There are some solutions discussed for virtual environments; I just didn't spend enough time trying them out. In my case, I was perfectly happy to install weasyprint into my system Python to keep things simple.
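
When it is unclear which Python the bridge ended up with, a small diagnostic can help. The sketch below uses only the standard library; python_env_report is a hypothetical helper name, and it only tells you about the interpreter that actually runs it, so it is most useful when executed by the same Python the bridge loads.

```python
import importlib.util
import sys


def python_env_report(module_name="weasyprint"):
    """Describe the running interpreter and whether module_name is importable."""
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # In a virtual environment, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "module_found": spec is not None,
        "module_location": spec.origin if spec is not None else None,
    }
```

If module_found comes back False, or in_virtualenv is not what you expected, the bridge is most likely looking at a different Python than the one poetry manages.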

Using Weasyprint in Common Lisp

Once I got the Clojure <-> Python bridge to work, I was convinced the good folks in Common Lisp land would have built something similar. And I wasn't wrong - there were four such libraries according to Awesome CL. After reading about all four options, I picked py4cl[6], which seemed best suited to what I was trying to do. It works pretty much the same way as the Clojure bridge too. Here is a snippet of how I got things working in Common Lisp.

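;; Make HTML available in the Python process that py4cl starts.
(py4cl:python-exec "from weasyprint import HTML")

(defun generate-pdf-from-html (html-string)
  ;; python-call turns Lisp keywords into Python keyword arguments.
  (let ((html (py4cl:python-call "HTML" :string html-string)))
    (py4cl:remote-objects* (py4cl:chain html (write_pdf)))))

(defun pdf-from-html (html-string)
  ;; write-byte needs a binary stream, hence the explicit element type.
  (with-open-file (pdf-file "sample.pdf" :direction :output
                                         :element-type '(unsigned-byte 8)
                                         :if-exists :supersede)
    (loop for byte in (generate-pdf-from-html html-string)
          do (write-byte byte pdf-file))))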

I am sure the code could be improved. I just wrote this as a toy example. I'll probably improve it when I have a Common Lisp app that requires this feature.
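
Since all three versions end up writing raw bytes to a file, a quick sanity check on the output can catch a broken conversion between the two runtimes. This is a rough sketch with hypothetical helper names: it only checks for the "%PDF-" header and the "%%EOF" trailer marker that PDF files carry, and does not validate the document itself.

```python
def looks_like_pdf(data):
    """Return True if data starts with the PDF header and ends near an EOF marker."""
    return data.startswith(b"%PDF-") and b"%%EOF" in data[-1024:]


def check_pdf_file(path):
    """Read path in binary mode and apply the same check."""
    with open(path, "rb") as f:
        return looks_like_pdf(f.read())
```

Running check_pdf_file on test.pdf or sample.pdf from the snippets above should return True if the bytes made it across the bridge intact.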

End Notes

For most of my side projects, I deploy as a Docker image anyway. So cluttering my system Python with extra libraries is not a big enough problem to solve immediately. Perhaps if I had to combine Python and Clojure in the same application, this could become a headache. Or if I had to deploy without Docker, this could be an issue. That said, the bridge libraries are a fantastic solution to my specific problem. And I hope you found this useful as well.