PDF from HTML
In Clojure and Common Lisp, using Python
I have been using weasyprint in Python for a while. I love that library because it lets me typeset my PDFs using simple HTML and CSS. Combining it with Django gives me a powerful way to generate the PDFs I need as part of my application. The library is written in Python and doesn't seem to have bindings for other languages. But that has never stopped any of us, has it?
I recently started writing all my side projects in Lisp: some in Common Lisp and many others in Clojure. I then hit a requirement to generate a PDF as part of an app. There are excellent PDF libraries in each of these environments: clj-pdf and cl-pdf, amongst others. But with these, I have to write quite a bit of code to lay out the content the way I want. Given my laziness and my lack of motivation to hand-craft a PDF layout in a side project, I wanted a quick way out. I wanted to use weasyprint for the job, without building a microservice around it. The following sections show how I used this library from Clojure and Common Lisp successfully :-)
How I use Weasyprint in Python
My code for Weasyprint is pretty much straight from their documentation. I prepare my HTML using a templating framework (usually Django) and then call weasyprint functions to generate the PDF as bytes.
    from weasyprint import HTML


    def generate_pdf(html_string):
        # Render the HTML string and return the PDF as bytes.
        return HTML(string=html_string).write_pdf()
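To make the "template, then bytes, then file" flow concrete, here is a minimal sketch of the steps around that function. It is not my production code: `string.Template` stands in for Django templates, and the names `INVOICE_TEMPLATE`, `render_html`, `weasyprint_renderer`, and `save_pdf` are hypothetical. The renderer is passed in as a parameter so the file-writing part can be tried without weasyprint installed.

```python
from pathlib import Path
from string import Template

# Assumption: a stdlib template stands in for a real framework such as Django.
INVOICE_TEMPLATE = Template(
    "<html><body><h1>Invoice $number</h1><p>Total: $total</p></body></html>"
)


def render_html(number, total):
    """Fill the template and return the HTML as a string."""
    return INVOICE_TEMPLATE.substitute(number=number, total=total)


def weasyprint_renderer(html_string):
    """Render HTML to PDF bytes. weasyprint is imported only when called."""
    from weasyprint import HTML

    return HTML(string=html_string).write_pdf()


def save_pdf(html_string, path, renderer=weasyprint_renderer):
    """Render html_string with the given renderer and write the bytes to path.

    Returns the number of bytes written.
    """
    pdf_bytes = renderer(html_string)
    Path(path).write_bytes(pdf_bytes)
    return len(pdf_bytes)
```

With weasyprint installed, `save_pdf(render_html("42", "10.00"), "invoice.pdf")` should leave a PDF on disk. Keeping the renderer swappable is just a convenience for testing; calling weasyprint directly works equally well.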
Using Weasyprint in Clojure
There is an excellent library in Clojureland that bridges Python and Clojure. Its motivation is perhaps to leverage the wealth of data science and machine learning libraries in Python while managing the data itself inside Clojure. A big shoutout to the excellent folks who built libpython-clj. I came across it when I was trying to build machine learning code using Clojure. While I didn't end up needing it for my machine learning explorations, I knew exactly where I could use it. So I spent some time trying to understand how the library works. After a little effort, I managed to set it up and get weasyprint to work inside my Clojure app. Here is the snippet I used.
    (ns html2pdf
      (:require [clojure.java.io :as io]
                [libpython-clj2.python :as py]
                [libpython-clj2.require :refer [require-python]]))

    (require-python '[weasyprint :as wp])

    (defn html2pdf
      "Render an HTML string to PDF with weasyprint and copy the result to the JVM."
      [html-string]
      (py/->jvm (py/py.. (wp/HTML :string html-string) (write_pdf))))

    (defn generate-test-pdf
      "Write the PDF generated from html-string to test.pdf."
      [html-string]
      (with-open [w (io/output-stream "test.pdf" :append false)]
        (let [pdf-bytes (byte-array (html2pdf html-string))]
          (.write w pdf-bytes))))
The only downside I had was this: I am used to running Python through poetry, and I had trouble getting the poetry environment set up inside Emacs so that the correct version of Python would run. There are some solutions discussed for virtual environments; I just didn't spend enough time trying them out. In my case, I was perfectly happy installing weasyprint into my system Python to keep things simple.
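When it is unclear which Python is actually being picked up, a small check like the one below can help. This is only a sketch using the standard library; `describe_python` is a hypothetical helper name, and the check has to run inside the interpreter in question to be meaningful.

```python
import importlib.util
import sys


def describe_python(module_name="weasyprint"):
    """Report the running interpreter and whether module_name can be found."""
    # find_spec locates a top-level module without importing it.
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "version": "%d.%d.%d" % sys.version_info[:3],
        # In a virtual environment, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "module_found": spec is not None,
        "module_location": spec.origin if spec else None,
    }
```

Printing `describe_python()` shows the interpreter path and whether weasyprint is visible to it, which is usually enough to tell a system Python apart from a poetry-managed one.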
Using Weasyprint in Common Lisp
Once I got the Clojure <-> Python bridge to work, I was convinced the good folks in Common Lisp land would have done something similar. And I wasn't wrong: there were 4 such libraries according to Awesome CL. After reading about all four options, I picked py4cl, which seemed best suited to what I was trying to do. It worked pretty much the same way as in Clojure too. Here is a snippet of how I got things working in Common Lisp.
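    ;; Note: wp:HTML assumes a WP package wrapping weasyprint has already
    ;; been defined; that setup is not shown here.
    (py4cl:python-exec "from weasyprint import HTML")

    (defun generate-pdf-from-html (html-string)
      (let ((html (py4cl:python-eval (wp:HTML :string html-string))))
        (py4cl:remote-objects*
          (py4cl:chain html (write_pdf)))))

    (defun pdf-from-html (html-string)
      ;; write-byte needs a binary stream, hence the explicit element type.
      (with-open-file (pdf-file "sample.pdf"
                                :direction :output
                                :element-type '(unsigned-byte 8)
                                :if-exists :supersede)
        (loop for byte in (generate-pdf-from-html html-string)
              do (write-byte byte pdf-file))))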
I am sure the code could be improved. I just wrote this as a toy example, and I'll probably improve it when I have a Common Lisp app that requires this feature.
I deploy most of my side projects as Docker images anyway, so cluttering my system Python libraries is not a big enough problem to solve immediately. It could become a headache if I had to combine Python and Clojure in the same application, or if I had to deploy without Docker. That said, the bridge libraries are a fantastic solution to my specific problem. And I hope you found this useful as well.
- weasyprint.org
- github.com/clj-pdf/clj-pdf
- github.com/mbattyani/cl-pdf
- github.com/clj-python/libpython-clj
- python-poetry.org