cmdarek

06-09-2021

Generate PDFs in Elixir

When you search for a PDF generator for Elixir, you will most likely find a solution based on wkhtmltopdf. There is a great blog post on why using this library is considered harmful. Following its advice, we are going to use WeasyPrint, a BSD-licensed Python library that is almost as good as the commercial ones.

There are a few possible ways to integrate with a Python library:

  • Make a system call
  • Expose the Python library as a separate service and communicate with it over HTTP
  • Use Erlang ports
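
For comparison, the first option could look roughly like the sketch below. It is not the approach used in the rest of this post. It assumes the weasyprint command-line tool that is installed together with WeasyPrint is available on the PATH. The module name SystemCallSketch, the function html_to_pdf/2 and the temporary file names are made up for illustration.

```elixir
defmodule SystemCallSketch do
  # Hypothetical helper: renders HTML to PDF by shelling out to an executable
  # that is called as `executable input.html output.pdf`.
  def html_to_pdf(html, executable \\ "weasyprint") do
    id = System.unique_integer([:positive])
    input = Path.join(System.tmp_dir!(), "invoice_#{id}.html")
    output = Path.join(System.tmp_dir!(), "invoice_#{id}.pdf")

    File.write!(input, html)

    try do
      case System.cmd(executable, [input, output], stderr_to_stdout: true) do
        # Exit status 0: read the generated file, returns {:ok, binary}.
        {_log, 0} -> File.read(output)
        # Any other status: return the status and the captured output.
        {log, status} -> {:error, {status, log}}
      end
    after
      # Clean up the temporary files whether or not the command succeeded.
      File.rm(input)
      File.rm(output)
    end
  end
end
```

Every call starts a fresh operating system process, which is simple but pays the interpreter start-up cost each time. A long-lived port avoids that.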

We are going to use ErlPort, a library that helps connect Elixir to other programming languages. Python and Ruby are currently supported. The library uses the Erlang port protocol to simplify the connection between languages and the Erlang external term format to define a common mapping of data types. If you want to read more about ports, see "Outside Elixir: running external programs with ports".
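
To use ErlPort in a Mix project, it has to be listed as a dependency. The fragment below is a sketch only: the version requirement is an assumption, so check hex.pm for the current release.

```elixir
# mix.exs (fragment)
defp deps do
  [
    # The version requirement is an assumption; use the latest release.
    {:erlport, "~> 0.10"}
  ]
end
```

After running mix deps.get, the :python module used in the code below becomes available.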

As a preliminary step, we have to fetch data and populate an HTML template with it. In our example we are going to use stub data. The easiest way to generate HTML is to use EEx, which lets you embed Elixir code inside a string.
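
To get a feel for EEx before looking at the full module, here is a minimal sketch that renders an inline template with EEx.eval_string/2. The template itself and the module name TemplateSketch are hypothetical. The real template in this post lives in priv/templates/invoice.html and is not reproduced here.

```elixir
defmodule TemplateSketch do
  # Hypothetical inline template. <%= ... %> inserts the value of an
  # expression, <% ... %> runs code without inserting anything.
  @template """
  <h1>Invoice for <%= customer.full_name %></h1>
  <p><%= customer.address %></p>
  <ul>
  <%= for item <- order_items do %>
    <li><%= item.name %>: <%= item.quantity %> x <%= item.price %></li>
  <% end %>
  </ul>
  """

  # The keyword list makes `customer` and `order_items` available
  # as variables inside the template.
  def render(customer, order_items) do
    EEx.eval_string(@template, customer: customer, order_items: order_items)
  end
end

html =
  TemplateSketch.render(
    %{address: "2382 Feathers Hooves Drive", full_name: "Corey G Miller"},
    [%{name: "bought item 1", quantity: "22", price: "10000.00"}]
  )

IO.puts(html)
```

In a Mix project, :eex has to be added to extra_applications in mix.exs. For templates that are rendered often, EEx.function_from_file/5 is worth a look, because it compiles the template once at compile time instead of evaluating it on every call.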

                    
defmodule ExPdf do
  alias ExPdf.PythonWorker

  def generate do
    get_data()
    |> populate_html()
    |> generate_pdf()
  end

  defp get_data do
    %{
      customer: %{address: "2382 Feathers Hooves Drive", full_name: "Corey G Miller"},
      order_items: [
        %{name: "bought item 1", quantity: "22", price: "10000.00"},
        %{name: "bought item 2", quantity: "2", price: "400.00"}
      ]
    }
  end

  defp populate_html(%{customer: customer, order_items: order_items}) do
    EEx.eval_file(
      Path.join([:code.priv_dir(:ex_pdf), "templates", "invoice.html"]),
      customer: customer,
      order_items: order_items
    )
  end

  defp generate_pdf(html) do
    PythonWorker.generate(html)
  end
end
                    
                    

Next, we are going to create a single GenServer that works as a proxy between our application and the Python port.

                    
defmodule ExPdf.PythonWorker do
  use GenServer

  alias ExPdf.PythonPort

  # Client API

  def start_link(_) do
    GenServer.start_link(__MODULE__, [], name: __MODULE__)
  end

  # Returns {:ok, pdf}, where pdf is a binary with the PDF contents.
  def generate(html) do
    GenServer.call(__MODULE__, %{html: html})
  end

  # Server callbacks

  @impl true
  def init(_) do
    path = Path.join([:code.priv_dir(:ex_pdf), "python"])
    pid = PythonPort.python_instance(to_charlist(path))

    {:ok, %{python_pid: pid}}
  end

  @impl true
  def terminate(_reason, %{python_pid: pid}) do
    :python.stop(pid)
  end

  @impl true
  def handle_call(%{html: html}, _from, %{python_pid: pid} = state) do
    # The Python function returns the rendered PDF as bytes,
    # which ErlPort hands back to Elixir as a binary.
    pdf = PythonPort.call_python(pid, :pdf, :generate, [html])

    {:reply, {:ok, pdf}, state}
  end
end
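
The worker registers itself under its module name, so it has to be running before generate/1 is called. One way to arrange that is to start it under a supervisor. The sketch below is self-contained and uses a stand-in worker (StubWorker, a hypothetical name) that echoes its input instead of talking to Python. In the real application the child would be ExPdf.PythonWorker. The 30 second timeout is an illustrative value, not something taken from the original code.

```elixir
defmodule StubWorker do
  use GenServer

  def start_link(_), do: GenServer.start_link(__MODULE__, [], name: __MODULE__)

  # Sends the same message shape as the worker above. The third argument is
  # the call timeout in milliseconds (the default is 5000).
  def generate(html), do: GenServer.call(__MODULE__, %{html: html}, 30_000)

  @impl true
  def init(_), do: {:ok, %{}}

  @impl true
  def handle_call(%{html: html}, _from, state) do
    # A real worker would call into Python here.
    {:reply, {:ok, "stub pdf for " <> html}, state}
  end
end

# `use GenServer` defines child_spec/1, so the module name alone is a valid child.
{:ok, _supervisor} = Supervisor.start_link([StubWorker], strategy: :one_for_one)

{:ok, result} = StubWorker.generate("<h1>Invoice</h1>")
IO.puts(result)
```

Because a single GenServer handles one call at a time, concurrent requests are queued behind each other. That keeps access to the Python port serialized, at the cost of throughput.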
                    
                    

For our Python port to work, we need to have Python installed, and we have to tell ErlPort where our Python code is stored.

                    
defmodule ExPdf.PythonPort do
  @doc """
  ## Parameters
    - path: directory to include in python path (charlist)
  """
  def python_instance(path) when is_list(path) do
    # Path to the Python 3 interpreter as a charlist; adjust it for your system.
    python = ~c"/usr/bin/python3"

    {:ok, pid} = :python.start(python: python, python_path: path)

    pid
  end

  @doc """
  Call python function using MFA format
  """
  def call_python(pid, module, function, arguments \\ []) do
    :python.call(pid, module, function, arguments)
  end
end
                    
                    

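The last piece is the Python code responsible for generating a PDF from HTML. For this code to work, we need to install WeasyPrint (for example, with pip install weasyprint). Because the worker calls the :pdf module and the Python path points at the priv/python directory, the code has to live in a file named pdf.py inside that directory. ErlPort passes an Elixir binary to Python 3 as bytes, so the function decodes it before handing it to WeasyPrint.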

                    
from weasyprint import HTML

def generate(html):
    return HTML(string=html.decode('utf-8')).write_pdf()
                    
                    

You can view the full code at: https://github.com/elpikel/ex_pdf