How to convert HTML to PDF in JavaScript

Posted January 2, 2023 by Dmytro Krasun ‐ 4 min read

Nowadays, you have various options to generate PDFs from HTML or any given URL: generating them in the browser, on the server side (Node.js), or even through a modern and friendly PDF generation API.

Only you know your context and requirements, so only you can decide which option is best. But let me guide you through the existing solutions and describe their pros and cons.

Using html2canvas and jsPDF

You can use a combination of the html2canvas and jsPDF libraries to generate a PDF from HTML right in the browser. Let’s do it first and then quickly discuss the pros and cons of generating PDFs in the browser.

Let’s install the necessary libraries first:

npm install jspdf dompurify html2canvas --save

And then generate a PDF document from simple HTML:

// src/index.js
import { jsPDF } from "jspdf";

function generateAndDownload() {
    const doc = new jsPDF();
    doc.html("<h1>Hello, world</h1><h2>Peace and love to everybody.</h2>", {
        callback: function (doc) {
            doc.save();
        }
    });
}

window.addEventListener('DOMContentLoaded', () => {
    document.getElementById('generate').addEventListener('click', () => {
        generateAndDownload();
    });
});

To make it work, use the following webpack.config.js:

const path = require('path');

module.exports = {
    entry: './src/index.js',
    output: {
        path: path.resolve(__dirname, 'dist'),
        filename: 'bundle.js',
    }
};

And the UI:

<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8" />
  </head>
  <body>    
    <button id="generate">Generate and download a document</button>
    <script src="dist/bundle.js"></script>
  </body>
</html>

And the result is a PDF document.
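If you want to capture an element that is already on the page instead of an HTML string, you can call html2canvas yourself and place the resulting image into the document. Below is a minimal sketch of that idea, not a definitive implementation. The names fitToPageWidth and elementToPdf are hypothetical, and the two libraries are passed in as arguments (an assumption of this sketch) so that it works no matter how you load them. It puts everything on a single A4 page, so content taller than the page is cut off.

```javascript
// Scale an image to the full page width while keeping its aspect ratio.
// Both helper names in this sketch are hypothetical.
function fitToPageWidth(imageWidth, imageHeight, pageWidth) {
    return {
        width: pageWidth,
        height: (imageHeight * pageWidth) / imageWidth,
    };
}

// Render a DOM element to a canvas and put it on a single A4 page.
// html2canvas and jsPDF are injected by the caller.
async function elementToPdf(element, html2canvas, jsPDF) {
    const canvas = await html2canvas(element);
    const doc = new jsPDF({ unit: 'mm', format: 'a4' });

    const pageWidth = doc.internal.pageSize.getWidth();
    const { width, height } = fitToPageWidth(canvas.width, canvas.height, pageWidth);

    // The element ends up in the PDF as a PNG image, not as selectable text.
    doc.addImage(canvas.toDataURL('image/png'), 'PNG', 0, 0, width, height);
    return doc;
}
```

In the webpack setup from above, you would import html2canvas from "html2canvas" and jsPDF from "jspdf", call elementToPdf(document.body, html2canvas, jsPDF), and then call save() on the returned document.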

Pros and cons:

  1. Rendering PDF in the browser saves computation resources for your servers. But it uses your users' computational resources, and they might have unpleasant experiences after using your application.

  2. Each browser version is different, and the html2canvas library might produce various rendering artifacts. You don’t have control over what your users see, so keep this in mind when rendering HTML on the client.

  3. It is not easy to render HTML from a URL. The browser’s same-origin policy prevents you from fetching an arbitrary URL unless the target site allows it via CORS. So you need to prepare your HTML in advance or use some proxy.

But there is a solution to avoid the pitfalls of rendering PDFs on the client side.

Using Puppeteer

To ensure that PDFs are rendered as you expect for every user, you can use Puppeteer on the server side. Puppeteer is a browser automation library written for headless browsers that support Chrome DevTools Protocol.

Install Puppeteer:

npm i puppeteer

And then:

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({ headless: true });
    try {
        const page = await browser.newPage();

        await page.setViewport({ width: 1280, height: 1024 });

        await page.goto('https://example.com/', { waitUntil: ['load', 'domcontentloaded', 'networkidle0'] });
        
        // Save the rendered page as an A4 PDF file.
        await page.pdf({ path: 'example.pdf', format: 'A4' });
    } catch (e) {
        console.error(e);
    } finally {
        await browser.close();
    }
})();

And the result is a PDF document.
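If you have an HTML string rather than a URL, Puppeteer can load it directly with page.setContent. Here is a minimal sketch under a few assumptions: the function name renderHtmlToPdf is hypothetical, and its third parameter exists only so that a stub can be injected in tests. By default, it loads the real library.

```javascript
// Render an HTML string to a PDF file with Puppeteer.
// The third parameter is an assumption of this sketch: it lets you inject
// a stub in tests and falls back to the real library otherwise.
async function renderHtmlToPdf(html, outputPath, puppeteer = require('puppeteer')) {
    const browser = await puppeteer.launch({ headless: true });
    try {
        const page = await browser.newPage();

        // Load the markup directly and wait until the network is idle,
        // so images and stylesheets referenced by the HTML get a chance to load.
        await page.setContent(html, { waitUntil: 'networkidle0' });

        return await page.pdf({
            path: outputPath,
            format: 'A4',
            printBackground: true, // keep CSS backgrounds in the PDF
            margin: { top: '1cm', right: '1cm', bottom: '1cm', left: '1cm' },
        });
    } finally {
        // Always close the browser, even if rendering fails.
        await browser.close();
    }
}
```

You would call it like renderHtmlToPdf('<h1>Hello, world</h1>', 'hello.pdf'). The page size and margins are only example values, so adjust them to your needs.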

Pros and cons of using Puppeteer for rendering PDFs:

  1. It is worth repeating: the massive benefit of using Puppeteer is having fine-grained control over the rendering results, compared to rendering PDFs only in the browser.

  2. Managing headless browsers is a huge pain. The browsers might leak memory and restart suddenly. You need to keep them updated to the latest versions. And this is only a small list of the problems you will encounter.

  3. Puppeteer is imperfect, and the rendered PDFs can have various artifacts that must be addressed.

  4. It is a computation-heavy task for servers. And scaling it for running multiple browsers can be a problem in itself.
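One simple way to keep the load on a single server under control is to cap how many rendering jobs run at the same time. The sketch below shows the idea with a small promise-based limiter. The name createLimiter is hypothetical, and this is only an illustration for one process, not a complete scaling solution.

```javascript
// Create a function that runs async tasks, but never more than
// maxConcurrent at the same time. Extra tasks wait in a queue.
function createLimiter(maxConcurrent) {
    let active = 0;
    const queue = [];

    const next = () => {
        if (active >= maxConcurrent || queue.length === 0) {
            return;
        }
        active++;
        const { task, resolve, reject } = queue.shift();
        Promise.resolve()
            .then(task)
            .then(resolve, reject)
            .finally(() => {
                // Free the slot and start the next queued task, if any.
                active--;
                next();
            });
    };

    return (task) =>
        new Promise((resolve, reject) => {
            queue.push({ task, resolve, reject });
            next();
        });
}
```

For example, with const limit = createLimiter(2), you could wrap each rendering call as limit(() => renderPage(url)), where renderPage stands for your own Puppeteer rendering function. No more than two renders would then run in parallel.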

But there is a solution that addresses the Puppeteer issues mentioned above.

Using modern and scalable URL or HTML to PDF API

If you don’t want to deal with all the burdens of managing headless browsers, you can use ScreenshotOne API to render PDFs from HTML or any URL. It is an HTML to PDF API that is free for up to 150 requests. The PDF generation API from ScreenshotOne is easy to use. It is scalable, covers a variety of use cases, and solves all the issues related to rendering PDFs in headless browsers.

Let’s take a look at how easy it is to render a PDF with a straightforward call:

https://api.screenshotone.com/take?url=https://example.com&format=pdf&access_key=<your access key>

And the result is a PDF document.
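To call the API from Node.js, you can build the same URL with URLSearchParams, which takes care of escaping, and save the response to a file. This is a minimal sketch that uses only the url, format and access_key parameters from the call above. The function names buildPdfUrl and downloadPdf are hypothetical, and the sketch assumes Node.js 18 or later, where fetch is available globally.

```javascript
const fs = require('node:fs/promises');

// Build the request URL. URLSearchParams escapes the values for us.
function buildPdfUrl(accessKey, targetUrl) {
    const params = new URLSearchParams({
        url: targetUrl,
        format: 'pdf',
        access_key: accessKey,
    });
    return `https://api.screenshotone.com/take?${params}`;
}

// Request the PDF and write it to disk.
// Assumes the global fetch of Node.js 18+.
async function downloadPdf(accessKey, targetUrl, outputPath) {
    const response = await fetch(buildPdfUrl(accessKey, targetUrl));
    if (!response.ok) {
        throw new Error(`Request failed with status ${response.status}`);
    }
    await fs.writeFile(outputPath, Buffer.from(await response.arrayBuffer()));
}
```

You would call it like downloadPdf('<your access key>', 'https://example.com', 'example.pdf').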

Summary

If you do not want to make your backend more complex, render PDFs using html2canvas and jsPDF.

If you need scaling and want to ensure that PDFs are rendered the same way for any of your users, go with the Puppeteer library.

But if you don’t want to manage headless browsers and deal with PDF rendering issues, feel free to sign up for and use the ScreenshotOne HTML or URL to PDF API.
