Join me in exploring how to find the ideal wait time or event for taking a page screenshot with Puppeteer.
Puppeteer, wait until the page is ready!
While waiting a fixed period of time is a bad practice, in the real world, it is hard to find a solution that works well in all cases.
To take a screenshot when the page is fully loaded and rendered, one of the combinations that works most often is to set the waitUntil option of the page.goto() function to domcontentloaded and wait a bit before taking the screenshot:
const puppeteer = require("puppeteer");
(async () => {
  const browser = await puppeteer.launch({});
  try {
    const page = await browser.newPage();

    // viewport and device scale factor of my laptop
    await page.setViewport({
      width: 2880,
      height: 1800,
      deviceScaleFactor: 2,
    });

    await page.goto("https://finance.yahoo.com/", {
      timeout: 15 * 1000,
      waitUntil: ["domcontentloaded"],
    });

    // wait for 2 seconds
    await page.waitForTimeout(2000);

    await page.screenshot({ path: "finance.yahoo.com.png" });
  } catch (e) {
    console.error(e);
  } finally {
    await browser.close();
  }
})();
While waitUntil with networkidle0 or networkidle2 might work for most scenarios, there are caveats.
If you are interested, there is a deep dive guide on how to take screenshots with Puppeteer.
Let’s take a simple screenshot without waiting for any event and see what happens.
Install Puppeteer:
npm i puppeteer
As an example, I will take a screenshot of the Yahoo Finance site. It has a lot of widgets that are loaded asynchronously, so it illustrates well why we can’t take the screenshot right away.
Let’s take a screenshot without any waiting options:
const puppeteer = require("puppeteer");
(async () => {
  const browser = await puppeteer.launch({});
  try {
    const page = await browser.newPage();

    // viewport and device scale factor of my laptop
    await page.setViewport({
      width: 2880,
      height: 1800,
      deviceScaleFactor: 2,
    });

    await page.goto("https://finance.yahoo.com/");

    await page.screenshot({ path: "finance.yahoo.com.png" });
  } catch (e) {
    console.error(e);
  } finally {
    await browser.close();
  }
})();
And the result is:
You can see that the widgets on the right are not loaded, but we take a screenshot anyway. It is half-baked, and that is not good. Let’s improve it.
The simplest, but not the best, solution is to wait for some amount of time before taking a screenshot:
const puppeteer = require("puppeteer");
(async () => {
  const browser = await puppeteer.launch({});
  try {
    const page = await browser.newPage();

    // viewport and device scale factor of my laptop
    await page.setViewport({
      width: 2880,
      height: 1800,
      deviceScaleFactor: 2,
    });

    await page.goto("https://finance.yahoo.com/");

    // wait for 15 seconds before taking the screenshot
    await page.waitForTimeout(15000);

    await page.screenshot({ path: "finance.yahoo.com.png" });
  } catch (e) {
    console.error(e);
  } finally {
    await browser.close();
  }
})();
And the result is OKish:
Now, we see the widgets on the right.
By the way, the ads are probably not loaded because headless browsers might block them, and the video is not loaded because Puppeteer uses Chromium, which ships without proprietary codecs such as H.264 and so does not render MP4 videos.
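Why is it not good to use a delay before taking a screenshot? It does not scale: a fixed delay is wasted time on pages that load quickly and can still be too short for pages that load slowly.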
You can safely use this simple approach if you need to take one or two screenshots occasionally for well-known sites and with big enough delays.
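A note on the fixed delay itself: page.waitForTimeout() is deprecated and, to my knowledge, no longer available in recent major Puppeteer versions. A plain timer does the same job regardless of the version. Here is a minimal sketch; the names delay and screenshotAfterDelay are my own helpers, not part of the Puppeteer API:

```javascript
// Resolve after `ms` milliseconds. A plain timer, no Puppeteer API involved.
function delay(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical helper: wait a fixed time, then capture the page.
// `page` is expected to be a Puppeteer Page (anything with screenshot()).
async function screenshotAfterDelay(page, path, ms = 2000) {
  await delay(ms);
  return page.screenshot({ path });
}
```

You would call it after page.goto(), for example await screenshotAfterDelay(page, "finance.yahoo.com.png", 2000).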
With some exceptions, the most reliable approach is to specify the waitUntil parameter when calling page.goto().
The page.goto() function accepts an instance of the WaitForOptions type, which is defined as:
export declare interface WaitForOptions {
  /**
   * Maximum wait time in milliseconds, defaults to 30 seconds, pass `0` to
   * disable the timeout.
   *
   * @remarks
   * The default value can be changed by using the
   * {@link Page.setDefaultTimeout} or {@link Page.setDefaultNavigationTimeout}
   * methods.
   */
  timeout?: number;
  waitUntil?: PuppeteerLifeCycleEvent | PuppeteerLifeCycleEvent[];
}

export declare type PuppeteerLifeCycleEvent =
  | 'load'
  | 'domcontentloaded'
  | 'networkidle0'
  | 'networkidle2';
Let’s look at what each accepted value of the waitUntil property of the WaitForOptions type means:
load: the navigation is finished when the load event is fired;
domcontentloaded: the navigation is finished when the DOMContentLoaded event is fired;
networkidle0: the navigation is finished when there are no more than 0 network connections for at least 500 ms;
networkidle2: the navigation is finished when there are no more than 2 network connections for at least 500 ms.

You might specify an array of expected events. This way page.goto() will resolve after all events are fired.
Specifying the timeout option is critical. By default, it is 30000 milliseconds (30 seconds). If the events are not fired within this time, page.goto() will throw an error.
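Since page.goto() throws on timeout, it helps to decide up front whether a timeout should abort the whole job or just mean "take the screenshot as it is". Below is a rough sketch of the second option. The helper name gotoWithinBudget is hypothetical, and checking error.name for "TimeoutError" is my assumption about how Puppeteer names its timeout errors, so verify it against the version you use:

```javascript
// Hypothetical helper: navigate, but treat a navigation timeout as "good enough".
// Returns true when all waitUntil events fired, false when the timeout hit first.
async function gotoWithinBudget(page, url, options = {}) {
  const { timeout = 15000, waitUntil = "load" } = options;
  try {
    await page.goto(url, { timeout, waitUntil });
    return true;
  } catch (error) {
    // Assumption: Puppeteer's timeout errors carry the name "TimeoutError".
    if (error && error.name === "TimeoutError") {
      return false;
    }
    // DNS failures, invalid URLs and similar problems should still surface.
    throw error;
  }
}
```

When it returns false, the page is usually at least partially rendered, so you can still call page.screenshot() and decide later whether the result is acceptable.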
With the networkidle0 and networkidle2 options, Puppeteer waits for the network to become idle.
Use networkidle0 for sites that load once and then send no further requests. An example is a SPA without any background activities.
networkidle2 is suitable for applications that keep connections open and send requests after the page is loaded. Imagine observing a trading graph in real time on an exchange site.
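To make the "no more than N connections for at least 500 ms" rule more tangible, here is a simplified sketch that tracks in-flight requests by hand. It is only an illustration of the idea, not how Puppeteer implements it: waitForNetworkQuiet is a hypothetical helper, it relies on the page's request, requestfinished and requestfailed events, and it does not see requests that started before it was attached:

```javascript
// Hypothetical helper: resolve once at most `maxInflight` requests have been
// in flight for `idleTime` ms in a row; reject after `timeout` ms.
function waitForNetworkQuiet(page, options = {}) {
  const { maxInflight = 2, idleTime = 500, timeout = 30000 } = options;

  return new Promise((resolve, reject) => {
    let inflight = 0;
    let idleTimer = null;

    const cleanup = () => {
      clearTimeout(idleTimer);
      clearTimeout(deadline);
      page.off("request", onRequest);
      page.off("requestfinished", onDone);
      page.off("requestfailed", onDone);
    };

    const update = () => {
      if (inflight > maxInflight) {
        // Too busy: cancel the quiet-period countdown.
        clearTimeout(idleTimer);
        idleTimer = null;
      } else if (idleTimer === null) {
        // Quiet enough: start the countdown unless it is already running.
        idleTimer = setTimeout(() => {
          cleanup();
          resolve();
        }, idleTime);
      }
    };

    const onRequest = () => {
      inflight += 1;
      update();
    };
    const onDone = () => {
      inflight = Math.max(0, inflight - 1);
      update();
    };

    const deadline = setTimeout(() => {
      cleanup();
      reject(new Error(`Network was not quiet within ${timeout} ms`));
    }, timeout);

    page.on("request", onRequest);
    page.on("requestfinished", onDone);
    page.on("requestfailed", onDone);
    update();
  });
}
```

With maxInflight set to 0 it behaves roughly like networkidle0, and with 2 roughly like networkidle2, which shows why a page with a couple of long-lived connections never satisfies the stricter option.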
There is also a separate method in Puppeteer to wait for the network to become idle:
class Page {
  waitForNetworkIdle(options?: {
    idleTime?: number;
    timeout?: number;
  }): Promise<void>;
}
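One way to use this method is to navigate on domcontentloaded and then give the network a bounded chance to calm down, without failing the whole job if it never does. Here is a minimal sketch; screenshotWhenQuiet is a hypothetical helper and the timeout values are arbitrary picks of mine:

```javascript
// Hypothetical helper: navigate, wait for the network to go idle if it can,
// and take the screenshot either way.
async function screenshotWhenQuiet(page, url, path) {
  await page.goto(url, { timeout: 15000, waitUntil: "domcontentloaded" });
  try {
    await page.waitForNetworkIdle({ idleTime: 500, timeout: 10000 });
  } catch (error) {
    // The page never went idle within the budget; capture it as it is.
    console.warn(`Network did not go idle: ${error.message}`);
  }
  return page.screenshot({ path });
}
```

Keep in mind that, as far as I know, waitForNetworkIdle() waits for zero in-flight requests by default, so on a page that polls constantly it will usually run into the timeout branch.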
Let’s try waiting until the DOMContentLoaded event occurs and see if it helps to render the Yahoo Finance page correctly:
const puppeteer = require("puppeteer");
(async () => {
  const browser = await puppeteer.launch({});
  try {
    const page = await browser.newPage();

    // viewport and device scale factor of my laptop
    await page.setViewport({
      width: 2880,
      height: 1800,
      deviceScaleFactor: 2,
    });

    await page.goto("https://finance.yahoo.com/", {
      timeout: 15 * 1000,
      waitUntil: ["domcontentloaded"],
    });

    await page.screenshot({ path: "finance.yahoo.com.png" });
  } catch (e) {
    console.error(e);
  } finally {
    await browser.close();
  }
})();
It does not help a lot:
Probably because the page requests its widgets only after the DOMContentLoaded event has fired. Let’s try both domcontentloaded and networkidle2:
const puppeteer = require("puppeteer");
(async () => {
  const browser = await puppeteer.launch({});
  try {
    const page = await browser.newPage();

    // viewport and device scale factor of my laptop
    await page.setViewport({
      width: 2880,
      height: 1800,
      deviceScaleFactor: 2,
    });

    await page.goto("https://finance.yahoo.com/", {
      timeout: 15 * 1000,
      waitUntil: ["domcontentloaded", "networkidle2"],
    });

    await page.screenshot({ path: "finance.yahoo.com.png" });
  } catch (e) {
    console.error(e);
  } finally {
    await browser.close();
  }
})();
And here we go:
It works fast and does what we need. I chose networkidle2 instead of networkidle0 because the page constantly sends requests, so with networkidle0 page.goto() would throw a timeout error.
The combination of domcontentloaded and networkidle2 might work well in many cases, but not in all of them.
You still might have pages with lazy-loaded images, so you need to scroll to the bottom of the page and then wait until the images are loaded. And sometimes you get trapped in infinite scrolling. Some pages eventually stop sending network requests, and some never do.
You can write standard code to handle these issues if you are working with a known set of sites. But you might run into new problems, so test your code repeatedly and on many sites.
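For lazy-loaded images, such standard code could look like the sketch below: scroll down step by step, pause so the images can load, and give up after a fixed number of steps so infinite scrolling can’t trap you. The helper name scrollToBottom and its default values are my assumptions, so tune them for your sites:

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: scroll the page in steps until the bottom is reached.
// Returns true at the bottom, false when `maxSteps` ran out first.
async function scrollToBottom(page, options = {}) {
  const { step = 800, pauseMs = 200, maxSteps = 50 } = options;

  for (let i = 0; i < maxSteps; i += 1) {
    // This callback runs inside the browser, not in Node.
    const atBottom = await page.evaluate((distance) => {
      window.scrollBy(0, distance);
      return (
        window.innerHeight + window.scrollY >= document.body.scrollHeight - 1
      );
    }, step);

    if (atBottom) {
      return true;
    }
    // Give lazy-loaded content time to be requested and rendered.
    await sleep(pauseMs);
  }
  // Still not at the bottom: most likely infinite scrolling.
  return false;
}
```

After it returns, you would typically wait for the network to settle once more and then take a full-page screenshot with page.screenshot({ path, fullPage: true }).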
If you need to wait for the page to be ready after a button click or a form submit, use page.waitForNavigation():
const [response] = await Promise.all([
  page.waitForNavigation(),
  page.click("a.my-link"),
]);
The Puppeteer API suggests using Promise.all() to prevent a race condition: the navigation listener is registered before the click can trigger the navigation.
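If you repeat this pattern often, it can be wrapped in a small function that also passes the same timeout and waitUntil options as page.goto(). This is a sketch under my own naming; clickAndWaitForNavigation is not part of Puppeteer:

```javascript
// Hypothetical helper: click an element and wait for the navigation it causes.
async function clickAndWaitForNavigation(page, selector, options = {}) {
  const { timeout = 15000, waitUntil = "domcontentloaded" } = options;
  const [response] = await Promise.all([
    // Registered first, so a fast navigation can't be missed.
    page.waitForNavigation({ timeout, waitUntil }),
    page.click(selector),
  ]);
  return response;
}
```

Note that this fits clicks that trigger a real navigation. If the click only updates the page in place, waitForNavigation() will most likely just time out.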
A shameless promotion! If you don’t want to waste time handling all the Puppeteer issues and scaling, feel free to use ScreenshotOne.com as a screenshot API.
All described options are supported! And you can start for free.
I hope I helped you solve your problem today. Have a nice day 👋