A short and quick answer
While waiting for a fixed period of time is bad practice, in the real world it is hard to find a solution that works well in all cases.
To take a screenshot once the page is fully loaded and rendered, one of the combinations that works most often is to set waitUntil for the page.goto() function to domcontentloaded and then wait a bit before taking the screenshot.
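Here is a minimal sketch of that combination. The helper name screenshotWhenReady, the 2000 ms settle delay, and the path argument are illustrative assumptions; page is expected to be a Puppeteer Page that was created elsewhere.

```javascript
// Hypothetical helper: navigate, wait for DOMContentLoaded, pause, then capture.
// `page` is assumed to be a Puppeteer Page (for example from browser.newPage()).
async function screenshotWhenReady(page, url, path, settleMs = 2000) {
  // Resolve goto() as soon as the DOMContentLoaded event has fired.
  await page.goto(url, { waitUntil: 'domcontentloaded' });

  // Give asynchronously loaded widgets some time to render.
  // The delay value is a guess and has to be tuned per site.
  await new Promise((resolve) => setTimeout(resolve, settleMs));

  await page.screenshot({ path });
}
```

The delay is the fragile part here: too short and widgets are missing, too long and every screenshot is slow.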
Setting waitUntil to networkidle2 might work for most scenarios, but there are caveats.
Let’s take a simple screenshot without waiting for any event and see what happens.
As an example, I will take a screenshot of the Yahoo Finance site. It has a lot of widgets that are loaded asynchronously, so it clearly shows why we can’t take the screenshot right away.
Let’s take a screenshot without any waiting options.
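A baseline sketch of such a script could look like this. The function name screenshotWithoutWaiting, the output file name, and the exact Yahoo Finance URL in the usage comment are assumptions made for illustration.

```javascript
// Baseline sketch: open a page and capture it right after goto() resolves.
// `browser` is assumed to be a Puppeteer Browser, e.g. from puppeteer.launch().
async function screenshotWithoutWaiting(browser, url, path) {
  const page = await browser.newPage();
  try {
    // No waitUntil option is passed, so Puppeteer's default applies
    // and goto() resolves on the load event.
    await page.goto(url);
    await page.screenshot({ path });
  } finally {
    // Always release the tab, even if navigation or capture failed.
    await page.close();
  }
}

// Example usage (requires the puppeteer package, inside an async function):
//   const puppeteer = require('puppeteer');
//   const browser = await puppeteer.launch();
//   await screenshotWithoutWaiting(browser, 'https://finance.yahoo.com/', 'yahoo.png');
//   await browser.close();
```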
In the result, you can see that the widgets on the right are not loaded, but we take a screenshot anyway. It is half-baked, and that is not good. Let’s improve it.
The simplest, but not the best, solution is to wait for some amount of time before taking a screenshot.
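A fixed delay can be sketched with a plain timer. Recent Puppeteer versions no longer ship page.waitForTimeout(), so this sketch does not rely on it. The names delay and screenshotAfterDelay and the 5000 ms default are assumptions for illustration.

```javascript
// Promise-based sleep; plain JavaScript, independent of Puppeteer.
function delay(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// `page` is assumed to be a Puppeteer Page created elsewhere.
async function screenshotAfterDelay(page, url, path, delayMs = 5000) {
  await page.goto(url);   // default behavior: resolves on the load event
  await delay(delayMs);   // blind wait, regardless of what the page is doing
  await page.screenshot({ path });
}
```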
The result is OKish: now we see the widgets on the right.
By the way, the ads are probably not loaded because headless browsers might block them, and the video is not loaded because Puppeteer uses Chromium which does not support the rendering of
Why is it not good to use a delay before taking a screenshot? It does not scale:
- loading time varies with your Internet connection;
- different sites have different loading times;
- rendering time also varies with the machine load.
You can safely use this simple approach if you need to take one or two screenshots occasionally for well-known sites and with big enough delays.
Wait until an event occurs
With some exceptions, the best and most robust approach is to specify the waitUntil parameter when calling page.goto(). The page.goto() function accepts an instance of the WaitForOptions type, which includes the optional timeout and waitUntil properties.
Let’s consider each accepted value of the waitUntil property:
- load: the navigation is finished when the load event is fired;
- domcontentloaded: the navigation is finished when the DOMContentLoaded event is fired;
- networkidle0: the navigation is finished when there are no more than 0 network connections for at least 500 ms;
- networkidle2: the navigation is finished when there are no more than 2 network connections for at least 500 ms.
You can also specify an array of expected events. This way, page.goto() resolves only after all of the events have fired.
The timeout option is critical. By default, it is 30000 milliseconds (30 seconds). If the events do not fire within this time, page.goto() throws an error.
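The sketch below shows one way to pass several events together with a custom timeout and to handle the timeout error. The helper name gotoWithTimeout, the 60000 ms value, and the chosen pair of events are assumptions. So is detecting the timeout through the error name, which should be checked against the Puppeteer version in use.

```javascript
// Sketch: navigate with several waitUntil events and a custom timeout.
// Returns true when all events fired in time, false when goto() timed out.
// `page` is assumed to be a Puppeteer Page created elsewhere.
async function gotoWithTimeout(page, url, timeoutMs = 60000) {
  try {
    await page.goto(url, {
      waitUntil: ['load', 'networkidle0'], // resolve only after both
      timeout: timeoutMs,                  // replaces the 30000 ms default
    });
    return true;
  } catch (error) {
    // Assumption: Puppeteer's timeout errors carry the name "TimeoutError".
    if (error.name === 'TimeoutError') {
      return false; // the page may still be usable, just not idle yet
    }
    throw error; // anything else (DNS failure, bad URL, ...) is re-thrown
  }
}
```

Returning false instead of throwing lets the caller decide whether a partially loaded page is still worth a screenshot.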
Difference between networkidle0 and networkidle2
With both options, Puppeteer waits until the network is idle.
Use networkidle0 for sites that load once and then stop sending requests. An example is a SPA without any background activity.
networkidle2 is suitable for applications that keep connections open and send requests after the page is loaded. Imagine watching a real-time trading chart on an exchange site.
There is also a separate method in Puppeteer to wait for the network to become idle: page.waitForNetworkIdle().
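As a rough sketch, waiting for network idle can be decoupled from navigation with page.waitForNetworkIdle(). The helper name and the idleTime and timeout values below are illustrative assumptions.

```javascript
// Sketch: return from goto() early, then wait for the network separately.
// `page` is assumed to be a Puppeteer Page created elsewhere.
async function screenshotAfterNetworkIdle(page, url, path) {
  await page.goto(url, { waitUntil: 'domcontentloaded' });

  // idleTime: how long the network has to stay quiet, in milliseconds.
  // timeout: how long to wait for that quiet period before giving up.
  await page.waitForNetworkIdle({ idleTime: 500, timeout: 30000 });

  await page.screenshot({ path });
}
```

This is handy when something other than page.goto() triggers the requests, for example a script that starts loading data after the initial navigation.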
Wait until DOMContentLoaded
Let’s try to wait until the DOMContentLoaded event occurs and see if it helps to render the Yahoo Finance page correctly.
It does not help a lot, probably because the site sends further requests for the widgets after the DOMContentLoaded event has fired. Let’s try combining events, with networkidle2 among them.
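One possible combination is sketched below. The exact pair of events is an assumption (domcontentloaded together with networkidle2), and the helper name is made up for illustration.

```javascript
// Sketch: wait for DOMContentLoaded AND for the network to calm down
// (no more than 2 open connections) before capturing.
// `page` is assumed to be a Puppeteer Page created elsewhere.
async function screenshotWithCombinedEvents(page, url, path) {
  await page.goto(url, {
    waitUntil: ['domcontentloaded', 'networkidle2'],
  });
  await page.screenshot({ path });
}
```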
And here we go: it works fast and gives the result we need. I chose networkidle2 instead of networkidle0 because the site constantly sends requests, so with networkidle0 page.goto() would throw an error on timeout.
A combination of waitUntil options that includes networkidle2 might work well in many cases, but not in all of them.
You still might have pages with lazy-loading images, so you need to scroll to the bottom of the page and then wait until the images are loaded. And sometimes you get trapped in infinite scrolling. Some pages stop sending network requests, and some do not.
You can write standard code to handle these issues if you are working with a known set of sites. But you might run into new problems, so test your code repeatedly and on many sites.
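A rough sketch of the scrolling approach is shown below. The helper name, the step size, the cap of 50 scroll steps, and the pause between steps are all assumptions and need tuning per site. The cap on steps is what stops the loop on pages with infinite scrolling.

```javascript
// Sketch: scroll down in steps so lazy-loaded images start loading,
// with a cap on the number of steps to survive infinite scrolling.
// `page` is assumed to be a Puppeteer Page that has already navigated.
async function scrollAndWaitForImages(page, options = {}) {
  const { step = 500, maxScrolls = 50, pauseMs = 100, timeout = 30000 } = options;

  // This callback runs inside the browser page, not in Node.js.
  await page.evaluate(async (step, maxScrolls, pauseMs) => {
    for (let i = 0; i < maxScrolls; i++) {
      const before = window.scrollY;
      window.scrollBy(0, step);
      await new Promise((resolve) => setTimeout(resolve, pauseMs));
      if (window.scrollY === before) break; // no movement: bottom reached
    }
  }, step, maxScrolls, pauseMs);

  // Wait until every <img> in the document has finished loading
  // (img.complete is also true for images that failed to load).
  await page.waitForFunction(
    () => Array.from(document.images).every((img) => img.complete),
    { timeout }
  );
}
```

Call it after page.goto() and before page.screenshot({ fullPage: true }) when the whole page has to be captured.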
Wait for the page to be ready after a button click or a form submit
If you need to wait for the page to be ready after a button click or a form submit, use page.waitForNavigation().
The Puppeteer API suggests using Promise.all() to prevent a race condition.
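The pattern can be sketched like this. The helper name and the choice of networkidle2 are assumptions, and the selector in the usage comment is hypothetical.

```javascript
// Sketch: click an element and wait for the navigation it triggers.
// `page` is assumed to be a Puppeteer Page created elsewhere.
async function clickAndWaitForNavigation(page, selector) {
  // Start waiting BEFORE the click is issued. Awaiting the click first
  // could let the navigation finish before anyone is listening for it.
  const [response] = await Promise.all([
    page.waitForNavigation({ waitUntil: 'networkidle2' }),
    page.click(selector),
  ]);
  return response; // main resource response of the new page, or null
}

// Example usage (hypothetical selector):
//   await clickAndWaitForNavigation(page, 'button[type="submit"]');
//   await page.screenshot({ path: 'after-submit.png' });
```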
A third-party API to take screenshots
A shameless promotion! If you don’t want to waste time on handling all the Puppeteer issues and scaling, feel free to use ScreenshotOne.com as a screenshot API.
Afterword and recommendations
I hope I helped you solve your problem today. Have a nice day 👋