How to hide cookie banners when taking a screenshot with Puppeteer

Published on Dmytro Krasun 11 min read
When you take a screenshot of a page, you usually want it clean, without cookie banners or cookie consent forms. In this article, I will show how to do that with Puppeteer.

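There are several tactics for blocking cookie banners, and this article walks through them: custom logic for every site, generic logic that works across many sites, blocking by rule lists, automatic consent libraries, and browser extensions.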

You can skip dealing with cookie popups when using Puppeteer and instead try a straightforward screenshot API that can already do it for you with the block cookie banners parameter.

A custom logic for every site

If you only take screenshots of a small group of sites, you are lucky: a simple per-site trick is enough to disable their cookie popups and banners.

Let’s do it for StackOverflow as an example. Let’s first take a screenshot of the site and see if there is a cookie banner:

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({ headless: true });
    try {
        const page = await browser.newPage();
        await page.setViewport({ width: 1280, height: 1024 });
        await page.goto('https://stackoverflow.com/', { waitUntil: ['load', 'domcontentloaded'] });
        await page.screenshot({ type: 'png', path: 'screenshot.png' });
    } catch (e) {
        console.log(e);
    } finally {
        await browser.close();
    }
})();

In the bottom left corner, you can see a cookie consent banner:

The StackOverflow site with a cookie consent banner

It is fairly easy to block. We can either apply display: none !important to the .js-consent-banner class or click the “Accept all cookies” button. Use Chrome DevTools to find out which classes you need to hide or which buttons you need to click:

The StackOverflow site with opened Chrome DevTools

I would accept all cookies to make sure that the site is functioning correctly:

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({ headless: true });
    try {
        const page = await browser.newPage();
        await page.setViewport({ width: 1280, height: 1024 });
        await page.goto('https://stackoverflow.com/', { waitUntil: ['load', 'domcontentloaded'] });

        // to hide the banner:
        // await page.waitForSelector('.js-consent-banner', { visible: true });
        // await page.addStyleTag({
        //     content: '.js-consent-banner { display: none !important; }',
        // });

        // or click "Accept all cookies":
        await page.waitForSelector('.js-accept-cookies', { visible: true });
        await page.click('.js-accept-cookies');

        await page.screenshot({ type: 'png', path: 'screenshot.png' });
    } catch (e) {
        console.log(e);
    } finally {
        await browser.close();
    }
})();

In both cases, the cookie banner is gone:

The StackOverflow site without the cookie banner

I recommend always accepting all cookies because some sites can even block you if you don’t accept them.
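If you only handle a handful of sites, one way to keep the per-site tricks manageable is a small lookup table keyed by hostname. The sketch below is only an illustration of that idea, not part of the script above: `SITE_RULES`, `ruleForUrl` and `hideCss` are hypothetical names, and the only real selectors in it are the StackOverflow ones from the example.

```javascript
// Hypothetical per-site rules; only the StackOverflow selectors come from this article.
const SITE_RULES = new Map([
    ['stackoverflow.com', {
        click: '.js-accept-cookies',  // the "Accept all cookies" button
        hide: ['.js-consent-banner'], // alternative: hide the banner with CSS
    }],
]);

// Returns the rule for a URL, or null when the site has no custom logic yet.
function ruleForUrl(url) {
    const hostname = new URL(url).hostname.replace(/^www\./, '');
    return SITE_RULES.get(hostname) ?? null;
}

// Builds the CSS that hides every selector in the list.
function hideCss(selectors) {
    return selectors.map((s) => `${s} { display: none !important; }`).join('\n');
}

// Possible usage with a Puppeteer page (assumes `page` and `url` exist):
// const rule = ruleForUrl(url);
// if (rule) {
//     await page.waitForSelector(rule.click, { visible: true });
//     await page.click(rule.click);
//     // or: await page.addStyleTag({ content: hideCss(rule.hide) });
// }
```

Sites without an entry return null, so the script can take the screenshot as is or fall through to a more generic approach.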

But what if you need to take screenshots of more than one site?

A custom, but generic logic for every site

You can write fairly generic code that works in many cases, since almost every cookie consent has a button like “Accept all cookies”, “Accept” or “Allow”. The downsides are that you might accidentally click the wrong button, and that the button text might not be in English, so you need to adapt the code. Every case is different, and you will always find a corner case.
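The text-matching idea can be sketched as a small pure function, which makes it easy to check in isolation before running it inside a page. `ACCEPT_LABELS` and `isAcceptLabel` are hypothetical names, and the labels are just the English ones used in this article's scripts.

```javascript
// English button labels used in this article; extend the list for your own sites.
const ACCEPT_LABELS = [
    'Accept', 'Accept all', 'Accept all cookies',
    'Allow', 'Allow all', 'Allow all cookies', 'OK',
];

// Whole-string, case-insensitive match. No "g" flag, so the regex keeps no state between calls.
const ACCEPT_PATTERN = new RegExp(`^(${ACCEPT_LABELS.join('|')})$`, 'i');

function isAcceptLabel(text) {
    // Collapse runs of whitespace so "Accept\n   all" still matches.
    const normalized = String(text).replace(/\s+/g, ' ').trim();
    return ACCEPT_PATTERN.test(normalized);
}
```

Because the pattern is anchored on both ends, a button like “Accept the terms” does not match, which reduces (but does not remove) the risk of clicking the wrong element.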

Let’s look at the CookieBot site:

The CookieBot site with a cookie banner

Let’s allow all cookies on the CookieBot site:

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({ headless: true });
    try {
        const page = await browser.newPage();
        await page.setViewport({ width: 1280, height: 1024 });
        await page.goto('https://www.cookiebot.com/en', { waitUntil: ['load', 'domcontentloaded', 'networkidle0'] });

        await page.waitForSelector('#CybotCookiebotDialogBodyLevelButtonLevelOptinAllowAll', { visible: true });
        await page.click('#CybotCookiebotDialogBodyLevelButtonLevelOptinAllowAll');

        await page.screenshot({ type: 'png', path: 'screenshot.png' });
    } catch (e) {
        console.log(e);
    } finally {
        await browser.close();
    }
})();

And the result is:

The CookieBot site without the cookie banner

How do we generalize the code so that it closes the banner for StackOverflow, CookieBot, and new sites?

What most cookie banners have in common is a link or a button with text like “Accept all cookies” or “Allow all” that you can click to close the banner.

So, we can search for all buttons or links with such text and try to click on them if found. Let’s try to close the banner now for both the StackOverflow and the CookieBot site:

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({ headless: true });
    try {
        const page = await browser.newPage();
        await page.setViewport({ width: 1280, height: 1024 });
        await page.goto('https://cookiebot.com', { waitUntil: ['load', 'domcontentloaded', 'networkidle0'] });

        await page.evaluate(_ => {
            const selector = 'a[id*=cookie i], a[class*=cookie i], button[id*=cookie i], button[class*=cookie i]';
            const expectedText = /^(Accept|Accept all cookies|Accept all|Allow|Allow all|Allow all cookies|OK)$/gi;
            const elements = document.querySelectorAll(selector);
            for (const element of elements) {
                if (element.textContent.trim().match(expectedText)) {
                    element.click();
                    return;
                }
            }
        });

        await page.screenshot({ type: 'png', path: 'screenshot.png' });
    } catch (e) {
        console.log(e);
    } finally {
        await browser.close();
    }
})();

The code works well both for StackOverflow and CookieBot. Let’s test on a new site that we haven’t seen before — the Hetzner site:

The Hetzner site with a cookie banner and overlay

It works, but since Hetzner animates closing the cookie banner, we need to add a delay to wait before taking a screenshot:

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({ headless: true });
    try {
        const page = await browser.newPage();
        await page.setViewport({ width: 1280, height: 1024 });
        await page.goto('https://www.hetzner.com/', { waitUntil: ['load', 'domcontentloaded', 'networkidle0'] });

        await page.evaluate(_ => {
            const selector = 'a[id*=cookie i], a[class*=cookie i], button[id*=cookie i], button[class*=cookie i]';
            const expectedText = /^(Accept|Accept all cookies|Accept all|Allow|Allow all|Allow all cookies|Ok)$/gi;
            const elements = document.querySelectorAll(selector);
            for (const element of elements) {
                if (element.textContent.trim().match(expectedText)) {
                    element.click();
                    break;
                }
            }
        });

        // give the closing animation time to finish
        await wait(2000);

        await page.screenshot({ type: 'png', path: 'screenshot.png' });
    } catch (e) {
        console.log(e);
    } finally {
        await browser.close();
    }
})();

async function wait(delay) {
    return new Promise(function (resolve, reject) {
        setTimeout(resolve, delay);
    });
}

And the result is:

The Hetzner site without a cookie banner
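A fixed delay works, but it is either too long or too short for some sites. When you know the banner's selector, an alternative is to wait until the element is hidden or removed. Here is a minimal sketch, assuming a Puppeteer `page` object; `waitUntilGone` is a hypothetical helper name, and it relies on the `hidden` option of `page.waitForSelector`.

```javascript
// Waits until `selector` is hidden or removed from the DOM instead of sleeping for a fixed time.
// Returns true if the banner disappeared in time, false otherwise.
async function waitUntilGone(page, selector, timeoutMs = 5000) {
    try {
        await page.waitForSelector(selector, { hidden: true, timeout: timeoutMs });
        return true;
    } catch (e) {
        // Most likely a timeout: the element was still visible after `timeoutMs`.
        return false;
    }
}

// Possible usage after clicking the accept button, with the StackOverflow class from earlier:
// const gone = await waitUntilGone(page, '.js-consent-banner');
```

Keep in mind that a fade-out which only changes opacity may not count as hidden, so a short extra delay can still be needed for animated banners.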

But now you get the idea: you will need to tweak the script again and again for every new site. Let’s take a screenshot of the London Stock Exchange:

London Stock Exchange

As you see, the banner is not closed. We can try a new approach for closing it by finding all the buttons that contain the text we search for, but without the word “cookie” in their attributes:

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({ headless: true });
    try {
        const page = await browser.newPage();
        await page.setViewport({ width: 1280, height: 1024 });
        await page.goto('https://www.londonstockexchange.com/', { waitUntil: ['load', 'domcontentloaded', 'networkidle0'] });

        await page.evaluate(_ => {
            const expectedText = /^(Accept|Accept all cookies|Accept all|Allow|Allow all|Allow all cookies|Ok)$/gi;
            const clickAccept = (selector) => {
                const elements = document.querySelectorAll(selector);
                for (const element of elements) {
                    if (element.textContent.trim().match(expectedText)) {
                        element.click();
                        return true;
                    }
                }
                return false;
            };

            // first try: only elements that mention "cookie" in their id or class
            if (clickAccept('a[id*=cookie i], a[class*=cookie i], button[id*=cookie i], button[class*=cookie i]')) {
                return;
            }

            // a second try: any link or button
            clickAccept('a, button');
        });

        await wait(2000);

        await page.screenshot({ type: 'png', path: 'screenshot.png' });
    } catch (e) {
        console.log(e);
    } finally {
        await browser.close();
    }
})();

async function wait(delay) {
    return new Promise(function (resolve, reject) {
        setTimeout(resolve, delay);
    });
}

Let’s execute it:

The London Stock Exchange site without the cookie banner

As you see, it works, but the code is now computationally heavier because it checks every link and button on the page, and it might click a button that we don’t expect to be clicked.

Let’s try another site — DigitalOcean, but first, we need to check that it renders a cookie popup:

The DigitalOcean site with a cookie banner

At the bottom, you can see a popup. Now let’s run the latest version of the code and check if it removes the banner:

The DigitalOcean site without the cookie banner

And it works perfectly! But there is a new corner case: languages. Look at Börse Frankfurt:

The Börse Frankfurt site with a cookie banner

Let’s add button text in German:

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({ headless: true });
    try {
        const page = await browser.newPage();
        await page.setViewport({ width: 1280, height: 1024 });
        await page.goto('https://www.boerse-frankfurt.de/', { waitUntil: ['load', 'domcontentloaded'] });

        await page.evaluate(_ => {
            const expectedText = /^(Akzeptieren|Accept|Accept all cookies|Accept all|Allow|Allow all|Allow all cookies|Ok)$/gi;
            const clickAccept = (selector) => {
                const elements = document.querySelectorAll(selector);
                for (const element of elements) {
                    if (element.textContent.trim().match(expectedText)) {
                        element.click();
                        return true;
                    }
                }
                return false;
            };

            // first try: only elements that mention "cookie" in their id or class
            if (clickAccept('a[id*=cookie i], a[class*=cookie i], button[id*=cookie i], button[class*=cookie i]')) {
                return;
            }

            // a second try: any link or button
            clickAccept('a, button');
        });

        await wait(2000);

        await page.screenshot({ type: 'png', path: 'screenshot.png' });
    } catch (e) {
        console.log(e);
    } finally {
        await browser.close();
    }
})();

async function wait(delay) {
    return new Promise(function (resolve, reject) {
        setTimeout(resolve, delay);
    });
}

And voila:

The Börse Frankfurt site without the cookie banner

It works, but it is a constant headache to adapt the code to all corner cases and all possible cookie banners.
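One way to make the language problem a bit less painful is to keep the labels in a per-language table and build the pattern from it, rather than editing the regular expression by hand. This is only a sketch under that assumption: `LABELS_BY_LANGUAGE`, `escapeRegExp` and `buildAcceptPatternSource` are hypothetical names, and the only non-English label is the German one from the Börse Frankfurt example.

```javascript
// Labels grouped by language; add more languages as you run into them.
const LABELS_BY_LANGUAGE = {
    en: ['Accept', 'Accept all', 'Accept all cookies', 'Allow', 'Allow all', 'Allow all cookies', 'OK'],
    de: ['Akzeptieren'],
};

// Escapes characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Builds the source of one anchored pattern that matches any of the labels.
function buildAcceptPatternSource(labelsByLanguage) {
    const labels = Object.values(labelsByLanguage).flat().map(escapeRegExp);
    return `^(${labels.join('|')})$`;
}

// Possible usage: a RegExp object cannot be passed into page.evaluate, so pass its source string.
// const source = buildAcceptPatternSource(LABELS_BY_LANGUAGE);
// await page.evaluate((source) => {
//     const expectedText = new RegExp(source, 'i');
//     // ...same clickAccept logic as above...
// }, source);
```

The table does not remove the need to discover new labels, but adding one becomes a one-line change.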

In ScreenshotOne screenshot API, the problem is already solved, and the code is constantly updated to hide all cookie consents, including all found corner cases.

The EasyList site provides lists of rules that were originally designed for AdBlock, but thanks to the Puppeteer AdBlocker library, these lists can be used from Puppeteer to block not only ads but also cookie banners, GDPR overlay windows, privacy-related notices, social media content, in-page pop-ups and other annoyances.

Let’s install the library:

npm install --save @cliqz/adblocker-puppeteer cross-fetch

The library needs a fetch implementation to download blocking lists, and cross-fetch provides one.

Now let’s apply it and take a screenshot of StackOverflow:

const puppeteer = require('puppeteer');
const { PuppeteerBlocker } = require('@cliqz/adblocker-puppeteer');
const fetch = require('cross-fetch');

(async () => {
    const blocker = await PuppeteerBlocker.fromLists(fetch, [
        'https://secure.fanboy.co.nz/fanboy-cookiemonster.txt'
    ]);

    const browser = await puppeteer.launch({ headless: true });
    try {
        const page = await browser.newPage();
        await blocker.enableBlockingInPage(page);
        await page.setViewport({ width: 1280, height: 1024 });
        await page.goto('https://stackoverflow.com/', { waitUntil: ['load', 'domcontentloaded', 'networkidle0'] });
        await page.screenshot({ type: 'png', path: 'screenshot.png' });
    } catch (e) {
        console.log(e);
    } finally {
        await browser.close();
    }
})();

It works:

The StackOverflow site without the cookie banner

And now let’s try to apply it to the CookieBot site:

The CookieBot site with a cookie banner

As you see, it does not work. Using blocking lists is not as effective as using extensions. But there is a solution: a library for handling CMP (consent management platform) banners.

Popups like the one on CookieBot are widely known as CMPs (consent management platforms), and there are a lot of them.

Luckily, there is a library with Puppeteer support: autoconsent. It supports the major CMPs and can either consent automatically or opt out.

Install the library:

npm i @duckduckgo/autoconsent@1.0.8

And let’s apply it and test with CookieBot:

const puppeteer = require('puppeteer');
const autoconsent = require('@duckduckgo/autoconsent/dist/autoconsent.puppet.js');
const extraRules = require('@duckduckgo/autoconsent/rules/rules.json');

const consentomatic = extraRules.consentomatic;
const rules = [
    ...autoconsent.rules,
    ...Object.keys(consentomatic).map(name => new autoconsent.ConsentOMaticCMP(`com_${name}`, consentomatic[name])),
    ...extraRules.autoconsent.map(spec => autoconsent.createAutoCMP(spec)),
];

(async () => {
    const browser = await puppeteer.launch({ headless: true });
    try {
        const page = await browser.newPage();
        await page.setViewport({ width: 1280, height: 1024 });

        page.once('load', async () => {
            const tab = autoconsent.attachToPage(page, 'https://cookiebot.com/', rules, 10);
            try {
                await tab.checked;
                await tab.doOptIn();
            } catch (e) {
                console.warn(`CMP error`, e);
            }
        });

        await page.goto('https://cookiebot.com/', { waitUntil: ['load', 'domcontentloaded', 'networkidle0'] });
        await page.screenshot({ type: 'png', path: 'screenshot.png' });
    } catch (e) {
        console.log(e);
    } finally {
        await browser.close();
    }
})();

And it works like a charm:

The CookieBot site without the cookie banner
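Since the library can either consent or opt out, the choice can be wrapped in a small helper. The sketch below is not taken from the library's documentation: `handleConsent` is a hypothetical name, `tab` is the object returned by `autoconsent.attachToPage(...)`, and it assumes that the 1.x tab object exposes `doOptOut()` next to the `doOptIn()` used in this article, so check that against the version you install.

```javascript
// `tab` is the object returned by autoconsent.attachToPage(...).
// Assumption: besides doOptIn(), the tab object also exposes doOptOut().
async function handleConsent(tab, mode = 'opt-in') {
    try {
        await tab.checked; // wait until CMP detection has finished
        if (mode === 'opt-out') {
            await tab.doOptOut();
        } else {
            await tab.doOptIn();
        }
        return true;
    } catch (e) {
        console.warn('CMP error', e);
        return false;
    }
}

// Possible usage inside the page.once('load', ...) handler:
// const tab = autoconsent.attachToPage(page, url, rules, 10);
// await handleConsent(tab, 'opt-in');
```

Returning a boolean instead of throwing lets the caller decide whether to fall back to another tactic when no CMP is handled.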

It won’t be a bulletproof solution, but the best option might be to combine blocking by lists and auto-consent:

const puppeteer = require('puppeteer');
const { PuppeteerBlocker } = require('@cliqz/adblocker-puppeteer');
const fetch = require('cross-fetch');
const autoconsent = require('@duckduckgo/autoconsent/dist/autoconsent.puppet.js');
const extraRules = require('@duckduckgo/autoconsent/rules/rules.json');

const consentomatic = extraRules.consentomatic;
const rules = [
    ...autoconsent.rules,
    ...Object.keys(consentomatic).map(name => new autoconsent.ConsentOMaticCMP(`com_${name}`, consentomatic[name])),
    ...extraRules.autoconsent.map(spec => autoconsent.createAutoCMP(spec)),
];

(async () => {
    const blocker = await PuppeteerBlocker.fromLists(fetch, [
        'https://secure.fanboy.co.nz/fanboy-cookiemonster.txt'
    ]);

    const browser = await puppeteer.launch({ headless: true });
    try {
        const page = await browser.newPage();
        await blocker.enableBlockingInPage(page);
        await page.setViewport({ width: 1280, height: 1024 });

        page.once('load', async () => {
            const tab = autoconsent.attachToPage(page, 'https://cookiebot.com/', rules, 10);
            try {
                await tab.checked;
                await tab.doOptIn();
            } catch (e) {
                console.warn(`CMP error`, e);
            }
        });

        await page.goto('https://cookiebot.com/', { waitUntil: ['load', 'domcontentloaded', 'networkidle0'] });
        await page.screenshot({ type: 'png', path: 'screenshot.png' });
    } catch (e) {
        console.log(e);
    } finally {
        await browser.close();
    }
})();

In addition to this solution, you will need to make sure that you support customizations for sites that are not covered by this solution.
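One possible way to organize those customizations is an ordered list of strategies that are tried until one reports success. This is a sketch of the idea rather than a finished design: `dismissCookieBanner` and the strategy names are hypothetical, and each strategy is assumed to be an async function that returns true when it handled the banner.

```javascript
// Runs consent strategies in order and stops at the first one that reports success.
// Each strategy is { name, run }, where run is an async (page, url) => boolean.
async function dismissCookieBanner(page, url, strategies) {
    for (const { name, run } of strategies) {
        try {
            if (await run(page, url)) {
                return name; // the name of the strategy that worked
            }
        } catch (e) {
            // A failing strategy should not prevent the next one from running.
            console.warn(`Strategy "${name}" failed`, e);
        }
    }
    return null; // nothing worked; take the screenshot as is or log the site for review
}

// Possible usage (the run functions are placeholders for the tactics from this article):
// const used = await dismissCookieBanner(page, url, [
//     { name: 'site-specific', run: clickKnownSelector },
//     { name: 'autoconsent', run: runAutoconsent },
//     { name: 'generic-click', run: clickByButtonText },
// ]);
```

Logging which strategy worked, or that none did, gives you a list of sites that still need custom handling.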

Extensions

You can use Chrome Extensions with Puppeteer, and the “I don’t care about cookies” extension is constantly updated and is supported by the author.

I don’t use it because it is licensed under GPLv3, and I recommend consulting a lawyer or getting permission from the author before using it in commercial code.
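For completeness, loading an unpacked extension into Puppeteer usually comes down to two Chromium flags. Here is a minimal sketch, assuming you have an extension unpacked in a local directory; `extensionLaunchOptions` is a hypothetical helper, the path in the usage comment is a placeholder, and flag support can differ between Chromium versions.

```javascript
// Builds Puppeteer launch options that load a single unpacked extension.
// `extensionDir` is expected to be an absolute path to the unpacked extension.
function extensionLaunchOptions(extensionDir) {
    return {
        headless: false, // extensions have historically required a headful Chromium
        args: [
            `--disable-extensions-except=${extensionDir}`,
            `--load-extension=${extensionDir}`,
        ],
    };
}

// Possible usage (the directory is a placeholder):
// const browser = await puppeteer.launch(extensionLaunchOptions('/path/to/unpacked-extension'));
```

Running headful on a server typically needs a virtual display, which is one more reason to weigh this option against the others.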

ScreenshotOne API

In ScreenshotOne (URL or HTML to image or PDF API) there is already a simple option you need to specify to block cookie banners. And nothing more. The API is ready to handle most cookie banners and is constantly updated to support new cases.

Summary

If you have enough time, money, and energy to deal with all the corner cases, go with combining automatic cookie consent libraries and blocking by rule lists.

Otherwise, you can start using ScreenshotOne API for free to see if it suits you well. This way, you delegate all the headaches of dealing with cookie banners to ScreenshotOne.

Have a nice day 👋