How to take bulk screenshots with Puppeteer

Learn how to take screenshots of multiple URLs with Puppeteer, including concurrency management, error handling, retries, and proxy support.

Written by Dmytro Krasun

Taking screenshots of multiple URLs is a common requirement for building website directories, monitoring tools, SEO analyzers, and archiving systems. While taking a single screenshot with Puppeteer is straightforward, processing hundreds or thousands of URLs requires careful consideration of concurrency, error handling, and resource management.

In this guide, I will walk you through building a robust bulk screenshot solution with Puppeteer, from a basic sequential approach to a production-ready implementation with retries and proxy support.

Setting up the project

You can skip this section if you already have a project where you want to add Puppeteer or if it is already installed.

First, create a new Node.js project and install the required dependencies:

mkdir bulk-screenshots
cd bulk-screenshots
npm init -y
npm install puppeteer typescript ts-node @types/node

Create a tsconfig.json:

{
    "compilerOptions": {
        "target": "ES2020",
        "module": "commonjs",
        "strict": true,
        "esModuleInterop": true,
        "outDir": "./dist"
    }
}

Sequential processing

Make it work first. Make it fast. And then make it simple.

The simplest way to take bulk screenshots is to process URLs one by one:

import puppeteer from "puppeteer";

const urls = ["https://example.com", "https://screenshotone.com"];

async function takeScreenshots() {
    const browser = await puppeteer.launch();

    for (const url of urls) {
        const page = await browser.newPage();
        await page.setViewport({ width: 1280, height: 800 });
        await page.goto(url, { waitUntil: "networkidle0" });

        const filename = url.replace(/[^a-z0-9]/gi, "_") + ".png";
        await page.screenshot({ path: filename });
        await page.close();

        console.log(`Screenshot saved: ${filename}`);
    }

    await browser.close();
}

takeScreenshots();

This approach works but has significant limitations:

  1. Slow execution — URLs are processed one at a time, wasting resources while waiting for pages to load.
  2. No error handling — A single failed URL stops the entire process.
  3. No retry mechanism — Temporary network issues cause permanent failures.

Concurrent processing

To speed up bulk screenshots, we can process multiple URLs in parallel using a worker pool pattern:

import puppeteer, { Browser } from "puppeteer";

const urls = ["https://example.com", "https://screenshotone.com"];
const CONCURRENCY = 3;

async function takeScreenshot(browser: Browser, url: string): Promise<void> {
    const page = await browser.newPage();
    try {
        await page.setViewport({ width: 1280, height: 800 });
        await page.goto(url, { waitUntil: "networkidle0", timeout: 30000 });

        const filename = url.replace(/[^a-z0-9]/gi, "_") + ".png";
        await page.screenshot({ path: filename });

        console.log(`Success: ${url}`);
    } finally {
        await page.close();
    }
}

async function processWithConcurrency() {
    const browser = await puppeteer.launch();
    const queue = [...urls];

    async function worker() {
        while (queue.length > 0) {
            const url = queue.shift();
            if (url) {
                try {
                    await takeScreenshot(browser, url);
                } catch (error) {
                    console.error(`Failed: ${url}`, error);
                }
            }
        }
    }

    const workers = Array(CONCURRENCY)
        .fill(null)
        .map(() => worker());

    await Promise.all(workers);
    await browser.close();
}

processWithConcurrency();

This implementation creates a pool of workers that continuously pull URLs from a shared queue. The CONCURRENCY constant controls how many screenshots are taken simultaneously.

Be careful with concurrency limits. Too many concurrent pages can exhaust system memory and cause crashes. Start with 3-5 concurrent workers and adjust based on your system resources.
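
To make tuning easier, you can make the worker count configurable and keep an eye on memory while the job runs. Here is a minimal sketch (the SCREENSHOT_CONCURRENCY variable name is my own example, not a convention):

// A sketch: configurable concurrency and basic memory logging for tuning.
// SCREENSHOT_CONCURRENCY is an example environment variable name.
const CONCURRENCY = Number(process.env.SCREENSHOT_CONCURRENCY ?? 3);

function logMemoryUsage(label: string): void {
    const { rss, heapUsed } = process.memoryUsage();
    const toMb = (bytes: number) => Math.round(bytes / 1024 / 1024);
    console.log(`[${label}] rss: ${toMb(rss)} MB, heap: ${toMb(heapUsed)} MB`);
}

// For example, call logMemoryUsage(url) after each screenshot in the worker loop
// and lower CONCURRENCY if the resident set size keeps growing.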

Error handling and retries

If you plan to deploy it to production, consider handling the following errors and issues:

  • Network timeouts;
  • Pages returning error status codes (403, 429, 503);
  • Memory issues.

Here is an implementation with retry logic:

import puppeteer, { Browser } from "puppeteer";

interface ScreenshotResult {
    url: string;
    success: boolean;
    filepath?: string;
    error?: string;
}

const MAX_RETRIES = 3;
const RETRY_DELAY = 1000;

async function delay(ms: number): Promise<void> {
    return new Promise((resolve) => setTimeout(resolve, ms));
}

function isRetryableError(error: unknown): boolean {
    if (error instanceof Error) {
        const retryableMessages = [
            "net::ERR_CONNECTION_RESET",
            "net::ERR_CONNECTION_REFUSED",
            "net::ERR_TIMED_OUT",
            "Navigation timeout",
        ];
        return retryableMessages.some((msg) => error.message.includes(msg));
    }

    return false;
}

async function takeScreenshotWithRetry(browser: Browser, url: string): Promise<ScreenshotResult> {
    let lastError: unknown;

    for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
        const page = await browser.newPage();
        try {
            await page.setViewport({ width: 1280, height: 800 });
            await page.goto(url, {
                waitUntil: "networkidle0",
                timeout: 30000,
            });

            // The screenshots directory must already exist, for example,
            // created with fs.mkdir(..., { recursive: true }).
            const filename = `screenshots/${url.replace(/[^a-z0-9]/gi, "_")}.png`;
            await page.screenshot({ path: filename });

            return {
                url,
                success: true,
                filepath: filename,
            };
        } catch (error) {
            lastError = error;
            if (!isRetryableError(error) || attempt === MAX_RETRIES) {
                break;
            }

            console.log(`Retry ${attempt + 1}/${MAX_RETRIES} for ${url}`);
            await delay(RETRY_DELAY * (attempt + 1));
        } finally {
            await page.close();
        }
    }

    return {
        url,
        success: false,
        error: lastError instanceof Error ? lastError.message : String(lastError),
    };
}

The retry logic waits longer after each failed attempt (a simple linear backoff: RETRY_DELAY multiplied by the attempt number) and only retries on specific error types that are likely to be transient.
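
The code above only reacts to network-level errors, while the list of issues earlier also mentions status codes like 429 and 503. Since page.goto resolves even for 4xx and 5xx responses, you can inspect the returned response and raise a retryable error yourself. A sketch of that check (the helper name and the error message prefix are mine; the prefix would also need to be added to isRetryableError):

import { Page } from "puppeteer";

// Sketch: treat some HTTP status codes as retryable failures.
const RETRYABLE_STATUS_CODES = [429, 502, 503, 504];

async function gotoAndCheck(page: Page, url: string, timeout: number): Promise<void> {
    const response = await page.goto(url, { waitUntil: "networkidle0", timeout });
    const status = response?.status();
    if (status && RETRYABLE_STATUS_CODES.includes(status)) {
        // "Retryable status" must also be listed in isRetryableError for retries to kick in.
        throw new Error(`Retryable status ${status} for ${url}`);
    }
}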

Using proxies for failed requests

Some websites block requests from datacenter IPs or rate-limit aggressive crawlers. Using proxies can help bypass these restrictions. Check out how to use proxy per page with Puppeteer for detailed proxy configuration.
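
If a single proxy for the whole browser is enough, you do not need any extra packages: Chromium accepts a proxy via a launch flag, and credentials can be provided per page. A minimal sketch with a placeholder proxy address and credentials:

import puppeteer from "puppeteer";

// Sketch: one proxy for the whole browser via Chromium's --proxy-server flag.
async function launchWithProxy() {
    const browser = await puppeteer.launch({
        args: ["--proxy-server=http://proxy1.example.com:8080"],
    });

    const page = await browser.newPage();
    // If the proxy requires authentication, provide the credentials per page.
    await page.authenticate({ username: "user", password: "pass" });

    return { browser, page };
}

The limitation is that every page shares the same proxy, which is not enough when you want to rotate proxies per URL.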

Here is how proxy rotation can plug into the screenshot function; the proxyIndex argument can come from the retry attempt number:

import puppeteer, { Browser } from "puppeteer";

// The ScreenshotResult interface is the same as in the previous example.
const proxies = [
    "http://user:pass@proxy1.example.com:8080",
    "http://user:pass@proxy2.example.com:8080",
];

async function takeScreenshotWithProxy(
    browser: Browser,
    url: string,
    proxyIndex: number
): Promise<ScreenshotResult> {
    // Rotate through the proxy list based on the attempt number.
    const proxy = proxies[proxyIndex % proxies.length];
    const page = await browser.newPage();
    try {
        await page.setViewport({ width: 1280, height: 800 });

        // Note: request interception alone does not route traffic through the proxy.
        // request.continue() sends the request directly; to actually apply the selected
        // proxy per page, use the puppeteer-page-proxy package (see below).
        await page.setRequestInterception(true);
        page.on("request", (request) => {
            request.continue();
        });

        await page.goto(url, {
            waitUntil: "networkidle0",
            timeout: 30000,
        });

        const filename = `screenshots/${url.replace(/[^a-z0-9]/gi, "_")}.png`;
        await page.screenshot({ path: filename });

        return { url, success: true, filepath: filename };
    } catch (error) {
        return {
            url,
            success: false,
            error: error instanceof Error ? error.message : String(error),
        };
    } finally {
        await page.close();
    }
}

For proper per-page proxy support, you will need the puppeteer-page-proxy package as described in the proxy guide.
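
As a rough sketch of how that looks — assuming puppeteer-page-proxy is installed, and with the applyProxy helper name being my own — the request handler is where the selected proxy is actually applied:

import { Page } from "puppeteer";
// puppeteer-page-proxy exposes a default function; depending on your TypeScript setup
// you may need a small type declaration or const useProxy = require("puppeteer-page-proxy").
import useProxy from "puppeteer-page-proxy";

// Sketch: route every request of a page through the given proxy.
async function applyProxy(page: Page, proxy: string): Promise<void> {
    await page.setRequestInterception(true);
    page.on("request", (request) => {
        // Forward the request through the proxy instead of a bare request.continue().
        useProxy(request, proxy).catch((error) => {
            console.error(`Proxy error for ${request.url()}:`, error);
        });
    });
}

In takeScreenshotWithProxy, you would call applyProxy(page, proxy) right after creating the page and drop the plain setRequestInterception and request.continue() calls.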

Complete working example

Here is a production-ready implementation that combines all the concepts:

import puppeteer, { Browser } from "puppeteer";
import * as fs from "node:fs/promises";
import * as path from "node:path";

interface Config {
    concurrency: number;
    maxRetries: number;
    outputDirectory: string;
    viewport: {
        width: number;
        height: number;
    };
    timeout: number;
}

interface ScreenshotResult {
    url: string;
    success: boolean;
    filepath?: string;
    error?: string;
    attempts: number;
}

const config: Config = {
    concurrency: 3,
    maxRetries: 3,
    outputDirectory: "./screenshots",
    viewport: {
        width: 1280,
        height: 800,
    },
    timeout: 30000,
};

const urls = ["https://example.com", "https://screenshotone.com"];

function getFilename(url: string): string {
    const urlObj = new URL(url);
    const hostname = urlObj.hostname.replace(/\./g, "_");
    const pathname = urlObj.pathname.replace(/\//g, "_").replace(/^_/, "");

    return pathname ? `${hostname}${pathname}.png` : `${hostname}.png`;
}

function isRetryableError(error: unknown): boolean {
    if (!(error instanceof Error)) return false;

    const retryable = [
        "net::ERR_CONNECTION",
        "net::ERR_TIMED_OUT",
        "Navigation timeout",
        "Protocol error",
    ];

    return retryable.some((msg) => error.message.includes(msg));
}

async function delay(ms: number): Promise<void> {
    return new Promise((resolve) => setTimeout(resolve, ms));
}

async function takeScreenshot(browser: Browser, url: string): Promise<ScreenshotResult> {
    let lastError: unknown;
    let attempts = 0;

    for (let attempt = 0; attempt <= config.maxRetries; attempt++) {
        attempts = attempt + 1;
        const page = await browser.newPage();
        try {
            await page.setViewport(config.viewport);
            await page.goto(url, {
                waitUntil: "networkidle0",
                timeout: config.timeout,
            });

            const filename = getFilename(url);
            const filepath = path.join(config.outputDirectory, filename);
            await page.screenshot({ path: filepath });

            return {
                url,
                success: true,
                filepath,
                attempts,
            };
        } catch (error) {
            lastError = error;
            if (!isRetryableError(error) || attempt === config.maxRetries) {
                break;
            }

            await delay(1000 * (attempt + 1));
        } finally {
            await page.close();
        }
    }

    return {
        url,
        success: false,
        error: lastError instanceof Error ? lastError.message : String(lastError),
        attempts,
    };
}

async function processUrls(urls: string[]): Promise<ScreenshotResult[]> {
    const browser = await puppeteer.launch({
        args: ["--disable-setuid-sandbox", "--disable-dev-shm-usage", "--no-first-run"],
    });

    await fs.mkdir(config.outputDirectory, { recursive: true });

    const queue = [...urls];
    const results: ScreenshotResult[] = [];

    async function worker(): Promise<void> {
        while (queue.length > 0) {
            const url = queue.shift();
            if (!url) continue;

            console.log(`Processing: ${url}`);
            const result = await takeScreenshot(browser, url);
            results.push(result);

            if (result.success) {
                console.log(`Success: ${url} -> ${result.filepath}`);
            } else {
                console.log(`Failed: ${url} - ${result.error}`);
            }
        }
    }

    const workers = Array(config.concurrency)
        .fill(null)
        .map(() => worker());

    await Promise.all(workers);
    await browser.close();

    return results;
}

async function main() {
    console.log(`Processing ${urls.length} URLs with concurrency ${config.concurrency}`);

    const results = await processUrls(urls);

    const successful = results.filter((r) => r.success).length;
    const failed = results.filter((r) => !r.success).length;

    console.log(`\nCompleted: ${successful} successful, ${failed} failed`);

    if (failed > 0) {
        console.log("\nFailed URLs:");
        results.filter((r) => !r.success).forEach((r) => console.log(`  ${r.url}: ${r.error}`));
    }
}

main().catch(console.error);

Save the code as index.ts and run it with npx ts-node index.ts. It will process all URLs concurrently with retry support.
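
In a real bulk job, the URL list rarely lives in the source code. A small sketch of loading it from a text file with one URL per line (the urls.txt name is just an example):

import * as fs from "node:fs/promises";

// Sketch: read URLs from a plain text file, one per line, skipping comments and blanks.
async function loadUrls(filepath: string): Promise<string[]> {
    const content = await fs.readFile(filepath, "utf-8");

    return content
        .split("\n")
        .map((line) => line.trim())
        .filter((line) => line.length > 0 && !line.startsWith("#"));
}

// Usage inside main(): const urls = await loadUrls("urls.txt");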

If you plan to render full-page screenshots, check out the complete guide on how to take full page screenshots with Puppeteer, Playwright, or Selenium for detailed instructions on handling lazy loading and other corner cases.
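
For simple pages, Puppeteer only needs one extra option to capture the whole scrollable page; replace the screenshot call in any of the examples above with:

// fullPage captures the entire scrollable page instead of just the viewport.
await page.screenshot({ path: filepath, fullPage: true });

Pages with lazy-loaded content usually also need scrolling before the screenshot, which is what the guide above covers.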

ScreenshotOne API as an alternative

Building and maintaining your own bulk screenshot infrastructure requires handling many edge cases: cookie banners, anti-bot protection, proxy management, browser crashes, memory leaks, and more. ScreenshotOne provides a managed API that handles all of this complexity.

You can use the bulk screenshots endpoint to process multiple URLs in a single request, or, for more complex bulk processing with retries and concurrency management, check out our bulk screenshots guide.
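
For a rough idea of what that looks like from code, here is a sketch that renders each URL through the take endpoint (it assumes Node.js 18+ for the built-in fetch; the access key is a placeholder, and the API documentation lists all available parameters):

import * as fs from "node:fs/promises";

// Sketch: render a URL through the ScreenshotOne take endpoint and save the image.
// The access key is a placeholder; see the API documentation for all parameters.
const ACCESS_KEY = "<your access key>";

async function renderWithApi(url: string): Promise<void> {
    const apiUrl = new URL("https://api.screenshotone.com/take");
    apiUrl.searchParams.set("access_key", ACCESS_KEY);
    apiUrl.searchParams.set("url", url);

    const response = await fetch(apiUrl.toString());
    if (!response.ok) {
        throw new Error(`API returned ${response.status} for ${url}`);
    }

    const buffer = Buffer.from(await response.arrayBuffer());
    await fs.writeFile(`${new URL(url).hostname}.png`, buffer);
}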

A few notes on why and when to use ScreenshotOne:

  • No infrastructure to manage: no need to run and maintain headless browsers, handle crashes, or manage server resources.
  • Built-in caching: screenshots are cached if requested, reducing costs for repeated requests.
  • Cookie banner and ad blocking: built-in features to hide cookie banners and block ads without additional configuration, unlike with plain Puppeteer.
  • S3 storage integration: you can upload screenshots directly to any S3-compatible storage.
  • Concurrency management: the API manages concurrency limits and queuing automatically.
  • SDKs for multiple languages and many more integrations.

But:

  • Monthly cost: unlike a self-hosted solution, there is a recurring cost, though it is often cheaper than running your own infrastructure at scale.
  • Third-party dependency: your application depends on the availability of an external service.
  • Less browser control: some advanced browser configurations may not be available. But you can reach out to our support at support@screenshotone.com and we will try to help you as fast as possible.

Summary

Taking bulk screenshots with Puppeteer requires careful consideration of:

  1. Concurrency: process multiple URLs in parallel but respect system limits.
  2. Error handling: implement retry logic with an increasing (backoff) delay between attempts.
  3. Proxies: use proxy rotation for blocked or rate-limited sites.
  4. Resource management: close pages properly and monitor memory usage.

For production workloads, consider using the ScreenshotOne API, which handles all these complexities out of the box, letting you focus on your application logic instead of infrastructure management.
