How to take a screenshot of a webpage in Haskell

By Anthony Alaribe, 4 min read
Taking a screenshot of a live website is a common need. On a project I worked on recently, we had a legal requirement to capture screenshots of the forms our users filled in, exactly as they looked at the time of filling, for consent documentation purposes.

There are many ways to do this manually. A straightforward way is to open the form and use a keyboard shortcut such as Command-Shift-3 on macOS. But this is not good enough when you need to take these screenshots automatically, based on user actions, or even just as part of batch processes. I wouldn’t want to take 1000 screenshots manually, but I would happily write a script to automate taking them.

Let’s explore some approaches to taking a screenshot of a webpage in Haskell. We will take screenshots of APItoolkit.io, which uses AI to help you manage, monitor, and document your APIs, so you always know what’s going on in your backends and can debug issues as soon as they come up.

Side note: All the code examples are standalone Haskell shell scripts. Copy them into a file on a Unix environment and execute them like any other executable shell script.

This is possible due to the line "#!/usr/bin/env stack" at the top of each file. Note that Haskell Stack is required on your local machine; it can be installed via GHCup.
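To make the mechanics concrete, here is a minimal sketch of such a script, assuming the same nightly resolver used in the rest of this article. The greeting value and the file name mentioned below are hypothetical and only serve to show that the script runs.

```haskell
#!/usr/bin/env stack
-- stack --resolver nightly-2022-10-18 script

-- Hypothetical greeting, only here to prove that the script runs.
greeting :: String
greeting = "Hello from a Haskell shell script"

main :: IO ()
main = putStrLn greeting
```

Saved as Hello.hs, it can be marked executable with chmod +x Hello.hs and then run as ./Hello.hs. The first run is slow because Stack has to fetch the compiler and packages for the resolver.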

Using Haskell Selenium WebDriver

Selenium is a well-known player in the QA and web testing space, and with the WebDriver specification we can control any browser that ships a compatible driver. We will use the webdriver-w3c package from Hackage for this test. Since we’re using a Haskell shell script, there is nothing to install separately: we only need to list the dependencies on the second line, next to the stack resolver:

#!/usr/bin/env stack
-- stack --resolver nightly-2022-10-18 script --package "bytestring webdriver-w3c"
{-# LANGUAGE OverloadedStrings #-}
import Control.Monad.IO.Class (liftIO)
import qualified Data.ByteString as B
import Web.Api.WebDriver

main :: IO ()
main = do
  -- Run the session against a WebDriver server using the default config.
  _ <- execWebDriverT
    defaultWebDriverConfig
    (runIsolated_ defaultChromeCapabilities takePageScreenshot)
  return ()

-- Open the page in a fullscreen window and save a screenshot of it.
takePageScreenshot :: WebDriverT IO ()
takePageScreenshot = do
  _ <- fullscreenWindow
  navigateTo "https://apitoolkit.io"
  img <- takeScreenshot
  liftIO $ B.writeFile "screenshot.png" img

That’s pretty much all the code we need. But if we run this code, we will get an error that looks like this:

./Webdriver.hs
2022-10-21 00:04:53 DEBUG Request Request POST http://localhost:4444/session
{
  "capabilities": {
    "alwaysMatch": {
      "browserName": "chrome"
    }
  },
  "desiredCapabilities": {
    "browserName": "chrome"
  }
}
2022-10-21 00:04:53 ERROR Error Unable to connect to WebDriver server
2022-10-21 00:04:53 ERROR Error No session in progress

This error occurs because the script needs a WebDriver server process to talk to, and none is running yet. Each browser has its own driver: geckodriver for Firefox, chromedriver for Chrome, and so on.

First, download chromedriver from https://chromedriver.chromium.org/. Then start it in a separate process with the following command:

chromedriver --port=4444

Port 4444 is the default port for the WebDriverConfig we used in our Haskell program. Next, we rerun our Haskell shell script. It will print some logs, then spin up a Chrome window, open APItoolkit in it, take a screenshot, and save it to screenshot.png.
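Since the "Unable to connect to WebDriver server" error simply means nothing is listening on the expected port, it can help to check that before starting a session. The following is a minimal sketch, not part of webdriver-w3c: isPortOpen is a hypothetical helper built on the network package, and it assumes the driver runs on localhost port 4444 as in the log shown earlier.

```haskell
#!/usr/bin/env stack
-- stack --resolver nightly-2022-10-18 script --package "network"
import Control.Exception (IOException, bracket, try)
import Network.Socket

-- Hypothetical helper: True when a TCP connection to host:port succeeds.
-- Only the first address returned by the resolver is tried.
isPortOpen :: HostName -> ServiceName -> IO Bool
isPortOpen host port = do
  let hints = defaultHints { addrSocketType = Stream }
      attempt = do
        addr : _ <- getAddrInfo (Just hints) (Just host) (Just port)
        bracket
          (socket (addrFamily addr) (addrSocketType addr) (addrProtocol addr))
          close
          (\sock -> connect sock (addrAddress addr))
  -- Resolution and connection failures both surface as IOException.
  result <- try attempt :: IO (Either IOException ())
  pure (either (const False) (const True) result)

main :: IO ()
main = do
  up <- isPortOpen "localhost" "4444"
  putStrLn $
    if up
      then "A server is listening on port 4444"
      else "Nothing is listening on port 4444; start chromedriver first"
```

A successful connection only shows that something is listening on the port, not that it is a WebDriver server, so treat this as a quick sanity check rather than a health check.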

The APIToolkit screenshot with Selenium WebDriver.
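For the batch scenario mentioned at the start, one browser session can be reused for several pages. The sketch below is an assumption-laden variation on the script above: it uses only the webdriver-w3c calls already shown, while outputPath and the second URL in the list are hypothetical and exist purely for illustration. It also needs chromedriver running on port 4444.

```haskell
#!/usr/bin/env stack
-- stack --resolver nightly-2022-10-18 script --package "bytestring webdriver-w3c"
{-# LANGUAGE OverloadedStrings #-}
import Control.Monad (forM_)
import Control.Monad.IO.Class (liftIO)
import qualified Data.ByteString as B
import Web.Api.WebDriver

-- Hypothetical naming scheme for the output files.
outputPath :: Int -> FilePath
outputPath n = "screenshot-" <> show n <> ".png"

-- Visit each page in the same browser session and save one file per page.
-- The second URL is a placeholder; replace the list with your own pages.
screenshotAll :: WebDriverT IO ()
screenshotAll = do
  _ <- fullscreenWindow
  forM_ (zip [1 :: Int ..] ["https://apitoolkit.io", "https://apitoolkit.io/docs"]) $
    \(n, url) -> do
      navigateTo url
      img <- takeScreenshot
      liftIO $ B.writeFile (outputPath n) img

main :: IO ()
main = do
  _ <- execWebDriverT
    defaultWebDriverConfig
    (runIsolated_ defaultChromeCapabilities screenshotAll)
  return ()
```

Reusing the session avoids paying the browser start-up cost for every page, at the price of one failed page potentially ending the whole run.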

Using the ScreenshotOne API

ScreenshotOne is a screenshot-as-a-service API: we request a screenshot over HTTP and get the image back. It is a viable option, especially when we don’t want the hassle of managing Chrome or another browser instance ourselves.

The code is quite straightforward, as we simply need to make a GET request to an endpoint. Let’s do this with the wreq Haskell library.

#!/usr/bin/env stack
-- stack --resolver nightly-2022-10-18 script --package "bytestring wreq lens"
{-# LANGUAGE OverloadedStrings #-}
import Control.Lens
import qualified Data.ByteString as B
import Data.Maybe (fromMaybe)
import Network.Wreq (get, responseBody)

main :: IO ()
main = do
  let accessKey = "<your access key>"
  let url = "https://apitoolkit.io"
  -- The options are passed as query parameters; we request a JPEG here.
  r <- get $ "https://api.screenshotone.com/take?access_key=" <> accessKey <> "&url=" <> url <> "&device_scale_factor=1&format=jpg&cache=false"
  -- The file extension matches the requested format (jpg).
  B.writeFile "screenshot-api.jpg" (B.toStrict $ fromMaybe "" $ r ^? responseBody)

A screenshot via API
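Concatenating the query string by hand works for a simple URL, but the target URL is not percent-encoded that way. As a sketch of an alternative, wreq can build the query from its Options value using getWith and the param lens. The screenshotOptions function is a hypothetical helper; the parameter names are the same ones used in the request above.

```haskell
#!/usr/bin/env stack
-- stack --resolver nightly-2022-10-18 script --package "bytestring text wreq lens"
{-# LANGUAGE OverloadedStrings #-}
import Control.Lens ((&), (.~), (^.))
import qualified Data.ByteString.Lazy as BL
import Data.Text (Text)
import Network.Wreq (Options, defaults, getWith, param, responseBody)

-- Hypothetical helper: collect the query parameters in one place.
-- wreq encodes them when it builds the final request URL.
screenshotOptions :: Text -> Text -> Options
screenshotOptions accessKey url =
  defaults
    & param "access_key" .~ [accessKey]
    & param "url" .~ [url]
    & param "device_scale_factor" .~ ["1"]
    & param "format" .~ ["jpg"]
    & param "cache" .~ ["false"]

main :: IO ()
main = do
  let opts = screenshotOptions "<your access key>" "https://apitoolkit.io"
  r <- getWith opts "https://api.screenshotone.com/take"
  -- The response body is a lazy ByteString, so it is written as-is.
  BL.writeFile "screenshot-api.jpg" (r ^. responseBody)
```

Keeping the parameters in an Options value also makes it easier to change the format or add further options later without touching string concatenation.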

An API versus Selenium WebDriver and alternatives

I would advise trying Selenium WebDriver for quick prototyping, or when the volume of screenshots is small, say a few dozen.

But if you expect to take screenshots constantly, it is worth considering an existing screenshot API. There is a wide choice nowadays. Look at the list of the best screenshot APIs.

In addition, an API usually takes care of many other problems for you, like blocking pop-ups, banners, and ads and fixing full-page rendering artifacts.

Summary

Pick the solution that fits your needs. Either way, it’s helpful to know we have options, especially while building Haskell applications.