Support of the OpenAI GPT Vision API in ScreenshotOne

Published on Dmytro Krasun 3 min read
From today, you can use our screenshot API together with the OpenAI Vision API. It allows you to send screenshots directly to the Vision API without building all the infrastructure yourself.

An API example request

Yes, it is as simple as that.

Use cases

Basically, anywhere you used ChatGPT vision for analyzing website screenshots, but now you can automate it with one simple API call.

But let’s consider a few ideas:

  1. You can use it for landing page design and copywriting feedback.
  2. You can use the screenshot API with the GPT vision to convert the website design into HTML and CSS code.
  3. You can use it in complex parsing cases, e.g. to extract emails.

And many more.

Bring your own API key

The Vision API has very strict rate limits and allows only a handful amount of requests. Hence, it was decided to allow customers to bring their own API keys.

Your OpenAI API key is not stored in the permanent history of screenshots and is exempted from logs. But still, make sure to set strict hard limits.

Stability

Currently, the API requests using the Vision API are as stable as the Vision API itself.

We often observe errors and slow performance. In that case, you will still get the screenshot, but not the prompt result.

Known Issues and Limitations

There are a few issues you might encounter:

  1. In case, if screenshot rendering fails, but prompt generation is succeeded. You will be charged by OpenAI, but you are not charged for any failed screenshots by ScreenshotOne.
  2. Use only JPG (JPEG) file formats for performance. They have smaller file sizes and the quality of the Vision API responses looks pretty much the same.
  3. Timeouts are a huge problem. Since the ScreenshotOne API limits the execution time to one minute. It is often hard to generate a screenshot and analyze it with the GPT vision API.

Pricing

Currently, it is available for free and for all plans and all customers. But it might change in the future depending on the usage of the feature.

Alternative

If you don’t see it makes sense for you to use the native screenshot API integration with the GPT vision API, e.g. if you want to make sure your OpenAI key is not shared anywhere, you can still use ScreenshotOne but will need to perform an additional step.

  1. Render a screenshot and get a cache URL to it or upload it to any compatible S3 storage.
  2. Once uploaded, pass the link to the Vision API.

Or alternatively:

  1. You can get the binary response encoded with base64 encoding.
  2. And then send the encoded image to the OpenAI directly.

Choose whatever approach suits you better. Using direct/native integration is just a nice option you might use. You decide eventually.

Summary

If you have built something interesting using ScreenshotOne, please, share. Send an email to support@screenshotone.com.

We will consider featuring your product on our site and social media accounts.

In case you have any problems or questions related to the API reach out through the same email—support@screenshotone.com.

And we will get back to you as soon as possible.