Web Scraper API features

Learn about all the features available in the Oxylabs Web Scraper API.

Our Web Scraper API comes with freely available features that you can use to scale, speed up, and improve your public data-gathering efforts. Refer to the following list of features and visit their documentation pages for in-depth configuration steps.

OxyCopilot

Develop web scrapers and parsers with OxyCopilot, an AI-powered feature available in the Web Scraper API Playground: provide the target URLs and describe your needs in plain English. To learn more about how OxyCopilot works and explore ready-to-use prompts, visit the OxyCopilot prompts and code samples library available on our website.

Cloud integration

The cloud integration feature delivers job results directly to your Amazon S3, Google Cloud Storage, Alibaba OSS, or other S3-compatible storage. This way, you don’t have to make additional requests to get the data from us.
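As a rough illustration, cloud delivery can be thought of as a few extra fields added to a job payload. The sketch below only builds the payload and sends nothing. The field names `storage_type` and `storage_url`, the `universal` source, and the bucket path are assumptions made for this example, so check the cloud integration documentation for the exact parameters.

```python
def with_cloud_storage(payload: dict, storage_type: str, storage_url: str) -> dict:
    """Return a copy of the job payload with cloud-delivery fields added.

    The keys "storage_type" and "storage_url" are assumed names for
    illustration, not confirmed by this page.
    """
    job = dict(payload)  # copy so the caller's payload is not mutated
    job["storage_type"] = storage_type
    job["storage_url"] = storage_url
    return job


base_payload = {"source": "universal", "url": "https://example.com"}
job = with_cloud_storage(base_payload, "s3", "s3://my-bucket/results/")
print(job)
```

Keeping the storage settings in a small helper makes it easy to apply the same destination to every job you submit.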

Batch queries

For efficient scraping operations, Web Scraper API allows you to submit up to 5,000 query or URL values in a single batch request. Head to our documentation to learn more.
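If you have more than 5,000 targets, you need to split them across several batches. The following is a minimal sketch of that chunking step. The payload shape (a `source` field and a list under `url`) is an assumption for illustration, and no request is sent.

```python
from typing import Iterator, List

MAX_BATCH_SIZE = 5000  # per-batch limit stated above


def chunk(values: List[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of `values`, each with at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(values), size):
        yield values[start:start + size]


urls = [f"https://example.com/item/{i}" for i in range(12000)]

# One hypothetical batch payload per chunk.
batches = [{"source": "universal", "url": part} for part in chunk(urls)]
print([len(batch["url"]) for batch in batches])
```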

Headless Browser

With the Headless Browser feature, you can render JavaScript on web pages, manipulate the DOM, and execute browser actions such as entering text, clicking elements, and scrolling.

Custom Parser

When you want to parse the HTML of a web page, but there's no dedicated parser for the target, you can do so with Custom Parser by crafting your own parsing and data processing logic. You may also create, modify, and reuse Parser Presets by hosting them on our system. Self-healing can be enabled to keep your presets working effectively as the target websites change.
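To give a feel for what custom parsing logic might look like, here is a hedged sketch that builds a payload with one parsing rule. The keys `parse`, `parsing_instructions`, `_fns`, `_fn`, and `_args`, and the function name `xpath_one`, are assumptions used for illustration. Refer to the Custom Parser documentation for the supported functions and exact schema.

```python
def xpath_field(expression: str) -> dict:
    """Build one parsing rule that extracts a single value via XPath.

    The "_fns" / "_fn" / "_args" layout and the "xpath_one" name are
    assumptions for this sketch.
    """
    return {"_fns": [{"_fn": "xpath_one", "_args": [expression]}]}


parser_payload = {
    "source": "universal",          # assumed source name
    "url": "https://example.com",
    "parse": True,                  # assumed flag that enables parsing
    "parsing_instructions": {
        "title": xpath_field("//h1/text()"),
    },
}
print(parser_payload["parsing_instructions"])
```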

Scheduler

To run recurring scraping and parsing jobs automatically, use the Scheduler feature to create schedules. We recommend using this feature together with cloud integration so that results are delivered to your storage at the specified intervals.
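A schedule can be pictured as a recurrence rule plus the jobs to run. The sketch below assumes a five-field cron expression and the field names `cron`, `items`, and `end_time`. These names, as well as the storage fields inside the job, are assumptions for illustration and should be checked against the Scheduler documentation.

```python
from typing import List


def build_schedule(cron: str, jobs: List[dict], end_time: str) -> dict:
    """Assemble a hypothetical schedule payload after a basic sanity check."""
    if len(cron.split()) != 5:
        raise ValueError("expected a five-field cron expression")
    if not jobs:
        raise ValueError("a schedule needs at least one job")
    return {"cron": cron, "items": jobs, "end_time": end_time}


scheduled_job = {
    "source": "universal",
    "url": "https://example.com",
    # Assumed cloud-delivery fields, so each run lands in your storage.
    "storage_type": "s3",
    "storage_url": "s3://my-bucket/daily/",
}

# "0 3 * * *" means every day at 03:00 in standard cron syntax.
schedule = build_schedule("0 3 * * *", [scheduled_job], "2030-01-01 00:00:00")
print(schedule["cron"], len(schedule["items"]))
```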

Browser instructions

Instead of coding browser instructions manually, you can either build them in the step-by-step interface of the Web Scraper API Playground or generate them with AI from a natural language prompt. The system then generates the necessary code, which you can download as a structured JSON file and include in your API requests.
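Once you have the exported JSON, attaching it to a request could look like the sketch below. It assumes the file contains a JSON list of instruction objects and that the request accepts them under a `browser_instructions` key alongside `render`. Both the key names and the sample instructions are hypothetical.

```python
import json

# Hypothetical contents of a downloaded instructions file.
exported_json = """
[
  {"type": "click", "selector": {"type": "xpath", "value": "//button[@id='more']"}},
  {"type": "scroll", "x": 0, "y": 800}
]
"""


def attach_instructions(payload: dict, instructions_json: str) -> dict:
    """Return a copy of the payload with parsed browser instructions attached."""
    instructions = json.loads(instructions_json)
    if not isinstance(instructions, list):
        raise ValueError("expected a JSON list of instructions")
    job = dict(payload)
    job["render"] = "html"  # assumed flag for JavaScript rendering
    job["browser_instructions"] = instructions  # assumed key name
    return job


browser_job = attach_instructions(
    {"source": "universal", "url": "https://example.com"}, exported_json
)
print(len(browser_job["browser_instructions"]))
```

In practice you would read the string from the downloaded file instead of embedding it in the script.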

XHR request capturing

Sometimes it is more convenient to extract the required data from one or more of the Fetch/XHR requests that a browser makes while loading the web page, rather than parsing the HTML. Fetch/XHR request capturing is a feature that lets you retrieve these requests as structured JSON data from dynamic content sources.

Markdown output

This feature allows you to request Markdown output as an alternative to HTML or parsed JSON results. Markdown responses are easy to read, which simplifies integrating results into content workflows and AI tools. The format is especially useful when working with LLMs because it is lightweight and has a clear syntax.


Last updated 3 months ago
