Browser instructions (Beta)

When using Headless Browser, you can define your own browser instructions that are executed when rendering JavaScript.

How to use it?

To use browser instructions, provide a set of browser_instructions when creating a job.

Let’s say you want to search for the term "pizza boxes" on a website.

Example job parameters would look as follows:

curl -k -x https://unblock.oxylabs.io:60000 \
-U 'USERNAME:PASSWORD' \
'https://www.ebay.com' \
-H 'x-oxylabs-render: html' \
-H "x-oxylabs-browser-instructions: [{\"type\":\"input\",\"value\":\"pizza boxes\",\"selector\":{\"type\":\"xpath\",\"value\":\"\/\/input[@class='gh-tb ui-autocomplete-input']\"}},{\"type\":\"click\",\"selector\":{\"type\":\"xpath\",\"value\":\"\/\/input[@type='submit']\"}},{\"type\":\"wait\",\"wait_time_s\":5}]"

Step 1. You must provide the x-oxylabs-render: html header.

Step 2. Browser instructions should be described in the x-oxylabs-browser-instructions header.

The browser instructions provided as the header value must be JSON-escaped and contain no extra spaces.

The sample browser instructions above enter the search term "pizza boxes" into a search field, click the search button, and wait 5 seconds for the content to load.
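Hand-escaping the JSON for the header is error-prone, so the value can be generated instead. The following is a minimal Python sketch, assuming the instructions are held as a list of dictionaries; the helper name build_instruction_header is hypothetical and not part of any Oxylabs client. It only builds the header values and does not send a request.

```python
import json


def build_instruction_header(instructions):
    """Serialize a list of instruction dicts into compact JSON (hypothetical helper)."""
    # separators=(",", ":") removes the spaces json.dumps adds by default,
    # so the header value contains no extra whitespace.
    return json.dumps(instructions, separators=(",", ":"))


# The same three instructions as in the curl example above.
instructions = [
    {
        "type": "input",
        "value": "pizza boxes",
        "selector": {
            "type": "xpath",
            "value": "//input[@class='gh-tb ui-autocomplete-input']",
        },
    },
    {
        "type": "click",
        "selector": {"type": "xpath", "value": "//input[@type='submit']"},
    },
    {"type": "wait", "wait_time_s": 5},
]

# Header names are copied from the curl example.
headers = {
    "x-oxylabs-render": "html",
    "x-oxylabs-browser-instructions": build_instruction_header(instructions),
}

print(headers["x-oxylabs-browser-instructions"])
```

The printed string decodes to the same JSON as the curl payload; the only difference is that the curl example writes each forward slash as the escape \/, which JSON treats as equivalent to /.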

The scraped result should look as follows:

<!doctype html><html>
Content after executing the instructions      
</html>

Fetching browser resources

We provide a standalone browser instruction for fetching browser resources.

The function is defined here:

Using fetch_resource makes the job return the first Fetch/XHR resource that matches the provided filter, instead of the HTML of the targeted page.

Let’s say we want to target a GraphQL resource that is fetched when visiting a product page organically in the browser. The job parameters would look as follows:

curl -k -x https://unblock.oxylabs.io:60000 \
-U 'USERNAME:PASSWORD' \
'https://www.example.com/product-page/123' \
-H 'x-oxylabs-render: html' \
-H "x-oxylabs-browser-instructions: [{\"type\":\"fetch_resource\",\"filter\":\"\/graphql\/product-info\/123\"}]"

These instructions produce a result such as:

{"product_id": 123, "description": "", "price": 456}
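Because the job returns the matched resource body rather than HTML, the result can be parsed directly as JSON. The following is a minimal Python sketch, assuming the response body is the sample shown above; the helper name fetch_resource_header is hypothetical, and the filter value is copied from the curl example.

```python
import json


def fetch_resource_header(resource_filter):
    """Build the compact header value for one fetch_resource instruction (hypothetical helper)."""
    instruction = [{"type": "fetch_resource", "filter": resource_filter}]
    return json.dumps(instruction, separators=(",", ":"))


header_value = fetch_resource_header("/graphql/product-info/123")
print(header_value)

# Sample body copied from the example result above; a real job would
# return whatever the matched Fetch/XHR resource contains.
sample_body = '{"product_id": 123, "description": "", "price": 456}'
product = json.loads(sample_body)
print(product["product_id"], product["price"])
```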

List of supported browser instructions

List of instructions

Status codes

See our response codes outlined here.

Status codes related to instruction validation are documented here.
