Browser instructions
When using Headless Browser, you can define your own browser instructions that are executed when rendering JavaScript.
To use browser instructions, provide a set of browser_instructions when creating a job.
Let’s say you want to search for the term pizza boxes on a website.
Example job parameters would look as follows:
Step 1. You must provide the "render": "html" parameter.
Step 2. Browser instructions should be described in the "browser_instructions" field.
The sample browser instructions specify three actions: enter the search term pizza boxes into a search field, click the search button, and wait 5 seconds for the content to load.
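The two steps and three actions described here can be sketched in Python as a plain dictionary. This is a minimal sketch under stated assumptions, not the documented schema: only "render": "html" and the "browser_instructions" field come from this page. The action names (input, click, wait), the wait_time_s key, the target URL, and both XPath values are hypothetical placeholders. The nested type/value selector layout is inferred from the {selector.type} and {selector.value} placeholders in the error messages.

```python
import json


def build_search_job(url, search_term, input_xpath, button_xpath, wait_seconds=5):
    """Build job parameters that type a term, click search, then wait."""
    return {
        "url": url,
        # From this page: rendering must be enabled for browser instructions.
        "render": "html",
        "browser_instructions": [
            # ASSUMPTION: action names and key names below are illustrative.
            {
                "type": "input",
                "value": search_term,
                "selector": {"type": "xpath", "value": input_xpath},
            },
            {
                "type": "click",
                "selector": {"type": "xpath", "value": button_xpath},
            },
            {"type": "wait", "wait_time_s": wait_seconds},
        ],
    }


job = build_search_job(
    url="https://example.com/",               # hypothetical target
    search_term="pizza boxes",
    input_xpath="//input[@name='q']",         # hypothetical selector
    button_xpath="//button[@type='submit']",  # hypothetical selector
)
print(json.dumps(job, indent=2))
```

Building the payload in a function keeps the selectors and wait time in one place, so they can be adjusted per site without touching the structure.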
The scraped result should look as follows:
Scraped HTML should look like this:
We provide a standalone browser instruction for fetching browser resources.
The function is defined here:
Using fetch_resource makes the job return the first Fetch/XHR resource that matches the provided format, instead of the HTML of the targeted page.
Let’s say we want to target a GraphQL resource that is fetched when visiting a product page organically in the browser. The job parameters would look as follows:
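A job of this kind might be assembled as in the sketch below. Only the fetch_resource name, "render": "html", and the "browser_instructions" field come from this page. The filter key, the pattern value, and the product URL are assumptions made for illustration, so check the function definition before relying on them.

```python
import json


def build_fetch_resource_job(url, resource_pattern):
    """Build job parameters that return a matching Fetch/XHR resource."""
    return {
        "url": url,
        "render": "html",
        "browser_instructions": [
            # ASSUMPTION: the key holding the match pattern is named "filter".
            {"type": "fetch_resource", "filter": resource_pattern},
        ],
    }


graphql_job = build_fetch_resource_job(
    url="https://example.com/product/123",  # hypothetical product page
    resource_pattern="/graphql/item/",      # hypothetical pattern
)
print(json.dumps(graphql_job, indent=2))
```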
These instructions produce a result such as:
See our response codes outlined here.
Status codes related to instruction validation are documented here.
If your browser instructions cause an error or warning, you’ll find it in the result under the keys browser_instructions_error or browser_instructions_warnings. For instance, if you’ve sent the following browser instructions and the expected XPath isn’t found on the page, the result will include a warning.
browser_instructions:
Results:
Unexpected error happened while converting browser instructions to actions.
Unexpected error happened while executing {action.type} browser instructions.
Action {action.type} timed out.
Unable to find selector type {selector.type} with value {selector.value} on the page.
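Checking a result for these keys could look like the sketch below. The key names browser_instructions_error and browser_instructions_warnings come from this page. Where exactly they sit in the response and whether their values are strings or lists are assumptions, so the helper accepts both. The sample result is made-up data, not real output.

```python
def _as_list(value):
    """Normalize a missing, single, or list value to a list."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def collect_instruction_issues(result):
    """Return browser instruction errors and warnings found in one result.

    ASSUMPTION: both keys live at the top level of the result dictionary.
    """
    return {
        "errors": _as_list(result.get("browser_instructions_error")),
        "warnings": _as_list(result.get("browser_instructions_warnings")),
    }


# Hypothetical result, using one of the message formats listed above.
sample_result = {
    "content": "<html>...</html>",
    "browser_instructions_warnings": [
        "Unable to find selector type xpath with value "
        "//input[@id='missing'] on the page."
    ],
}

issues = collect_instruction_issues(sample_result)
for warning in issues["warnings"]:
    print("warning:", warning)
```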