Custom Parser
Custom Parser is a free Scraper APIs feature that lets you define your own parsing and data-processing logic, which is then executed on the raw scraping result.
You can use CSS and XPath selectors to select elements in the HTML DOM.
To use Custom Parser, pass a JSON object with parsing instructions when submitting a job.
If you are using XPath selectors, the job payload looks like this:
{
    "source": "universal_ecommerce",
    "url": "https://example.com",
    "parse": true,
    "parsing_instructions": {
        "title": {
            "_fns": [
                {
                    "_fn": "xpath_one",
                    "_args": ["//h1/text()"]
                }
            ]
        }
    }
}
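The payload above can also be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper names `build_xpath_job` and `submit_job` are hypothetical, and the endpoint URL, the credentials, and the use of HTTP Basic authentication are assumptions that you should replace with the values from your own account.

```python
import base64
import json
import urllib.request


def build_xpath_job(url, field, xpath, source="universal_ecommerce"):
    """Build a job payload that extracts one field with a single xpath_one call."""
    return {
        "source": source,
        "url": url,
        "parse": True,
        "parsing_instructions": {
            field: {"_fns": [{"_fn": "xpath_one", "_args": [xpath]}]}
        },
    }


def submit_job(payload, endpoint, username, password):
    """POST the payload as JSON. Endpoint and Basic auth are assumptions."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Building the payload needs no network access, so it can be inspected first.
payload = build_xpath_job("https://example.com", "title", "//h1/text()")
print(json.dumps(payload, indent=2))
```

Keeping payload construction separate from submission makes it easy to check the parsing instructions before sending a job.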
You can conveniently use the text() function with XPath, which extracts the text value of the selected node. If you are using CSS selectors, you’ll have to chain two functions together: the first one will select the h1 element, while the second one will extract its text:
{
    "source": "universal_ecommerce",
    "url": "https://example.com",
    "parse": true,
    "parsing_instructions": {
        "title": {
            "_fns": [
                {"_fn": "css_one", "_args": ["body > div:nth-child(1) > h1"]},
                {"_fn": "element_text"}
            ]
        }
    }
}
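To illustrate the chaining idea, where the output of one function becomes the input of the next, here is a small local emulation in Python. It is only a sketch of the concept and not how the service is implemented: `select_one`, `run_pipeline`, and `LOCAL_FNS` are hypothetical names, and `select_one` uses the standard library's ElementTree path syntax as a stand-in for a real CSS selector engine.

```python
from functools import reduce
import xml.etree.ElementTree as ET


# Local stand-ins for illustration only; not the service's implementation.
def select_one(node, path):
    """Return the first element matching an ElementTree path."""
    return node.find(path)


def element_text(node):
    """Return the text of an element, stripped of surrounding whitespace."""
    return (node.text or "").strip()


LOCAL_FNS = {"select_one": select_one, "element_text": element_text}


def run_pipeline(root, fns):
    """Apply each function in order, feeding every result into the next step."""
    def step(value, spec):
        fn = LOCAL_FNS[spec["_fn"]]
        return fn(value, *spec.get("_args", []))
    return reduce(step, fns, root)


doc = ET.fromstring(
    "<html><body><div><h1>Example Domain</h1></div></body></html>"
)
title = run_pipeline(doc, [
    {"_fn": "select_one", "_args": [".//h1"]},  # step 1: select the element
    {"_fn": "element_text"},                    # step 2: extract its text
])
print(title)  # prints "Example Domain"
```

The first step returns an element and the second turns it into a string, which mirrors why the CSS variant needs two entries in `_fns`.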
Either job returns a result like this:
{
    "results": [
        {
            "content": {
                "title": "Example Domain",
                "parse_status_code": 12000
            },
            "created_at": "2023-03-23 14:47:49",
            "updated_at": "2023-03-23 14:47:58",
            "page": 1,
            "url": "https://example.com",
            "job_id": "7044681146457663489",
            "status_code": 200
        }
    ]
}
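Once you have a response of this shape, the parsed fields sit under `content` in each entry of `results`. A minimal Python sketch for pulling them out follows; the helper name `parsed_fields` is hypothetical, and the code assumes `content` is a JSON object, as in the sample above.

```python
import json

# Trimmed copy of the sample response shown above.
SAMPLE_RESPONSE = """
{
  "results": [
    {
      "content": {"title": "Example Domain", "parse_status_code": 12000},
      "url": "https://example.com",
      "job_id": "7044681146457663489",
      "status_code": 200
    }
  ]
}
"""


def parsed_fields(response):
    """Return the user-defined fields of every result, without the status code."""
    fields = []
    for result in response.get("results", []):
        content = dict(result.get("content", {}))  # copy so the input is untouched
        content.pop("parse_status_code", None)
        fields.append(content)
    return fields


response = json.loads(SAMPLE_RESPONSE)
print(parsed_fields(response))  # prints [{'title': 'Example Domain'}]
```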
You may retrieve the raw HTML result by adding ?type=raw to the end of the result retrieval URL. Read more here.
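If you build the retrieval URL in code, the parameter can be appended safely with Python's standard library. This is a sketch under assumptions: `with_raw_type` is a hypothetical helper, and the URL below is a placeholder rather than the real result retrieval address.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_raw_type(results_url):
    """Return the URL with type=raw set, replacing any existing type parameter."""
    parts = urlsplit(results_url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "type"
    ]
    query.append(("type", "raw"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL for illustration; substitute your actual result retrieval URL.
print(with_raw_type("https://api.example.com/v1/queries/7044681146457663489/results"))
```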