Getting started

See how to start using Oxylabs Custom Parser. On this page, you'll find comprehensive examples, tips, and details on what happens if parsing fails.

How to use Custom Parser

Scenario example

Say you want to parse the total number of results that Bing Search returns for the search term test:

We'll go over the three main methods to achieve this goal:

Generate parsers with OxyCopilot

OxyCopilot lets you describe your needs in plain English to automatically create scrapers and parsers for a website. Learn the basics by following the steps outlined below and check out OxyCopilot documentation for more information.

1. Enter the URL(s)

Click the OxyCopilot button at the top-left side and enter up to 3 URLs of the same page type. Let's use this Bing Search URL: https://www.bing.com/search?q=test.

You can also configure the scraper manually by filling in the Website, Scraper, and URL fields at the top, and adjusting additional parameters like JavaScript rendering in the left-side menu.

2. Set up scraper parameters

Next, specify scraper parameters and browser instructions, and enable JavaScript rendering if your target website requires it.

For Bing Search, enable JavaScript rendering and then click Next.

3. Write the prompt

Explain the data you want to extract from a page. Make sure to be descriptive and provide the most important information. You can find prompt examples for popular websites in our OxyCopilot prompts library.

Paste the following prompt to extract the total number of results from Bing Search pages:

Parse the number of total search results.

Click the Generate instructions button to send your prompt.

4. Review parsed data and instructions

Once OxyCopilot finishes, you'll see the following window with the parsed data on the right side:

If you want to make any adjustments, you can do so here. Modify the URL(s), refine the prompt, enable JavaScript rendering, or edit the parsing schema to suit your needs. After you update any fields in this window, you can rerun the request by selecting Start new request.

You may also view and directly edit the parsing instructions here:

Once you're happy with the result, click Load instructions to continue.

5. Save the parser as a preset

You can easily save your parsing instructions as a parser preset. This lets you reuse the preset in OxyCopilot and with your API requests.

In the Web Scraper API Playground, you can optionally choose the user for which to save the preset. Once you're all set, simply click Save:

A pop-up will appear prompting you to name the preset and add an optional description:

6. Use the preset with API requests

To use a preset with your Web Scraper API requests, set parse to true and specify the preset name with the parser_preset parameter.

Endpoint: POST https://data.oxylabs.io/v1/queries

{
    "source": "bing_search",
    "query": "test",
    "render": "html",
    "parse": true,
    "parser_preset": "Bing_total_results"
}
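As a rough sketch, the request above could be prepared in Python using only the standard library. The placeholder credentials, the use of HTTP Basic authentication, and the helper name build_preset_request are assumptions made for illustration and are not taken from this page. The request is built but not sent.

```python
import base64
import json
import urllib.request

API_URL = "https://data.oxylabs.io/v1/queries"


def build_preset_request(username: str, password: str, preset: str, query: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request that uses a saved parser preset."""
    payload = {
        "source": "bing_search",
        "query": query,
        "render": "html",
        "parse": True,
        "parser_preset": preset,
    }
    # Assumption: the API accepts HTTP Basic authentication.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# USERNAME and PASSWORD are placeholders, not real credentials.
request = build_preset_request("USERNAME", "PASSWORD", "Bing_total_results", "test")
print(request.get_method(), request.full_url)
```

To actually submit the job, the prepared object could be passed to urllib.request.urlopen(request) and the response body decoded with json.loads.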

Running the request will provide the following JSON output:

{
    "results": [
        {
            "content": {
                "parse_status_code": 12000,
                "total_search_results": 12000000
            },
            "created_at": "2025-10-24 09:29:28",
            "updated_at": "2025-10-24 09:30:42",
            "page": 1,
            "url": "https://www.bing.com/search?q=test",
            "job_id": "7387419953164488705",
            "is_render_forced": false,
            "status_code": 200,
            "type": "parsed",
            "parser_type": "preset",
            "parser_preset": "Bing_total_results"
        }
    ]
}
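A minimal sketch of reading the parsed value out of a response shaped like the one above. The helper name extract_field is hypothetical, and the sample is a trimmed copy of the output shown on this page.

```python
import json

# Trimmed copy of the sample response shown above.
SAMPLE_RESPONSE = """
{
    "results": [
        {
            "content": {
                "parse_status_code": 12000,
                "total_search_results": 12000000
            },
            "status_code": 200,
            "parser_type": "preset",
            "parser_preset": "Bing_total_results"
        }
    ]
}
"""


def extract_field(response: dict, field: str):
    """Return a parsed field from the first result, or None if it is absent."""
    results = response.get("results") or []
    if not results:
        return None
    content = results[0].get("content")
    # Guard against content that is not a parsed JSON object.
    if not isinstance(content, dict):
        return None
    return content.get(field)


response = json.loads(SAMPLE_RESPONSE)
print(extract_field(response, "total_search_results"))  # 12000000
```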

Advanced usage

Generate parsers via API

Instead of using OxyCopilot in the playground, you can send prompts directly to Web Scraper API and generate parsers this way. See the Generating parsing instructions via API documentation page to learn more.

Endpoint: POST https://data.oxylabs.io/v1/parsers/generate-instructions/prompt

{
    "prompt_text": "Parse the number of total search results.",
    "urls": ["https://www.bing.com/search?q=test"],
    "render": true
}

Output:

{
    "parsing_instructions": {
        "total_search_results": {
            "_fns": [
                {
                    "_args": [
                        "//span[contains(@class, 'count')]/text()"
                    ],
                    "_fn": "xpath_one"
                },
                {
                    "_fn": "amount_from_string"
                }
            ]
        }
    },
    "prompt_schema": {
        "properties": {
            "total_search_results": {
                "description": "The number of total search results.",
                "title": "Total Search Results",
                "type": "number"
            }
        },
        "required": [
            "total_search_results"
        ],
        "title": "Fields",
        "type": "object"
    }
}

Save parser presets via API

Web Scraper API lets you save parsing instructions as reusable parser presets. Check out the Parser Presets documentation to find a list of available actions and comprehensive code samples.

Endpoint: POST https://data.oxylabs.io/v1/parsers/presets

{
    "name": "Bing_total_results",
    "parsing_instructions": {
        "total_search_results": {
            "_fns": [
                {
                    "_fn": "xpath_one",
                    "_args": [
                        "//span[contains(@class, 'count')]/text()"
                    ]
                },
                {
                    "_fn": "amount_from_string"
                }
            ]
        }
    }
}

Output:

{
    "id": 421938,
    "name": "Bing_total_results",
    "description": null,
    "prompt_text": null,
    "prompt_schema": null,
    "urls": [],
    "render": false,
    "parsing_instructions": {
        "total_search_results": {
            "_fns": [
                {
                    "_args": [
                        "//span[contains(@class, 'count')]/text()"
                    ],
                    "_fn": "xpath_one"
                },
                {
                    "_fn": "amount_from_string"
                }
            ]
        }
    },
    "self_heal": false,
    "heal_status": "disabled",
    "last_healed_at": null,
    "created_at": "2025-10-27 09:28:37",
    "updated_at": "2025-10-27 09:28:37"
}
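The two API calls above can be chained: the parsing_instructions returned by the generate-instructions endpoint become part of the body sent to the presets endpoint. The sketch below only builds that body. The helper name preset_payload_from_generated is hypothetical, and the sample input is a trimmed copy of the generated output shown earlier.

```python
import json

# Trimmed copy of the generate-instructions output shown earlier.
GENERATED = {
    "parsing_instructions": {
        "total_search_results": {
            "_fns": [
                {
                    "_fn": "xpath_one",
                    "_args": ["//span[contains(@class, 'count')]/text()"],
                },
                {"_fn": "amount_from_string"},
            ]
        }
    },
    "prompt_schema": {"title": "Fields", "type": "object"},
}


def preset_payload_from_generated(name: str, generated: dict) -> dict:
    """Build the request body for saving generated instructions as a preset."""
    instructions = generated.get("parsing_instructions")
    if not instructions:
        raise ValueError("generated output contains no parsing_instructions")
    # Only the fields used in the preset request shown above are included.
    return {"name": name, "parsing_instructions": instructions}


payload = preset_payload_from_generated("Bing_total_results", GENERATED)
print(json.dumps(payload, indent=4))
```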

Write instructions manually

To use Custom Parser manually, include a set of parsing_instructions when creating a job. You can use CSS and XPath selectors to target elements in the DOM.

Follow the step-by-step example below to learn the basics, then explore our in-depth guide on writing instructions manually for advanced techniques and detailed documentation.

Let's take the Bing Search scenario as an example. The job parameters would look as follows:

{
    "source": "bing_search",
    "query": "test",
    "render": "html",
    "parse": true,
    "parsing_instructions": {
        "number_of_results": {
            "_fns": [
                {
                    "_fn": "xpath_one",
                    "_args": [".//span[@class='sb_count']/text()"]
                }
            ]
        }
    }
}

Step 1. You must provide the "parse": true parameter.

Step 2. Parsing instructions must be described in the "parsing_instructions" field.
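The two steps above can be sketched as a small Python helper that assembles the job parameters. The helper names make_step and make_job are hypothetical. The output mirrors the JSON shown on this page and is not sent anywhere.

```python
import json


def make_step(fn: str, *args: str) -> dict:
    """Describe one pipeline function, adding "_args" only when arguments exist."""
    step = {"_fn": fn}
    if args:
        step["_args"] = list(args)
    return step


def make_job(query: str, field: str, steps: list) -> dict:
    """Assemble job parameters with parsing enabled and one parsed field."""
    return {
        "source": "bing_search",
        "query": query,
        "render": "html",
        "parse": True,  # Step 1: parsing must be switched on
        "parsing_instructions": {  # Step 2: instructions live in this field
            field: {"_fns": steps},
        },
    }


job = make_job(
    "test",
    "number_of_results",
    [make_step("xpath_one", ".//span[@class='sb_count']/text()")],
)
print(json.dumps(job, indent=4))
```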

The sample parsing instructions above specify that the aim is to parse the number of search results from the scraped document and put the result in the number_of_results field. How to parse the field is defined as a "pipeline":

"_fns": [
    {
        "_fn": "xpath_one",
        "_args": [".//span[@class='sb_count']/text()"]
    }
]

The pipeline describes a list of data processing functions to be executed. The functions are executed in the order they appear on the list and take the output of the previous function as the input.

In the sample pipeline above, the xpath_one function (full list of available functions) is used. It allows you to process an HTML document using XPath expressions and XSLT functions. As a function argument, specify the exact path where the target element can be found: .//span[@class='sb_count']. You can also instruct the parser to select the text() found in the target element.
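To try the selector locally, the sketch below applies the same element path with Python's standard xml.etree.ElementTree module. This is only an approximation of what xpath_one does. ElementTree supports a limited XPath subset without the text() step, so the text is read from the matched element instead. The markup is a simplified, hypothetical stand-in for a Bing results page.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; a real results page is far larger.
SNIPPET = """
<html>
  <body>
    <div>
      <span class="sb_count">About 16,700,000 results</span>
    </div>
  </body>
</html>
""".strip()

root = ET.fromstring(SNIPPET)

# Same element path as in the parsing instructions, minus the text() step.
node = root.find(".//span[@class='sb_count']")

# Reading .text plays the role of text() in the original expression.
text = node.text if node is not None else None
print(text)  # About 16,700,000 results
```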

The parsed result of the sample job above should look like this:

{
    "results": [
        {
            "content": {
                "number_of_results": "About 16,700,000 results",
                "parse_status_code": 12000
            },
            "created_at": "2025-10-27 09:48:04",
            "updated_at": "2025-10-27 09:48:38",
            "page": 1,
            "url": "https://www.bing.com/search?q=test",
            "job_id": "7388511797231226881",
            "is_render_forced": false,
            "status_code": 200,
            "type": "parsed",
            "parser_type": "custom",
            "parser_preset": null
        }
    ]
}

Custom Parser not only extracts text from scraped HTML but can also execute basic data processing functions.

For example, the previously described parsing instructions extract number_of_results as text that includes extra words you may not need. If you want the number of results for the query test as a numeric value, you can reuse the same parsing instructions and add the amount_from_string function to the existing pipeline:

{
    "source": "bing_search",
    "query": "test",
    "render": "html",
    "parse": true,
    "parsing_instructions": {
        "number_of_results": {
            "_fns": [
                {
                    "_fn": "xpath_one",
                    "_args": [".//span[@class='sb_count']/text()"]
                },
                {
                    "_fn": "amount_from_string"
                }
            ]
        }
    }
}

The parsed result of the sample job above should look like this:

{
    "results": [
        {
            "content": {
                "number_of_results": 14200000,
                "parse_status_code": 12000
            },
            "created_at": "2025-10-27 10:00:36",
            "updated_at": "2025-10-27 10:01:05",
            "page": 1,
            "url": "https://www.bing.com/search?q=test",
            "job_id": "7388514950961963009",
            "is_render_forced": false,
            "status_code": 200,
            "type": "parsed",
            "parser_type": "custom",
            "parser_preset": null
        }
    ]
}
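To make the order of execution concrete, here is a local simulation of a two-step pipeline in which each function receives the output of the previous one. The functions take_first and digits_to_int are hypothetical stand-ins written for this sketch. They are not the Custom Parser implementations of xpath_one and amount_from_string, whose exact behavior may differ.

```python
import re
from typing import Any, Callable, List


def take_first(values: List[str]) -> str:
    """Stand-in for a selector step: keep the first matched string."""
    return values[0]


def digits_to_int(text: str) -> int:
    """Stand-in for a numeric step: drop every non-digit and convert to int."""
    return int(re.sub(r"\D", "", text))


def run_pipeline(data: Any, fns: List[Callable[[Any], Any]]) -> Any:
    """Run the functions in list order, feeding each output to the next one."""
    for fn in fns:
        data = fn(data)
    return data


result = run_pipeline(["About 16,700,000 results"], [take_first, digits_to_int])
print(result)  # 16700000
```

Swapping the two functions in the list would fail, because digits_to_int expects a string rather than a list. This is why the order of entries in "_fns" matters.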

What happens if parsing fails when using Custom Parser

If Custom Parser fails to process client-defined parsing instructions, we will return the 12005 status code (parsed with warnings). In the job below, the XPath expression for number_of_organics intentionally matches nothing:

{
    "source": "bing_search",
    "query": "test",
    "render": "html",
    "parse": true,
    "parsing_instructions": {
        "number_of_results": {
            "_fns": [
                {
                    "_fn": "xpath_one",
                    "_args": [".//span[@class='sb_count']/text()"]
                },
                {
                    "_fn": "amount_from_string"
                }
            ]
        },
        "number_of_organics": {
            "_fns": [
                {
                    "_fn": "xpath",
                    "_args": ["//this-will-not-match-anything"]
                },
                {
                    "_fn": "length"
                }
            ]
        }
    }
}

You will be charged for such results:

{
    "results": [
        {
            "content": {
                "_warnings": [
                    {
                        "_fn": "xpath",
                        "_msg": "XPath expressions did not match any data.",
                        "_path": ".number_of_organics",
                        "_fn_idx": 0
                    }
                ],
                "number_of_results": 14200000,
                "parse_status_code": 12005,
                "number_of_organics": null
            },
            "created_at": "2025-10-27 10:03:54",
            "updated_at": "2025-10-27 10:04:22",
            "page": 1,
            "url": "https://www.bing.com/search?q=test",
            "job_id": "7388515782126234625",
            "is_render_forced": false,
            "status_code": 200,
            "type": "parsed",
            "parser_type": "custom",
            "parser_preset": null
        }
    ]
}

If Custom Parser encounters an exception and breaks during the parsing operation, it can return these status codes: 12002, 12006, 12007. You will not be charged for these unexpected errors.
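A minimal sketch of how a client might branch on the codes mentioned in this section. Treating 12000 as success is inferred from the sample outputs on this page, and the helper name describe_parse_result is hypothetical. The sample content is copied from the response with warnings shown above.

```python
PARSED_OK = 12000  # seen in the successful sample outputs
PARSED_WITH_WARNINGS = 12005  # charged, some fields may be null
UNEXPECTED_ERRORS = {12002, 12006, 12007}  # not charged


def describe_parse_result(content: dict) -> str:
    """Summarize a result's "content" object based on its parse_status_code."""
    code = content.get("parse_status_code")
    if code == PARSED_OK:
        return "parsed"
    if code == PARSED_WITH_WARNINGS:
        warnings = content.get("_warnings", [])
        details = "; ".join(f"{w.get('_path')}: {w.get('_msg')}" for w in warnings)
        return f"parsed with warnings ({details})"
    if code in UNEXPECTED_ERRORS:
        return "parser error, not charged"
    return f"unrecognized parse status code {code}"


# Copied from the sample response with warnings shown above.
content = {
    "_warnings": [
        {
            "_fn": "xpath",
            "_msg": "XPath expressions did not match any data.",
            "_path": ".number_of_organics",
            "_fn_idx": 0,
        }
    ],
    "number_of_results": 14200000,
    "parse_status_code": 12005,
    "number_of_organics": None,
}

print(describe_parse_result(content))
```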

Status codes

See our status codes outlined here.
