# Getting started

## How to use Custom Parser <a href="#how-to-use-custom-parser" id="how-to-use-custom-parser"></a>

### Scenario example

Say you want to parse the **total number of results** that Bing Search returns for the search term `test`:

<figure><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FPrxIqONdWJuGVRNOZnTk%2Fcustom_parser_bing.png?alt=media&#x26;token=0a9be806-d7f5-40ae-b80c-0f11871f9996" alt=""><figcaption></figcaption></figure>

We'll go over three main ways to achieve this:

* [Generate parsers with OxyCopilot](#generate-parsers-with-oxycopilot)
* [Generate parsers via API](#generate-parsers-via-api)
* [Write parsing instructions manually](#write-instructions-manually)

### Generate parsers with OxyCopilot

OxyCopilot lets you describe your needs in plain English to **automatically create scrapers and parsers** for a website. Learn the basics by following the steps outlined below and check out [OxyCopilot documentation](https://developers.oxylabs.io/scraping-solutions/web-scraper-api-playground/oxycopilot#custom-parser-builder) for more information.

{% hint style="success" %}
Open the [**Web Scraper API Playground**](https://dashboard.oxylabs.io/en/api-playground) on our dashboard to access OxyCopilot.
{% endhint %}

{% stepper %}
{% step %}

#### Enter the URL(s)

Click the **OxyCopilot button** at the top left and enter up to 3 URLs of the same page type. Let's use this Bing Search URL: `https://www.bing.com/search?q=test`.

<figure><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FmJqGOdCEMSZRc8qRwIfk%2FCS-OxyCopilot-1.png?alt=media&#x26;token=860755f5-84a7-4e7a-8354-160e94b192aa" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
You can also configure the scraper manually by filling in the **Website**, **Scraper**, and **URL** fields at the top, and adjusting **additional parameters** like JavaScript rendering in the left-side menu.
{% endhint %}
{% endstep %}

{% step %}

#### Set up scraper parameters

Next, specify scraper parameters and browser instructions, and enable JavaScript rendering if your target website requires it.

For Bing Search, **enable JavaScript rendering** and then click **Next**.

<figure><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FVi2jUZZp7kD3dAnGwwCR%2FCS-OxyCopilot-2.png?alt=media&#x26;token=4e793d18-f3ea-4ea5-b612-09596eee61c5" alt="" width="375"><figcaption></figcaption></figure>
{% endstep %}

{% step %}

#### Write the prompt

Describe the data you want to extract from the page. Be specific and include the most important details. You can find prompt examples for popular websites in our [OxyCopilot prompts library](https://oxylabs.io/resources/prompts-code-samples).

Paste the following prompt to extract the total number of results from Bing Search pages:

```
Parse the number of total search results.
```

<figure><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FLMI159k2LpJPzpfJ9LK3%2FCS-OxyCopilot-3.png?alt=media&#x26;token=795ae14c-e276-4362-9158-6660c965d44a" alt="" width="563"><figcaption></figcaption></figure>

Click the **Generate instructions** button to send your prompt.
{% endstep %}

{% step %}

#### Review parsed data and instructions

Once OxyCopilot finishes, you'll see the following window, with the parsed data on the right side:

<figure><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2Femo5HJUYdI6yNkecQiZP%2FCS-OxyCopilot-4.png?alt=media&#x26;token=e0e12895-5a88-4c91-a0bc-0e9d88b30270" alt=""><figcaption></figcaption></figure>

If you want to make any adjustments, you can do so here. Modify the URL(s), refine the prompt, enable JavaScript rendering, or [edit the parsing schema](https://developers.oxylabs.io/scraping-solutions/web-scraper-api-playground/oxycopilot#step-2-optional-adjust-parsing-schema) to suit your needs. After updating any fields in this window, you can rerun the request by selecting **Start new request**.

You may also **view and directly edit the parsing instructions** here:

<figure><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FylhQnnMEzPbCX0jqxLWA%2FCS-OxyCopilot-5.png?alt=media&#x26;token=3be22fad-112e-4835-b360-787c36362c5a" alt="" width="373"><figcaption></figcaption></figure>

Once you're happy with the result, click **Load instructions** to continue.
{% endstep %}

{% step %}

#### Save the parser as a preset

You can easily save your parsing instructions as a [parser preset](https://developers.oxylabs.io/scraping-solutions/web-scraper-api/features/custom-parser/parser-presets). This lets you reuse the preset in OxyCopilot and with your API requests.

In the Web Scraper API Playground, you can optionally choose which user the preset is saved for. Once you're all set, click **Save**:

<figure><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2F7OOEZBc2SMe4AeLv8uZQ%2FCS-OxyCopilot-6.png?alt=media&#x26;token=2a1bf65b-9660-415e-9bb2-01a60962b1cb" alt=""><figcaption></figcaption></figure>

A pop-up will appear prompting you to name the preset and add an optional description:

<figure><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FrvVBSfKFMvVkEaoub5Sv%2FCS-OxyCopilot-7.png?alt=media&#x26;token=246ff512-7e3d-4b06-bf0b-7cc41e7cb42c" alt="" width="375"><figcaption></figcaption></figure>
{% endstep %}

{% step %}

#### Use the preset with API requests

To use a preset with your Web Scraper API requests, set `parse` to `true` and specify the preset name with the `parser_preset` parameter.

**Endpoint:** `POST https://data.oxylabs.io/v1/queries`

```json
{
    "source": "bing_search",
    "query": "test",
    "render": "html",
    "parse": true,
    "parser_preset": "Bing_total_results"
}
```

The request returns the following JSON output:

```json
{
    "results": [
        {
            "content": {
                "parse_status_code": 12000,
                "total_search_results": 12000000
            },
            "created_at": "2025-10-24 09:29:28",
            "updated_at": "2025-10-24 09:30:42",
            "page": 1,
            "url": "https://www.bing.com/search?q=test",
            "job_id": "7387419953164488705",
            "is_render_forced": false,
            "status_code": 200,
            "type": "parsed",
            "parser_type": "preset",
            "parser_preset": "Bing_total_results"
        }
    ]
}
```
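
The request above could be prepared in Python roughly as follows. This is a minimal sketch using only the standard library: HTTP Basic authentication is an assumption of the sketch (it is not specified above), and the helper names `build_preset_payload`, `build_request`, and `submit` are hypothetical.

```python
import base64
import json
import urllib.request

# Endpoint taken from the example above.
API_URL = "https://data.oxylabs.io/v1/queries"


def build_preset_payload(query: str, preset_name: str) -> dict:
    """Build the job payload that references a saved parser preset."""
    return {
        "source": "bing_search",
        "query": query,
        "render": "html",
        "parse": True,
        "parser_preset": preset_name,
    }


def build_request(payload: dict, username: str, password: str) -> urllib.request.Request:
    """Prepare a POST request. Basic auth is an assumption of this sketch."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def submit(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(request, timeout=180) as response:
        return json.loads(response.read().decode("utf-8"))
```

To try it, call `submit(build_request(build_preset_payload("test", "Bing_total_results"), USERNAME, PASSWORD))` with your own credentials in place of the `USERNAME` and `PASSWORD` placeholders.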

{% endstep %}
{% endstepper %}

## Advanced usage

### Generate parsers via API

Instead of using OxyCopilot in the playground, you can send prompts directly to Web Scraper API and generate parsers. See the [Generating parsing instructions via API](https://developers.oxylabs.io/scraping-solutions/web-scraper-api/features/custom-parser/generating-parsing-instructions-via-api) documentation page to learn more.

{% hint style="success" %}
We recommend **providing 3-5 URLs of the same type** (e.g., product pages). This helps the parser adapt to different layouts and improves parsing accuracy.
{% endhint %}

**Endpoint:** `POST https://data.oxylabs.io/v1/parsers/generate-instructions/prompt`

```json
{
    "prompt_text": "Parse the number of total search results.",
    "urls": ["https://www.bing.com/search?q=test"],
    "render": true
}
```

<details>

<summary>Output</summary>

```json
{
    "parsing_instructions": {
        "total_search_results": {
            "_fns": [
                {
                    "_args": [
                        "//span[contains(@class, 'count')]/text()"
                    ],
                    "_fn": "xpath_one"
                },
                {
                    "_fn": "amount_from_string"
                }
            ]
        }
    },
    "prompt_schema": {
        "properties": {
            "total_search_results": {
                "description": "The number of total search results.",
                "title": "Total Search Results",
                "type": "number"
            }
        },
        "required": [
            "total_search_results"
        ],
        "title": "Fields",
        "type": "object"
    }
}
```

</details>
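
As a rough illustration, the request body and the handling of the response could look like this in Python. It is a sketch that assumes the response has the shape shown in the sample output; the helper names are hypothetical, and sending the request (with your credentials) is left out.

```python
import json
from typing import Iterable

# Endpoint taken from the example above.
GENERATE_URL = "https://data.oxylabs.io/v1/parsers/generate-instructions/prompt"


def build_generation_payload(prompt_text: str, urls: Iterable[str], render: bool = True) -> dict:
    """Build the request body for generating parsing instructions from a prompt."""
    url_list = list(urls)
    if not url_list:
        raise ValueError("at least one URL is required")
    return {"prompt_text": prompt_text, "urls": url_list, "render": render}


def extract_instructions(response_body: str) -> dict:
    """Return the `parsing_instructions` object from a JSON response body."""
    data = json.loads(response_body)
    return data["parsing_instructions"]
```

The dictionary returned by `extract_instructions` can be passed as-is in the `parsing_instructions` field of a scraping job or a parser preset.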

### Save parser presets via API

Web Scraper API lets you save parsing instructions as reusable parser presets. Check out the [Parser Presets](https://developers.oxylabs.io/scraping-solutions/web-scraper-api/features/custom-parser/parser-presets) documentation to find a list of available actions and comprehensive code samples.

**Endpoint:** `POST https://data.oxylabs.io/v1/parsers/presets`

```json
{
    "name": "Bing_total_results",
    "parsing_instructions": {
        "total_search_results": {
            "_fns": [
                {
                    "_fn": "xpath_one",
                    "_args": [
                        "//span[contains(@class, 'count')]/text()"
                    ]
                },
                {
                    "_fn": "amount_from_string"
                }
            ]
        }
    }
}
```

<details>

<summary>Output</summary>

```json
{
    "id": 421938,
    "name": "Bing_total_results",
    "description": null,
    "prompt_text": null,
    "prompt_schema": null,
    "urls": [],
    "render": false,
    "parsing_instructions": {
        "total_search_results": {
            "_fns": [
                {
                    "_args": [
                        "//span[contains(@class, 'count')]/text()"
                    ],
                    "_fn": "xpath_one"
                },
                {
                    "_fn": "amount_from_string"
                }
            ]
        }
    },
    "self_heal": false,
    "heal_status": "disabled",
    "last_healed_at": null,
    "created_at": "2025-10-27 09:28:37",
    "updated_at": "2025-10-27 09:28:37"
}
```

</details>
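
The two API steps can be chained: take the `parsing_instructions` from a generation response and wrap them in a preset body. The sketch below assumes both responses have the shapes shown in the sample outputs; the function names are hypothetical and no request is sent.

```python
import json

# Endpoint taken from the example above.
PRESETS_URL = "https://data.oxylabs.io/v1/parsers/presets"


def preset_body_from_generation(name: str, generation_response: dict) -> str:
    """Serialize a preset request body from a generate-instructions response."""
    if not name:
        raise ValueError("a preset name is required")
    instructions = generation_response["parsing_instructions"]
    return json.dumps({"name": name, "parsing_instructions": instructions})


def summarize_saved_preset(saved: dict) -> str:
    """Produce a one-line summary from the saved-preset response."""
    return f'{saved["name"]} (id={saved["id"]}, self_heal={saved["self_heal"]})'
```

For the sample output above, `summarize_saved_preset` returns `Bing_total_results (id=421938, self_heal=False)`.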

### Write instructions manually

To use Custom Parser manually, include a set of `parsing_instructions` when creating a job. You can use **CSS and XPath selectors** to target elements in the DOM.

Follow the step-by-step example below to learn the basics, then explore our in-depth guide on [writing instructions manually](https://developers.oxylabs.io/scraping-solutions/web-scraper-api/features/custom-parser/writing-instructions-manually) for advanced techniques and detailed documentation.

Let's take the Bing Search scenario as an example. The job parameters would look as follows:

```json
{
    "source": "bing_search",
    "query": "test",
    "render": "html",
    "parse": true,
    "parsing_instructions": {
        "number_of_results": {
            "_fns": [
                {
                    "_fn": "xpath_one",
                    "_args": [".//span[@class='sb_count']/text()"]
                }
            ]
        }
    }
}
```

**Step 1.** You must provide the `"parse": true` parameter.

**Step 2.** Parsing instructions must be described in the `"parsing_instructions"` field.

The sample parsing instructions above tell the parser to extract the number of search results from the scraped document and put the result in the `number_of_results` field. How the field is parsed is defined as a “pipeline”:

```json
"_fns": [
    {
        "_fn": "xpath_one",
        "_args": [".//span[@class='sb_count']/text()"]
    }
]
```

The pipeline describes a list of data processing functions to be executed. The functions run in the order they appear in the list, and each one takes the output of the previous function as its input.

In the sample pipeline above, the `xpath_one` function ([**full list of available functions**](https://developers.oxylabs.io/scraping-solutions/web-scraper-api/features/custom-parser/writing-instructions-manually/list-of-functions)) is used. It allows you to process an HTML document using XPath expressions and XSLT functions. As the function argument, specify the exact path where the target element can be found: `.//span[@class='sb_count']`. Appending `/text()` instructs the parser to select the text found in the target element.

The parsed result of the sample job above should look like this:

```json
{
    "results": [
        {
            "content": {
                "number_of_results": "About 16,700,000 results",
                "parse_status_code": 12000
            },
            "created_at": "2025-10-27 09:48:04",
            "updated_at": "2025-10-27 09:48:38",
            "page": 1,
            "url": "https://www.bing.com/search?q=test",
            "job_id": "7388511797231226881",
            "is_render_forced": false,
            "status_code": 200,
            "type": "parsed",
            "parser_type": "custom",
            "parser_preset": null
        }
    ]
}
```
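
To make the pipeline idea concrete, here is a local imitation in Python. It is not how Custom Parser is implemented: `run_pipeline`, `find_span_text`, and `strip_whitespace` are hypothetical stand-ins, and the sample document is a tiny well-formed snippet rather than a real Bing page. It only shows that each function receives the previous function's output.

```python
import xml.etree.ElementTree as ET
from typing import Any, Callable, List

# A toy, well-formed document standing in for the scraped page.
SAMPLE_HTML = (
    "<html><body>"
    "<span class='sb_count'>  About 16,700,000 results </span>"
    "</body></html>"
)


def find_span_text(document: str) -> str:
    """Stand-in for the selector step: return the text of the first matching span."""
    root = ET.fromstring(document)
    node = root.find(".//span[@class='sb_count']")
    if node is None:
        raise LookupError("selector did not match any element")
    return node.text or ""


def strip_whitespace(text: str) -> str:
    """A second step that tidies the output of the previous one."""
    return text.strip()


def run_pipeline(value: Any, functions: List[Callable[[Any], Any]]) -> Any:
    """Apply each function in order, feeding it the previous result."""
    for function in functions:
        value = function(value)
    return value


result = run_pipeline(SAMPLE_HTML, [find_span_text, strip_whitespace])
print(result)  # About 16,700,000 results
```

Swapping the order of the two functions would fail, because `find_span_text` expects a document while `strip_whitespace` expects already extracted text. The same ordering constraint applies to the functions listed in `_fns`.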

Custom Parser not only extracts text from scraped HTML but can also run basic data processing functions.

For example, the parsing instructions described earlier extract `number_of_results` as text that includes extra words you may not need. To get the number of results for the query `test` as a numeric value, reuse the same parsing instructions and add the `amount_from_string` function to the existing pipeline:

```json
{
    "source": "bing_search",
    "query": "test",
    "render": "html",
    "parse": true,
    "parsing_instructions": {
        "number_of_results": {
            "_fns": [
                {
                    "_fn": "xpath_one",
                    "_args": [".//span[@class='sb_count']/text()"]
                },
                {
                    "_fn": "amount_from_string"
                }
            ]
        }
    }
}
```

The parsed result of the sample job above should look like this:

```json
{
    "results": [
        {
            "content": {
                "number_of_results": 14200000,
                "parse_status_code": 12000
            },
            "created_at": "2025-10-27 10:00:36",
            "updated_at": "2025-10-27 10:01:05",
            "page": 1,
            "url": "https://www.bing.com/search?q=test",
            "job_id": "7388514950961963009",
            "is_render_forced": false,
            "status_code": 200,
            "type": "parsed",
            "parser_type": "custom",
            "parser_preset": null
        }
    ]
}
```
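
If you want to check locally what kind of conversion to expect, the following Python sketch approximates the effect on strings like the one in the earlier result. It is not the actual implementation of `amount_from_string`: the `approximate_amount` helper is hypothetical and assumes commas are used as thousands separators.

```python
import re
from typing import Optional, Union


def approximate_amount(text: str) -> Optional[Union[int, float]]:
    """Pull the first number out of a string, ignoring thousands separators."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    digits = match.group(0).replace(",", "")
    return float(digits) if "." in digits else int(digits)


print(approximate_amount("About 16,700,000 results"))  # 16700000
```

Strings without any digits yield `None` in this sketch; how the real function treats such input is not covered here.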

## What happens if parsing fails when using Custom Parser <a href="#what-happens-if-parsing-fails-when-using-custom-parser" id="what-happens-if-parsing-fails-when-using-custom-parser"></a>

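If Custom Parser fails to process client-defined parsing instructions, it returns the `12005` parse status code (parsed with warnings). In the job below, for example, the `number_of_organics` instructions use an XPath expression that does not match anything: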

```json
{
    "source": "bing_search",
    "query": "test",
    "render": "html",
    "parse": true,
    "parsing_instructions": {
        "number_of_results": {
            "_fns": [
                {
                    "_fn": "xpath_one",
                    "_args": [".//span[@class='sb_count']/text()"]
                },
                {
                    "_fn": "amount_from_string"
                }
            ]
        },
        "number_of_organics": {
            "_fns": [
                {
                    "_fn": "xpath",
                    "_args": ["//this-will-not-match-anything"]
                },
                {
                    "_fn": "length"
                }
            ]
        }
    }
}
```

You will be charged for such results. The response lists each problem under `_warnings` and sets the affected field to `null`:

```json
{
    "results": [
        {
            "content": {
                "_warnings": [
                    {
                        "_fn": "xpath",
                        "_msg": "XPath expressions did not match any data.",
                        "_path": ".number_of_organics",
                        "_fn_idx": 0
                    }
                ],
                "number_of_results": 14200000,
                "parse_status_code": 12005,
                "number_of_organics": null
            },
            "created_at": "2025-10-27 10:03:54",
            "updated_at": "2025-10-27 10:04:22",
            "page": 1,
            "url": "https://www.bing.com/search?q=test",
            "job_id": "7388515782126234625",
            "is_render_forced": false,
            "status_code": 200,
            "type": "parsed",
            "parser_type": "custom",
            "parser_preset": null
        }
    ]
}
```

If Custom Parser encounters an exception and breaks during the parsing operation, it can return one of these status codes: `12002`, `12006`, or `12007`. You will not be charged for these unexpected errors.
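
On the client side, you might branch on `parse_status_code` along these lines. This is a sketch, not an official client: `classify_result` is a hypothetical helper, it assumes `content` is a JSON object as in the samples above, and it treats `12000` as success because that is the code shown in the successful samples.

```python
from typing import List, Tuple

PARSED_OK = 12000
PARSED_WITH_WARNINGS = 12005
PARSER_FAILURES = {12002, 12006, 12007}


def classify_result(result: dict) -> Tuple[str, List[str]]:
    """Return a label for the parse outcome plus readable warning messages."""
    content = result.get("content") or {}
    code = content.get("parse_status_code")
    warnings = [
        f"{w.get('_path')}: {w.get('_msg')}"
        for w in content.get("_warnings", [])
    ]
    if code == PARSED_OK:
        return "ok", warnings
    if code == PARSED_WITH_WARNINGS:
        return "warnings", warnings
    if code in PARSER_FAILURES:
        return "parser_error", warnings
    return "unknown", warnings
```

For the sample response above, the function returns `("warnings", [".number_of_organics: XPath expressions did not match any data."])`, which you could log before deciding whether the partial result is still usable.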

## Status codes <a href="#status-codes" id="status-codes"></a>

For the full list, see the [**parser status codes**](https://developers.oxylabs.io/scraping-solutions/response-codes#parsers).
