# Generating parsing instructions via API

You can generate parsing instruction sets via API by providing URLs and describing what data points you would like to parse. Upon receiving the generated parsing instructions, you may save them as a [parser preset](https://developers.oxylabs.io/scraping-solutions/web-scraper-api/features/custom-parser/parser-presets) or simply send the instructions with your scraping request.

You can also generate parsing instructions via [OxyCopilot](https://developers.oxylabs.io/scraping-solutions/web-scraper-api/web-scraper-api-playground/oxycopilot) on our Web Scraper API Playground.

## Generate instructions from prompt

You can generate parsing instructions by inputting a free-text description of the data points you would like to parse and giving us a few URLs that belong to the same page type. The API will respond with a set of parsing instructions.

* **Endpoint**: `https://data.oxylabs.io/v1/parsers/generate-instructions/prompt`
* **Method**: `POST`
* **Authentication**: `Basic`
* **Request headers**: `Content-Type: application/json`

### Sample payload

```json
{ 
  "prompt_text": "Parse title of the product, main price, developer name and platform name.",
  "urls": [
    "https://sandbox.oxylabs.io/products/1",
    "https://sandbox.oxylabs.io/products/2",
    "https://sandbox.oxylabs.io/products/4"
  ],
  "render": false
}
```

<table><thead><tr><th width="177.1328125">Parameter</th><th>Description</th></tr></thead><tbody><tr><td><mark style="background-color:green;"><strong>prompt_text</strong></mark></td><td>Free-text description of the data points to be parsed.</td></tr><tr><td><mark style="background-color:green;"><strong>urls</strong></mark></td><td>List of URLs exemplifying the page type you would like to get parsing instructions for. We recommend providing 3-5 URLs to help the parser adapt to different layouts and improve parsing accuracy.</td></tr><tr><td><code>render</code></td><td>Whether or not JS rendering should be used to fetch the required content. </td></tr></tbody></table>

&#x20;    \- mandatory parameter

### Sample response

```json
{
    "parsing_instructions": {
        "developer_name": {
            "_fns": [
                {
                    "_args": [
                        "//div[contains(@class, 'brand-wrapper')]//span[@class='brand developer']"
                    ],
                    "_fn": "xpath"
                },
                {
                    "_args": [
                        "normalize-space(.)"
                    ],
                    "_fn": "xpath"
                },
                {
                    "_args": " ",
                    "_fn": "join"
                }
            ]
        },
        "main_price": {
            "_fns": [
                {
                    "_args": [
                        "//div[contains(@class, 'product-info-wrapper')]//div[contains(@class, 'price')]/text()"
                    ],
                    "_fn": "xpath_one"
                },
                {
                    "_fn": "amount_from_string"
                }
            ]
        },
        "title": {
            "_fns": [
                {
                    "_args": [
                        "//div[contains(@class, 'product-info-wrapper')]//h2/text()"
                    ],
                    "_fn": "xpath_one"
                },
                {
                    "_args": [
                        "^\\s*(.[\\s\\S]*?)\\s*$",
                        1
                    ],
                    "_fn": "regex_search"
                }
            ]
        }
    },
    "prompt_schema": {
        "properties": {
            "developer_name": {
                "description": "Developer name.",
                "title": "Developer Name",
                "type": "string"
            },
            "main_price": {
                "description": "Main price of the product.",
                "title": "Main Price",
                "type": "number"
            },
            "platform_name": {
                "description": "Platform name.",
                "title": "Platform Name",
                "type": "string"
            },
            "title": {
                "description": "Title of the product.",
                "title": "Title",
                "type": "string"
            }
        },
        "required": [
            "title",
            "main_price",
            "developer_name",
            "platform_name"
        ],
        "title": "Fields",
        "type": "object"
    }
}
```

## Generate instructions from JSON schema

There are cases where you want to get parsed data in a specific JSON schema. You can use this endpoint to get parsing instructions that stricly adhere to the schema you provide.

* **Endpoint**:  `https://data.oxylabs.io/v1/parsers/generate-instructions/schema`
* **Method**: `POST`
* **Authentication**: `Basic`
* **Request headers**: `Content-Type: application/json`

### Sample payload

```json
{
  "urls": [
    "https://oxylabs.io",
    "https://example.com",
    "https://bbc.co.uk"
  ],
  "prompt_schema": {
    "properties": {
      "links": {
        "description": "An array of URL strings",
        "type": "array",
        "items": {
          "type": "string",
          "description": "A URL"
        }
      }
    },
    "required": [
      "links"
    ]
  },
  "render": false
}
```

<table><thead><tr><th width="177.1328125">Parameter</th><th>Description</th></tr></thead><tbody><tr><td><mark style="background-color:green;"><strong>prompt_schema</strong></mark></td><td><a href="https://json-schema.org/">JSON schema</a> describing the required parser output.</td></tr><tr><td><mark style="background-color:green;"><strong>urls</strong></mark></td><td>List of URLs exemplifying the page type you would like to get parsing instructions for.</td></tr><tr><td><code>render</code></td><td>Whether or not JS rendering should be used to fetch the required content. </td></tr></tbody></table>

&#x20;    \- mandatory parameter

### Sample response

```json
{
    "parsing_instructions": {
            "links": {
                "_fns": [
                    {
                        "_args": [
                            "//a[@href and normalize-space(@href) != '']/@href"
                        ],
                        "_fn": "xpath"
                    }
                ]
            }
        }
}
```


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://developers.oxylabs.io/scraping-solutions/web-scraper-api/features/custom-parser/generating-parsing-instructions-via-api.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
