# Parsing function examples

## HTML processing

### `element_text`

#### Sample HTML

```html
<!DOCTYPE html>
<html>
<body>
    <div id="product">
        <div id="product-description">This is a nice product</div>
        <div id="product-price">    12  3


        </div>
    </div>
</body>
</html>
```

**Extract text from an HTML element and strip surrounding whitespace**

```json
{
    "price": {
        "_fns": [
            {
                "_fn": "xpath_one",
                "_args": [".//*[@id='product-price']"]
            },
            {
                "_fn": "element_text"
            }
        ]
    }
}
```

```json
{
    "price": "12  3"
}
```

**Given a string as input, return it unchanged**

```json
{
    "price": {
        "_fns": [
            {
                "_fn": "xpath_one",
                "_args": [".//*[@id='product-price']/text()"]
            },
            {
                "_fn": "element_text"
            }
        ]
    }
}
```

```json
{
    "price": "    12  3\n\n\n        "
}
```
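As a rough illustration of the two behaviours above, the following Python sketch mimics them with the standard library. The helper name `element_text` mirrors the parser function, but its body is an assumption made for illustration, not the parser's actual implementation.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the sample HTML above.
SAMPLE = """<div id="product">
    <div id="product-description">This is a nice product</div>
    <div id="product-price">    12  3


    </div>
</div>"""


def element_text(value):
    """Hypothetical stand-in: strip an element's text, pass strings through."""
    if isinstance(value, str):
        return value  # already a string: returned unchanged
    return "".join(value.itertext()).strip()


root = ET.fromstring(SAMPLE)
price_element = root.find(".//*[@id='product-price']")

print(repr(element_text(price_element)))       # element in, stripped text out
print(repr(element_text(price_element.text)))  # string in, same string out
```

Only leading and trailing whitespace is removed in this sketch; the two spaces inside `12  3` are kept, which matches the first result above.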

### `xpath`

#### Sample HTML

```html
<body>
    <div class="product" id="socks">
        <div class="title">Socks</div>
        <div class="price">123.12</div>
        <div class="description">
            <ul>
                <li class="description-item">Very</li>
                <li class="description-item">Nice</li>
                <li class="description-item">Socks</li>
            </ul>
        </div>
    </div>
</body>
```

**Get all description items**

```json
{
    "description_items": {
        "_fns": [
            {
                "_fn": "xpath",
                "_args": ["//li[@class='description-item']/text()"]
            }
        ]
    }
}
```

```json
{
    "description_items": ["Very", "Nice", "Socks"]
}
```
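For comparison, a minimal Python sketch of the same query using only the standard library. `xml.etree.ElementTree` supports a limited XPath subset without the `text()` step, so the text is read from each matched element instead; this is an approximation of the behaviour, not the parser's own code.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<body>
    <div class="product" id="socks">
        <div class="title">Socks</div>
        <div class="price">123.12</div>
        <div class="description">
            <ul>
                <li class="description-item">Very</li>
                <li class="description-item">Nice</li>
                <li class="description-item">Socks</li>
            </ul>
        </div>
    </div>
</body>"""

root = ET.fromstring(SAMPLE)

# findall() returns every match in document order, like `xpath` returns a list.
matches = root.findall(".//li[@class='description-item']")
description_items = [li.text for li in matches]
print(description_items)  # ['Very', 'Nice', 'Socks']
```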

**Get the first description item**

```json
{
    "first_description_item": {
        "_fns": [
            {
                "_fn": "xpath",
                "_args": ["(//li[@class='description-item'])[1]/text()"]
            }
        ]
    }
}
```

```json
{
    "first_description_item": [
        "Very"
    ]
}
```

**Check if the description section element exists**

```json
{
    "description_section_exists": {
        "_fns": [
            {
                "_fn": "xpath",
                "_args": ["boolean(//div[@class='description'])"]
            }
        ]
    }
}
```

```json
{
    "description_section_exists": true
}
```

**Get price as a number**

```json
{
    "price": {
        "_fns": [
            {
                "_fn": "xpath",
                "_args": ["number(//div[@class='price'])"]
            }
        ]
    }
}
```

```json
{
    "price": 123.12
}
```

**Multiple expressions to fall back on when a preceding expression finds nothing**

In this example the first expression matches nothing in the sample HTML, so the second expression supplies the price.

```json
{
    "price": {
        "_fns": [
            {
                "_fn": "xpath",
                "_args": [
                    "//div[@class='product-price']/text()",
                    "//div[@class='price']/text()"
                ]
            }
        ]
    }
}
```

```json
{
    "price": [
        "123.12"
    ]
}
```
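The fallback behaviour can be pictured as trying each expression in order and keeping the first result that is not empty. The sketch below assumes that "fails" means "matches nothing"; the helper `first_non_empty` is hypothetical, and the expressions select elements rather than `text()` nodes because `xml.etree.ElementTree` supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<div class="product" id="socks">
    <div class="title">Socks</div>
    <div class="price">123.12</div>
</div>"""


def first_non_empty(root, expressions):
    """Hypothetical helper: return matches of the first expression that finds anything."""
    for expression in expressions:
        matches = [element.text for element in root.findall(expression)]
        if matches:
            return matches
    return []  # no expression matched


root = ET.fromstring(SAMPLE)
price = first_non_empty(
    root,
    [".//div[@class='product-price']", ".//div[@class='price']"],
)
print(price)  # ['123.12']
```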

**XPath `|` operator to match with multiple expressions**

```json
{
    "price_and_title": {
        "_fns": [
            {
                "_fn": "xpath",
                "_args": ["//div[@class='price']/text() | //div[@class='title']/text()"]
            }
        ]
    }
}
```

```json
{
    "price_and_title": [
        "Socks",
        "123.12"
    ]
}
```

### `xpath_one`

#### Sample HTML

```html
<body>
    <div class="product" id="socks">
        <div class="title">Socks</div>
        <div class="price">123.12</div>
        <div class="description">
            <ul>
                <li class="description-item">Very</li>
                <li class="description-item">Nice</li>
                <li class="description-item">Socks</li>
            </ul>
        </div>
    </div>
</body>
```

**Return the first match**

```json
{
    "first_description_item": {
        "_fns": [
            {
                "_fn": "xpath_one",
                "_args": [".//li/text()"]
            }
        ]
    }
}
```

```json
{
    "first_description_item": "Very"
}
```

**Using XPath functions**

```json
{
    "price": {
        "_fns": [
            {
                "_fn": "xpath_one",
                "_args": ["number(.//div[@class='price'])"]
            }
        ]
    }
}
```

```json
{
    "price": 123.12
}
```

## String manipulation

### `amount_from_string`

#### Sample HTML

```html
<body>
    <div class="product" id="socks">
        <div class="title">Socks</div>
        <div class="price">The price is: 123.12 pesos</div>
    </div>
</body>
```

**Extract amount from string**

```json
{
    "price": {
        "_fns": [
            {
                "_fn": "xpath_one",
                "_args": [".//div[@class='price']/text()"]
            },
            {
                "_fn": "amount_from_string"
            }
        ]
    }
}
```

```json
{
    "price": 123.12
}
```

### `amount_range_from_string`

#### Sample HTML

```html
<body>
    <div class="product">
        <div class="price">
            The price is: 123.12 pesos;
            The price is: 345.12 pesos;
            The price is: 678.12 pesos
        </div>
    </div>
</body>
```

**Extract all amounts from string**

```json
{
    "prices": {
        "_fns": [
            {
                "_fn": "xpath_one",
                "_args": [".//div[@class='price']/text()"]
            },
            {
                "_fn": "amount_range_from_string"
            }
        ]
    }
}
```

```json
{
    "prices": [
        123.12,
        345.12,
        678.12
    ]
}
```
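The difference between `amount_from_string` (first amount only) and `amount_range_from_string` (every amount) can be sketched with a regular expression. The pattern and function bodies below are assumptions for illustration; the real functions may also handle thousands separators, currency symbols, or other number formats.

```python
import re

# Assumed pattern: digits with an optional decimal part.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def amount_from_string(text):
    """Hypothetical stand-in: first number in the text as a float, or None."""
    match = NUMBER.search(text)
    return float(match.group()) if match else None


def amount_range_from_string(text):
    """Hypothetical stand-in: every number in the text as a list of floats."""
    return [float(number) for number in NUMBER.findall(text)]


price_text = """
    The price is: 123.12 pesos;
    The price is: 345.12 pesos;
    The price is: 678.12 pesos
"""

print(amount_from_string(price_text))        # 123.12
print(amount_range_from_string(price_text))  # [123.12, 345.12, 678.12]
```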

### `join`

#### Sample HTML

```html
<body>
    <div class="product">
        <div class="price">
            The price is: 123.12 pesos;
        </div>
        <div class="price">
            The price is: 345.12 pesos;
        </div>
        <div class="price">
            The price is: 678.12 pesos
        </div>
    </div>
</body>
```

**Join an array of strings into a single string**

`normalize-space()` is applied in a second `xpath` step, once per matched element. Calling it in the first pipeline function would return only the first value.

```json
{
    "price_variants": {
        "_fns": [
            {
                "_fn": "xpath",
                "_args": [".//div[@class='price']"]
            },
            {
                "_fn": "xpath",
                "_args": ["normalize-space(text())"]
            },
            {
                "_fn": "join",
                "_args": ""
            }
        ]
    }
}
```

```json
{
    "price_variants": "The price is: 123.12 pesos;The price is: 345.12 pesos;The price is: 678.12 pesos"
}
```
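In plain Python, the last two pipeline steps correspond roughly to normalizing each string and then concatenating the list. In this sketch `" ".join(text.split())` only approximates XPath's `normalize-space()`, and the input strings are typed in by hand rather than extracted from the HTML.

```python
# Raw text of the three price elements, including surrounding whitespace.
price_texts = [
    "\n            The price is: 123.12 pesos;\n        ",
    "\n            The price is: 345.12 pesos;\n        ",
    "\n            The price is: 678.12 pesos\n        ",
]

# Trim each string and collapse internal runs of whitespace.
normalized = [" ".join(text.split()) for text in price_texts]

# Join with an empty separator, as in the `join` step above.
price_variants = "".join(normalized)
print(price_variants)
```

Passing a different separator, for example `" | ".join(normalized)`, would keep the three variants visually apart.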

### `regex_find_all` <a href="#regex_find_all" id="regex_find_all"></a>

#### Sample HTML

```html
<body>
    <div class="product">
        <div class="description">
            [one description]
            [two description]
            [three description]
        </div>
    </div>
</body>
```

**Find all matches between two characters**

```json
{
    "descriptions": {
        "_fns": [
            {
                "_fn": "xpath_one",
                "_args": [".//div[@class='description']/text()"]
            },
            {
                "_fn": "regex_find_all",
                "_args": ["\\[(.*)\\]"]
            }
        ]
    }
}
```

```json
{
    "descriptions": [
        "one description",
        "two description",
        "three description"
    ]
}
```
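The same pattern can be tried out with Python's `re.findall`, which returns the captured group for every match. This is a sketch for experimenting with the expression and assumes the parser's regex dialect behaves like Python's here.

```python
import re

description_text = """
            [one description]
            [two description]
            [three description]
        """

# One capture group, so findall() returns only the text between the brackets.
descriptions = re.findall(r"\[(.*)\]", description_text)
print(descriptions)  # ['one description', 'two description', 'three description']
```

The greedy `.*` is safe in this sample because `.` does not match a newline and each line holds one bracketed item. With several items on one line, a non-greedy `.*?` would be needed to keep them separate.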

### `regex_search` <a href="#regex_search" id="regex_search"></a>

#### Sample HTML

```html
<body>
    <div class="product">
        <div class="description">
            [one description]
            [two description]
            [three description]
            {the one i need}
        </div>
    </div>
</body>
```

**Return description between two characters**

```json
{
    "description": {
        "_fns": [
            {
                "_fn": "xpath_one",
                "_args": [".//div[@class='description']/text()"]
            },
            {
                "_fn": "regex_search",
                "_args": ["{(.*)}", 1]
            }
        ]
    }
}
```

```json
{
    "description": "the one i need"
}
```
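A comparable lookup in Python uses `re.search` and picks a capture group by index. The sketch assumes that the second argument (`1`) in the instructions above selects capture group 1. The braces are escaped here to make it explicit that they are literal characters.

```python
import re

description_text = """
            [one description]
            [two description]
            [three description]
            {the one i need}
        """

# search() stops at the first match; group(1) is the text inside the braces.
match = re.search(r"\{(.*)\}", description_text)
description = match.group(1) if match else None
print(description)  # the one i need
```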

### `regex_substring`

#### Sample HTML

```html
<body>
    <div class="product">
        <div class="description">
            * one description
            * two description
            * three description
            * {this one i would like to get replaced}
        </div>
    </div>
</body>
```

**Replace a part of text with specified value**

```json
{
    "descriptions": {
        "_fns": [
            {
                "_fn": "xpath_one",
                "_args": [".//div[@class='description']/text()"]
            },
            {
                "_fn": "regex_substring",
                "_args": ["{this one i would like to get replaced}", "four description"]
            },
            {
                "_fn": "regex_find_all",
                "_args": ["\\*\\s(.*)\n"]
            }
        ]
    }
}
```

```json
{
    "descriptions": [
        "one description",
        "two description",
        "three description",
        "four description"
    ]
}
```
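The replace-then-extract pipeline above maps onto `re.sub` followed by `re.findall` in Python. This is a sketch under the assumption that `regex_substring` substitutes every match of the pattern with the given value; the braces are escaped so they are read as literal characters.

```python
import re

description_text = """
            * one description
            * two description
            * three description
            * {this one i would like to get replaced}
        """

# Step 1: swap the placeholder for the replacement value.
replaced = re.sub(
    r"\{this one i would like to get replaced\}",
    "four description",
    description_text,
)

# Step 2: capture the text after each "* " up to the end of the line.
descriptions = re.findall(r"\*\s(.*)\n", replaced)
print(descriptions)
```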

## Common functions

### `length`

#### Sample HTML

```html
<body>
    <div class="product">
        <div class="price">123</div>
        <div class="price">124</div>
        <div class="price">456</div>
        <div class="price">421</div>
        <div class="price">100</div>
    </div>
</body>
```

**Get the count of price variants**

```json
{
    "price_variants": {
        "_fns": [
            {
                "_fn": "xpath",
                "_args": [".//div[@class='price']"]
            },
            {
                "_fn": "length"
            }
        ]
    }
}
```

```json
{
    "price_variants": 5
}
```

**Get the number of options per property in a multi-dimensional array**

Sample HTML:

```html
<body>
    <div class="product">
        <property class="colors">
            <option class="color">Red</option>
            <option class="color">Green</option>
            <option class="color">Blue</option>
        </property>
        <property class="sizes">
            <option class="size">S</option>
            <option class="size">M</option>
            <option class="size">L</option>
            <option class="size">XL</option>
        </property>
    </div>
</body>
```

```json
{
    "number_of_variants": {
        "_fns": [
            {
                "_fn": "xpath",
                "_args": [".//property"]
            },
            {
                "_fn": "xpath",
                "_args": [".//option"]
            },
            {
                "_fn": "length"
            }
        ]
    }
}
```

```json
{
    "number_of_variants": [
        3,
        4
    ]
}
```

### `select_nth`

#### Sample HTML

```html
<body>
    <div class="product" id="socks">
        <div class="title">Socks</div>
        <div class="price">123.12</div>
        <div class="description">
            <ul>
                <li class="description-item">Very</li>
                <li class="description-item">Nice</li>
                <li class="description-item">Socks</li>
            </ul>
        </div>
    </div>
</body>
```

**Select the first description item from the array**

```json
{
    "first_description_item": {
        "_fns": [
            {
                "_fn": "xpath",
                "_args": ["//li[@class='description-item']/text()"]
            },
            {
                "_fn": "select_nth",
                "_args": 0
            }
        ]
    }
}
```

```json
{
    "first_description_item": "Very"
}
```

**Select the last description item from the array**

```json
{
    "last_description_item": {
        "_fns": [
            {
                "_fn": "xpath",
                "_args": ["//li[@class='description-item']/text()"]
            },
            {
                "_fn": "select_nth",
                "_args": -1
            }
        ]
    }
}
```

```json
{
    "last_description_item": "Socks"
}
```
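Judging by the two examples above, the index is zero-based and a negative index counts from the end, the same convention as Python list indexing. The helper below is a hypothetical stand-in that makes the convention concrete.

```python
def select_nth(items, index):
    """Hypothetical stand-in: zero-based index, negative values count from the end."""
    return items[index]


description_items = ["Very", "Nice", "Socks"]

print(select_nth(description_items, 0))   # Very
print(select_nth(description_items, -1))  # Socks
```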

## Math functions

### `average`

#### Sample HTML

```html
<body>
    <div class="product">
        <div class="price">123</div>
        <div class="price">124</div>
        <div class="price">456</div>
        <div class="price">421</div>
        <div class="price">100</div>
    </div>
</body>
```

**Find the average of all listed prices**

```json
{
    "price_average": {
        "_fns": [
            {
                "_fn": "xpath",
                "_args": [".//div[@class='price']"]
            },
            {
                "_fn": "xpath_one",
                "_args": ["number(text())"]
            },
            {
                "_fn": "average"
            }
        ]
    }
}
```

```json
{
    "price_average": 244.8
}
```

### `max`

#### Sample HTML

```html
<body>
    <div class="product">
        <div class="price">123</div>
        <div class="price">124</div>
        <div class="price">456</div>
        <div class="price">421</div>
        <div class="price">100</div>
    </div>
</body>
```

**Find the max of all listed prices**

```json
{
    "price_max": {
        "_fns": [
            {
                "_fn": "xpath",
                "_args": [".//div[@class='price']"]
            },
            {
                "_fn": "xpath_one",
                "_args": ["number(text())"]
            },
            {
                "_fn": "max"
            }
        ]
    }
}
```

```json
{
    "price_max": 456.0
}
```

### `min`

#### Sample HTML

```html
<body>
    <div class="product">
        <div class="price">123</div>
        <div class="price">124</div>
        <div class="price">456</div>
        <div class="price">421</div>
        <div class="price">100</div>
    </div>
</body>
```

**Find the min of all listed prices**

```json
{
    "price_min": {
        "_fns": [
            {
                "_fn": "xpath",
                "_args": [".//div[@class='price']"]
            },
            {
                "_fn": "xpath_one",
                "_args": ["number(text())"]
            },
            {
                "_fn": "min"
            }
        ]
    }
}
```

```json
{
    "price_min": 100.0
}
```
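The `average`, `max`, and `min` pipelines above all follow the same shape: select the price elements, convert each one to a number, then aggregate the list. A standard-library Python sketch of that shape, offered as an illustration rather than the parser's implementation:

```python
import statistics
import xml.etree.ElementTree as ET

SAMPLE = """<div class="product">
    <div class="price">123</div>
    <div class="price">124</div>
    <div class="price">456</div>
    <div class="price">421</div>
    <div class="price">100</div>
</div>"""

root = ET.fromstring(SAMPLE)

# Convert each element's text to a float, like number(text()) per element.
prices = [float(div.text) for div in root.findall(".//div[@class='price']")]

print(statistics.mean(prices))  # 244.8
print(max(prices))              # 456.0
print(min(prices))              # 100.0
```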

### `product`

#### Sample HTML

```html
<body>
    <div class="product">
        <property class="colors">
            <option class="color">Red</option>
            <option class="color">Green</option>
            <option class="color">Blue</option>
        </property>
        <property class="sizes">
            <option class="size">S</option>
            <option class="size">M</option>
            <option class="size">L</option>
            <option class="size">XL</option>
        </property>
    </div>
</body>
```

**Get the count of different product variants**

```json
{
    "number_of_variants": {
        "_fns": [
            {
                "_fn": "xpath",
                "_args": [".//property"]
            },
            {
                "_fn": "xpath",
                "_args": [".//option"]
            },
            {
                "_fn": "length"
            },
            {
                "_fn": "product"
            }
        ]
    }
}
```

```json
{
    "number_of_variants": 12
}
```
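To see where `12` comes from, the sketch below counts the options under each `property` element and multiplies the counts. It uses only the Python standard library and approximates the `length` and `product` steps; it is not the parser's own code.

```python
import math
import xml.etree.ElementTree as ET

SAMPLE = """<div class="product">
    <property class="colors">
        <option class="color">Red</option>
        <option class="color">Green</option>
        <option class="color">Blue</option>
    </property>
    <property class="sizes">
        <option class="size">S</option>
        <option class="size">M</option>
        <option class="size">L</option>
        <option class="size">XL</option>
    </property>
</div>"""

root = ET.fromstring(SAMPLE)

# One count per property: the nested list is reduced level by level.
options_per_property = [
    len(prop.findall(".//option")) for prop in root.findall(".//property")
]
print(options_per_property)             # [3, 4]

# Multiplying the counts gives the number of color/size combinations.
print(math.prod(options_per_property))  # 12
```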

