# Custom Parser

Custom Parser is a free Web Scraper API feature that lets you **define parsing and data-processing logic** that runs on a raw HTML result. You can generate parsers automatically with AI or write them manually for advanced scenarios.

For detailed instructions and examples, see these pages:

<a href="/pages/d992d3d04e5670b1c1228f0aa09dfca9b380b5ce" class="button secondary" data-icon="flag-checkered">Getting started</a>  <a href="/pages/719948981073d1c60f031d22cc44705324b9215a" class="button secondary" data-icon="brain-circuit">Generating parsers via API</a>  <a href="/pages/4ba54a8ff4ae4b9bb63d0821b98b2e5c46b12798" class="button secondary" data-icon="layer-group">Parser presets</a>

<a href="/pages/506654b2aca075244a93441f524ae8bda80eac28" class="button secondary" data-icon="code">Writing instructions manually</a>  <a href="/pages/d79a67118b9a83d04625db94c5a04803156f17bd" class="button secondary" data-icon="list-ul">List of parsing functions</a>

***

## Quick start

### 1. Generate a parser

We recommend starting with our AI-powered tool, [**OxyCopilot**](https://developers.oxylabs.io/scraping-solutions/web-scraper-api/web-scraper-api-playground/oxycopilot), which lets you generate scrapers and parsers without writing code.

{% hint style="success" %}
To access OxyCopilot, log in to the [**Oxylabs dashboard**](https://developers.oxylabs.io/scraping-solutions/web-scraper-api/web-scraper-api-playground/oxycopilot) and select **Scraper APIs Playground** from the left-hand menu.
{% endhint %}

Follow the steps shown in the video to **generate a parser**:

{% embed url="<https://files.gitbook.com/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FMv1sqaKQeb6ZUqst9Ehp%2Fgenerate_parser.mp4?alt=media&token=9e35fa02-842d-48da-bb52-4e2c7f9d186e>" %}

These are the same steps shown in the video:

1. **Enter the URL(s)** you want to scrape and parse
2. **Specify any parameters**, such as JavaScript rendering
3. **Write a prompt** describing what you want to parse
4. **Run** OxyCopilot

Once you are satisfied with the generated parser, load the instructions.

### 2. Save the parser as a preset

You can easily save parsers generated with OxyCopilot for later use. See the steps below:

{% embed url="<https://files.gitbook.com/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FrZw97isbhLa2Du9V5oKd%2Fsave_preset.mp4?alt=media&token=d7e9c4b5-755c-4175-9cb5-83c29ec37810>" %}

1. **Assign the preset** to a specific API user
2. Click **Save**
3. **Enter the preset name** and an optional description

After saving the preset, you can use it with API requests.

### 3. Use the parser with API requests

To use your preset with Web Scraper API, send a payload with the `parser_preset` parameter set to the name of your preset. The code examples below reuse the `example_parser` preset created in the previous steps.

{% tabs %}
{% tab title="cURL" %}

```shell
curl 'https://realtime.oxylabs.io/v1/queries' \
--user 'USERNAME:PASSWORD' \
-H 'Content-Type: application/json' \
-d '{
        "source": "universal",
        "url": "https://example.com/",
        "parse": true,
        "parser_preset": "example_parser"
    }'
```

{% endtab %}

{% tab title="Python" %}

```python
import requests
from pprint import pprint


# Set the parser preset to use.
payload = {
    'source': 'universal',
    'url': 'https://example.com/',
    'parse': True,
    'parser_preset': 'example_parser'
}

# Get a response.
response = requests.request(
    'POST',
    'https://realtime.oxylabs.io/v1/queries',
    auth=('USERNAME', 'PASSWORD'),
    json=payload
)

# Print prettified response to stdout.
pprint(response.json())
```

{% endtab %}

{% tab title="Node.js" %}

```javascript
const https = require("https");

const username = "USERNAME";
const password = "PASSWORD";
const body = {
    source: "universal",
    url: "https://example.com/",
    parse: true,
    parser_preset: "example_parser"
};

const options = {
    hostname: "realtime.oxylabs.io",
    path: "/v1/queries",
    method: "POST",
    headers: {
        "Content-Type": "application/json",
        Authorization:
            "Basic " + Buffer.from(`${username}:${password}`).toString("base64"),
    },
};

const request = https.request(options, (response) => {
    let data = "";

    response.on("data", (chunk) => {
        data += chunk;
    });

    response.on("end", () => {
        const responseData = JSON.parse(data);
        console.log(JSON.stringify(responseData, null, 2));
    });
});

request.on("error", (error) => {
    console.error("Error:", error);
});

request.write(JSON.stringify(body));
request.end();
```

{% endtab %}

{% tab title="HTTP" %}

```http
# The entire string you submit must be URL-encoded.

https://realtime.oxylabs.io/v1/queries?source=universal&url=https%3A%2F%2Fexample.com%2F&parse=true&parser_preset=example_parser&access_token=12345abcde
```
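
If you build the GET URL programmatically, a standard library helper can handle the URL encoding for you. The following is a minimal Python sketch, not an official client: `build_query_url` is a hypothetical helper name, and the `access_token` value is the placeholder from the example above.

```python
from urllib.parse import urlencode

BASE_URL = "https://realtime.oxylabs.io/v1/queries"


def build_query_url(params: dict) -> str:
    """Return BASE_URL with params appended as a URL-encoded query string."""
    # urlencode percent-encodes reserved characters such as ':' and '/'.
    return f"{BASE_URL}?{urlencode(params)}"


params = {
    "source": "universal",
    "url": "https://example.com/",
    "parse": "true",
    "parser_preset": "example_parser",
    "access_token": "12345abcde",  # placeholder token, replace with your own
}

query_url = build_query_url(params)
print(query_url)
```

With these parameters, the printed URL matches the encoded string shown above.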

{% endtab %}

{% tab title="PHP" %}

```php
<?php

$params = array(
    'source' => 'universal',
    'url' => 'https://example.com/',
    'parse' => true,
    'parser_preset' => 'example_parser'
);

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, "https://realtime.oxylabs.io/v1/queries");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($params));
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_USERPWD, "USERNAME" . ":" . "PASSWORD");

$headers = array();
$headers[] = "Content-Type: application/json";
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

$result = curl_exec($ch);
echo $result;

if (curl_errno($ch)) {
    echo 'Error:' . curl_error($ch);
}
curl_close($ch);
```

{% endtab %}

{% tab title="Golang" %}

```go
package main

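import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	const Username = "USERNAME"
	const Password = "PASSWORD"

	payload := map[string]interface{}{
		"source":        "universal",
		"url":           "https://example.com/",
		"parse":         true,
		"parser_preset": "example_parser",
	}

	jsonValue, err := json.Marshal(payload)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}

	client := &http.Client{}
	request, err := http.NewRequest("POST",
		"https://realtime.oxylabs.io/v1/queries",
		bytes.NewBuffer(jsonValue),
	)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}

	// Declare the JSON body and attach the credentials.
	request.Header.Set("Content-Type", "application/json")
	request.SetBasicAuth(Username, Password)

	response, err := client.Do(request)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	defer response.Body.Close()

	// io.ReadAll replaces the deprecated ioutil.ReadAll (Go 1.16+).
	responseText, err := io.ReadAll(response.Body)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println(string(responseText))
}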

```

{% endtab %}

{% tab title="C#" %}

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;

namespace OxyApi
{
    class Program
    {
        static async Task Main()
        {
            const string Username = "USERNAME";
            const string Password = "PASSWORD";

            var parameters = new {
                source = "universal",
                url = "https://example.com/",
                parse = true,
                parser_preset = "example_parser"
            };

            var client = new HttpClient();

            Uri baseUri = new Uri("https://realtime.oxylabs.io");
            client.BaseAddress = baseUri;

            var requestMessage = new HttpRequestMessage(HttpMethod.Post, "/v1/queries");
            requestMessage.Content = JsonContent.Create(parameters);

            var authenticationString = $"{Username}:{Password}";
            var base64EncodedAuthenticationString = Convert.ToBase64String(System.Text.ASCIIEncoding.UTF8.GetBytes(authenticationString));
            requestMessage.Headers.Add("Authorization", "Basic " + base64EncodedAuthenticationString);

            var response = await client.SendAsync(requestMessage);
            var contents = await response.Content.ReadAsStringAsync();

            Console.WriteLine(contents);
        }
    }
}
```

{% endtab %}

{% tab title="Java" %}

```java
package org.example;

import okhttp3.*;
import org.json.JSONObject;
import java.util.concurrent.TimeUnit;

public class Main implements Runnable {
    private static final String AUTHORIZATION_HEADER = "Authorization";
    public static final String USERNAME = "USERNAME";
    public static final String PASSWORD = "PASSWORD";

    public void run() {
        JSONObject jsonObject = new JSONObject();
        jsonObject.put("source", "universal");
        jsonObject.put("url", "https://example.com/");
        jsonObject.put("parse", true);
        jsonObject.put("parser_preset", "example_parser");

        Authenticator authenticator = (route, response) -> {
            String credential = Credentials.basic(USERNAME, PASSWORD);
            return response
                    .request()
                    .newBuilder()
                    .header(AUTHORIZATION_HEADER, credential)
                    .build();
        };

        var client = new OkHttpClient.Builder()
                .authenticator(authenticator)
                .readTimeout(180, TimeUnit.SECONDS)
                .build();

        var mediaType = MediaType.parse("application/json; charset=utf-8");
        var body = RequestBody.create(jsonObject.toString(), mediaType);
        var request = new Request.Builder()
                .url("https://realtime.oxylabs.io/v1/queries")
                .post(body)
                .build();

        try (var response = client.newCall(request).execute()) {
            if (response.body() != null) {
                try (var responseBody = response.body()) {
                    System.out.println(responseBody.string());
                }
            }
        } catch (Exception exception) {
            System.out.println("Error: " + exception.getMessage());
        }

        System.exit(0);
    }

    public static void main(String[] args) {
        new Thread(new Main()).start();
    }
}
```

{% endtab %}

{% tab title="JSON" %}

```json
{
    "source": "universal",
    "url": "https://example.com/",
    "parse": true,
    "parser_preset": "example_parser"
}
```

{% endtab %}
{% endtabs %}

<details>

<summary>Output example</summary>

```json
{
  "results": [
    {
      "content": {
        "title": "Example Domain",
        "parse_status_code": 12000
      },
      "created_at": "2025-10-24 10:04:59",
      "updated_at": "2025-10-24 10:05:00",
      "page": 1,
      "url": "https://example.com/",
      "job_id": "7387428891226308609",
      "is_render_forced": false,
      "status_code": 200,
      "type": "parsed",
      "parser_type": "preset",
      "parser_preset": "example_parser"
    }
  ]
}
```

</details>
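
Once a response like the sample output arrives, you will usually want only the parsed fields. The sketch below is one possible way to pull them out in Python. It assumes the response has the shape shown in the sample, and it treats the `parse_status_code` value `12000` from that sample as the success marker, which is an assumption here. `extract_parsed_fields` is a hypothetical helper name.

```python
import json
from typing import Any, Dict, List

# Assumption: 12000 is the value shown in the sample output for a good parse.
SUCCESS_PARSE_STATUS = 12000


def extract_parsed_fields(response_json: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect the parsed fields of every result that reports a successful parse."""
    parsed = []
    for result in response_json.get("results", []):
        content = result.get("content")
        # Skip results whose content is not a parsed object (for example raw HTML).
        if not isinstance(content, dict):
            continue
        if content.get("parse_status_code") != SUCCESS_PARSE_STATUS:
            continue
        # Drop the status code so only the fields defined by the preset remain.
        fields = {k: v for k, v in content.items() if k != "parse_status_code"}
        parsed.append(fields)
    return parsed


sample = json.loads("""
{
  "results": [
    {
      "content": {"title": "Example Domain", "parse_status_code": 12000},
      "status_code": 200,
      "type": "parsed",
      "parser_preset": "example_parser"
    }
  ]
}
""")

print(extract_parsed_fields(sample))
```

Filtering on the status code keeps failed or partial parses out of downstream processing. Adjust the check if your workflow needs to inspect those results as well.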

## Getting the HTML content of a parsed job

You can also retrieve the raw HTML result by appending `?type=raw` to the end of the result retrieval URL. Read more [**here**](/products/es/web-scraper-api/integration-methods/push-pull.md#endpoints).
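
As a rough illustration, the Python sketch below adds `type=raw` to a result retrieval URL, replacing any `type` value already present. The URL is a placeholder and `with_raw_type` is a hypothetical helper name. Use the actual retrieval URL described on the linked page.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_raw_type(results_url: str) -> str:
    """Return results_url with its query string set to request the raw result."""
    parts = urlsplit(results_url)
    # Keep existing query parameters, but drop any previous "type" value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "type"
    ]
    query.append(("type", "raw"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL for illustration only, not a real endpoint.
example_url = "https://api.example.com/v1/queries/JOB_ID/results"
print(with_raw_type(example_url))
```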


