# Any Domain

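Explore our Web Scraper API guides tailored for [e-commerce websites](/api-targets/cn/dian-zi-shang-wu.md), [search engines](/api-targets/cn/sou-suo-yin-qing.md), [LLMs and AI](/api-targets/cn/llm-he-ai.md), [video data](/api-targets/cn/shi-pin-yu-she-jiao-mei-ti.md), and [real estate](/api-targets/cn/fang-di-chan.md) platforms, or follow the guide below to scrape any other page with our `universal` source. It accepts a URL along with [additional parameters](#additional).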

## Request samples

In this example, the API retrieves an e-commerce product page.

{% tabs %}
{% tab title="cURL" %}

```shell
curl 'https://realtime.oxylabs.io/v1/queries' \
--user 'USERNAME:PASSWORD' \
-H 'Content-Type: application/json' \
-d '{
        "source": "universal",
        "url": "https://sandbox.oxylabs.io/products/1"
    }'
```

{% endtab %}

{% tab title="Python" %}

```python
import requests
from pprint import pprint


# Structure the payload.
payload = {
    'source': 'universal',
    'url': 'https://sandbox.oxylabs.io/products/1',
}

# Get the response.
response = requests.request(
    'POST',
    'https://realtime.oxylabs.io/v1/queries',
    auth=('USERNAME', 'PASSWORD'),
    json=payload,
)

# Instead of a response with the job status and a results URL,
# this returns a JSON response that contains the results.
pprint(response.json())
```

{% endtab %}

{% tab title="Node.js" %}

```javascript
const https = require("https");

const username = "USERNAME";
const password = "PASSWORD";
const body = {
    source: "universal",
    url: "https://sandbox.oxylabs.io/products/1",
};

const options = {
    hostname: "realtime.oxylabs.io",
    path: "/v1/queries",
    method: "POST",
    headers: {
        "Content-Type": "application/json",
        Authorization:
            "Basic " + Buffer.from(`${username}:${password}`).toString("base64"),
    },
};

const request = https.request(options, (response) => {
    let data = "";

    response.on("data", (chunk) => {
        data += chunk;
    });

    response.on("end", () => {
        const responseData = JSON.parse(data);
        console.log(JSON.stringify(responseData, null, 2));
    });
});

request.on("error", (error) => {
    console.error("Error:", error);
});

request.write(JSON.stringify(body));
request.end();
```

{% endtab %}

{% tab title="HTTP" %}

```http
# The whole string you submit must be URL-encoded.

https://realtime.oxylabs.io/v1/queries?source=universal&url=https%3A%2F%2Fsandbox.oxylabs.io%2Fproducts%2F1&access_token=12345abcde
```

{% endtab %}

{% tab title="PHP" %}

```php
<?php

$params = array(
    'source' => 'universal',
    'url' => 'https://sandbox.oxylabs.io/products/1',
);

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, "https://realtime.oxylabs.io/v1/queries");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($params));
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_USERPWD, "USERNAME" . ":" . "PASSWORD");

$headers = array();
$headers[] = "Content-Type: application/json";
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

$result = curl_exec($ch);
echo $result;

if (curl_errno($ch)) {
    echo 'Error:' . curl_error($ch);
}
curl_close($ch);
```

{% endtab %}

{% tab title="Golang" %}

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	const Username = "USERNAME"
	const Password = "PASSWORD"

	payload := map[string]interface{}{
		"source": "universal",
		"url":    "https://sandbox.oxylabs.io/products/1",
	}

	// Serialize the payload to JSON.
	jsonValue, err := json.Marshal(payload)
	if err != nil {
		log.Fatal(err)
	}

	client := &http.Client{}
	request, err := http.NewRequest("POST",
		"https://realtime.oxylabs.io/v1/queries",
		bytes.NewBuffer(jsonValue),
	)
	if err != nil {
		log.Fatal(err)
	}

	request.Header.Set("Content-Type", "application/json")
	request.SetBasicAuth(Username, Password)

	response, err := client.Do(request)
	if err != nil {
		log.Fatal(err)
	}
	defer response.Body.Close()

	// Read and print the raw JSON response.
	responseText, err := io.ReadAll(response.Body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(responseText))
}

```

{% endtab %}

{% tab title="C#" %}

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;

namespace OxyApi
{
    class Program
    {
        static async Task Main()
        {
            const string Username = "USERNAME";
            const string Password = "PASSWORD";

            var parameters = new {
                source = "universal",
                url = "https://sandbox.oxylabs.io/products/1"
            };

            var client = new HttpClient();

            Uri baseUri = new Uri("https://realtime.oxylabs.io");
            client.BaseAddress = baseUri;

            var requestMessage = new HttpRequestMessage(HttpMethod.Post, "/v1/queries");
            requestMessage.Content = JsonContent.Create(parameters);

            var authenticationString = $"{Username}:{Password}";
            var base64EncodedAuthenticationString = Convert.ToBase64String(System.Text.ASCIIEncoding.UTF8.GetBytes(authenticationString));
            requestMessage.Headers.Add("Authorization", "Basic " + base64EncodedAuthenticationString);

            var response = await client.SendAsync(requestMessage);
            var contents = await response.Content.ReadAsStringAsync();

            Console.WriteLine(contents);
        }
    }
}
```

{% endtab %}

{% tab title="Java" %}

```java
package org.example;

import okhttp3.*;
import org.json.JSONObject;
import java.util.concurrent.TimeUnit;

public class Main implements Runnable {
    private static final String AUTHORIZATION_HEADER = "Authorization";
    public static final String USERNAME = "USERNAME";
    public static final String PASSWORD = "PASSWORD";

    public void run() {
        JSONObject jsonObject = new JSONObject();
        jsonObject.put("source", "universal");
        jsonObject.put("url", "https://sandbox.oxylabs.io/products/1");

        Authenticator authenticator = (route, response) -> {
            String credential = Credentials.basic(USERNAME, PASSWORD);
            return response
                    .request()
                    .newBuilder()
                    .header(AUTHORIZATION_HEADER, credential)
                    .build();
        };

        var client = new OkHttpClient.Builder()
                .authenticator(authenticator)
                .readTimeout(180, TimeUnit.SECONDS)
                .build();

        var mediaType = MediaType.parse("application/json; charset=utf-8");
        var body = RequestBody.create(jsonObject.toString(), mediaType);
        var request = new Request.Builder()
                .url("https://realtime.oxylabs.io/v1/queries")
                .post(body)
                .build();

        try (var response = client.newCall(request).execute()) {
            if (response.body() != null) {
                try (var responseBody = response.body()) {
                    System.out.println(responseBody.string());
                }
            }
        } catch (Exception exception) {
            System.out.println("Error: " + exception.getMessage());
        }

        System.exit(0);
    }

    public static void main(String[] args) {
        new Thread(new Main()).start();
    }
}
```

{% endtab %}

{% tab title="JSON" %}

```json
{
    "source": "universal", 
    "url": "https://sandbox.oxylabs.io/products/1"
}
```

{% endtab %}
{% endtabs %}
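
The HTTP tab above notes that the whole query string must be URL-encoded. As a minimal sketch, Python's standard library can produce such a URL. The `build_query_url` helper name is an illustration only, and `12345abcde` is the placeholder token from the sample, not a real credential.

```python
from urllib.parse import urlencode

ENDPOINT = "https://realtime.oxylabs.io/v1/queries"


def build_query_url(params: dict) -> str:
    """Return the endpoint URL with URL-encoded query parameters (hypothetical helper)."""
    # urlencode percent-encodes reserved characters such as ':' and '/'.
    return f"{ENDPOINT}?{urlencode(params)}"


query_url = build_query_url({
    "source": "universal",
    "url": "https://sandbox.oxylabs.io/products/1",
    "access_token": "12345abcde",  # placeholder token from the sample above
})
print(query_url)
```

Encoding the parameters programmatically avoids hand-escaping mistakes when the target URL itself contains a query string.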

<details>

<summary>Output example</summary>

```json
{
    "results": [
        {
            "content": "<!DOCTYPE html><html lang=\"en\">
            Content
            </html>",
            "created_at": "2024-07-01 11:35:14",
            "updated_at": "2024-07-01 11:35:15",
            "page": 1,
            "url": "https://sandbox.oxylabs.io/products/1",
            "job_id": "7213505428280329217",
            "status_code": 200
        }
    ]
}
```

</details>
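
As a rough sketch of working with this output, the snippet below pulls the scraped HTML out of a response shaped like the example above. The `extract_html` helper is hypothetical, and treating only `status_code` 200 as usable is an assumption made for illustration.

```python
from typing import List


def extract_html(response_json: dict) -> List[str]:
    """Collect the `content` of every result whose `status_code` is 200."""
    return [
        result["content"]
        for result in response_json.get("results", [])
        if result.get("status_code") == 200
    ]


# A trimmed-down response with the same fields as the output example.
sample_response = {
    "results": [
        {
            "content": "<!DOCTYPE html><html lang=\"en\">Content</html>",
            "page": 1,
            "url": "https://sandbox.oxylabs.io/products/1",
            "job_id": "7213505428280329217",
            "status_code": 200,
        }
    ]
}

pages = extract_html(sample_response)
print(pages[0])
```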

Our examples use the synchronous [**Realtime**](/products/cn/web-scraper-api/integration-methods/realtime.md) integration method. If you would like to use [**Proxy Endpoint**](/products/cn/web-scraper-api/integration-methods/proxy-endpoint.md) or the asynchronous [**Push-Pull**](/products/cn/web-scraper-api/integration-methods/push-pull.md) integration, refer to the [**integration methods**](/products/cn/web-scraper-api/integration-methods.md) section.

## Request parameter values

### Generic

<table><thead><tr><th width="205">Parameter</th><th width="289.3333333333333">Description</th><th>Default value</th></tr></thead><tbody><tr><td><mark style="background-color:green;"><strong>source</strong></mark></td><td>Sets the scraper.</td><td><code>universal</code></td></tr><tr><td><mark style="background-color:green;"><strong>url</strong></mark></td><td>Direct URL (link) to any page.</td><td>-</td></tr><tr><td><code>callback_url</code></td><td>URL of your callback endpoint. <a href="/spaces/ZwEHB9k4MH4pDy80n9mF/pages/f93fe40aed5366f8033cd2ebfae30e61c16a4f51"><strong>More info</strong></a>.</td><td>-</td></tr></tbody></table>

Parameters highlighted in green are mandatory.

### Additional

These are the parameters of our [**features**](/products/cn/web-scraper-api/features.md).

<table><thead><tr><th width="253">Parameter</th><th width="338.92746113989637">Description</th><th>Default value</th></tr></thead><tbody><tr><td><code>geo_location</code></td><td>Sets the proxy's geolocation used to retrieve data. Find the supported locations <a href="/spaces/ZwEHB9k4MH4pDy80n9mF/pages/6f244fa2ed8311c561bf964b7afd534718285b6f#list-of-supported-geo_location-values"><strong>here</strong></a>.</td><td>-</td></tr><tr><td><code>render</code></td><td>Enables JavaScript rendering when set to <code>html</code>. <a href="/spaces/ZwEHB9k4MH4pDy80n9mF/pages/9d7133837001de31de5dfd0796cfbc6fdd7c78c8#javascript-rendering"><strong>More info</strong></a><strong>.</strong> Note: if you observe low success rates or retrieve empty content, try adding this parameter.</td><td>-</td></tr><tr><td><code>browser_instructions</code></td><td>Define your own browser instructions, which are executed when rendering JavaScript. <a href="/spaces/ZwEHB9k4MH4pDy80n9mF/pages/9d7133837001de31de5dfd0796cfbc6fdd7c78c8#browser-instructions"><strong>More info</strong></a>.</td><td>-</td></tr><tr><td><code>parse</code></td><td>Returns parsed data when set to <code>true</code>, as long as a dedicated parser exists for the submitted URL's page type.</td><td><code>false</code></td></tr><tr><td><code>parsing_instructions</code></td><td>Define your own parsing and data transformation logic, which is executed on the HTML scraping result. Read more: <a href="/spaces/ZwEHB9k4MH4pDy80n9mF/pages/f0599592eb013357024a5311a8b39c2e63e3cf58"><strong>Parsing instructions examples</strong></a><strong>.</strong></td><td>-</td></tr><tr><td><code>context</code>:<br><code>headers</code></td><td>Pass your own headers. Learn more <a href="/spaces/ZwEHB9k4MH4pDy80n9mF/pages/0d005c0006234a7653b0dd1e2afb84d47bc9ee73#custom-headers"><strong>here</strong></a>.</td><td>-</td></tr><tr><td><code>context</code>:<br><code>cookies</code></td><td>Pass your own cookies. Learn more <a href="/spaces/ZwEHB9k4MH4pDy80n9mF/pages/0d005c0006234a7653b0dd1e2afb84d47bc9ee73#custom-cookies"><strong>here</strong></a>.</td><td>-</td></tr><tr><td><code>context</code>:<br><code>session_id</code></td><td>Use this parameter to keep the same proxy across multiple requests. Set your session to any string, and we will assign a proxy to that ID and keep it for up to 10 minutes. After that, if you make another request with the same session ID, a new proxy will be assigned to that session ID.</td><td>-</td></tr><tr><td><code>context</code>:<br><code>http_method</code></td><td>Set it to <code>post</code> if you would like to make a <code>POST</code> request to your target URL via Web Scraper API. Learn more <a href="/spaces/ZwEHB9k4MH4pDy80n9mF/pages/0d005c0006234a7653b0dd1e2afb84d47bc9ee73#http-method"><strong>here</strong></a>.</td><td><code>get</code></td></tr><tr><td><code>user_agent_type</code></td><td>Device type and browser. The full list can be found <a href="/spaces/ZwEHB9k4MH4pDy80n9mF/pages/3e6a8ee6a2915a55b276cc31a20735fe1e0e4ed1"><strong>here</strong></a>.</td><td><code>desktop</code></td></tr><tr><td><code>context</code>:<br><code>content</code></td><td>Base64-encoded <code>POST</code> request body. Only useful when <code>http_method</code> is set to <code>post</code>.</td><td>-</td></tr><tr><td><code>content_encoding</code></td><td>Add this parameter if you are downloading images. Learn more <a href="/spaces/ZwEHB9k4MH4pDy80n9mF/pages/9135619b95dda1a4a2ec3526e310f3c739b2fd32"><strong>here</strong></a>.</td><td><code>base64</code></td></tr><tr><td><code>context</code>:<br><code>follow_redirects</code></td><td>Set to <code>true</code> to let the scraper follow redirects. By default, redirects are followed up to a limit of 10 links, and the entire chain is treated as one scraping job.</td><td><code>true</code></td></tr><tr><td><code>context</code>:<br><code>successful_status_codes</code></td><td>Define a custom HTTP response code (or several of them) upon which we should consider the scrape successful and return the content to you. This may be useful if you want us to return the 503 error page, or in other non-standard cases.</td><td>-</td></tr></tbody></table>
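
To illustrate how `http_method` and `content` work together, here is a minimal sketch that Base64-encodes a request body and places it in `context`. The `build_post_payload` helper and the `https://example.com` target are assumptions for illustration. The key/value layout of `context` follows the all-parameters example on this page.

```python
import base64
import json


def build_post_payload(target_url: str, body: str) -> dict:
    """Build a payload that asks the scraper to POST `body` to `target_url` (hypothetical helper)."""
    # The `content` value must be the Base64-encoded POST body.
    encoded_body = base64.b64encode(body.encode("utf-8")).decode("ascii")
    return {
        "source": "universal",
        "url": target_url,
        "context": [
            {"key": "http_method", "value": "post"},
            {"key": "content", "value": encoded_body},
        ],
    }


payload = build_post_payload("https://example.com", "base64EncodedPOSTBody")
print(json.dumps(payload, indent=4))
```

The resulting dictionary can be passed as the `json` argument in the Python request sample shown earlier.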

**All parameters**

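This example includes all available parameters (though they are not always necessary, or compatible within the same request) to give you an idea of how to format your requests.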

{% code fullWidth="false" %}

```json
{
    "source": "universal", 
    "url": "https://example.com", 
    "user_agent_type": "desktop",
    "geo_location": "United States",
    "parse": true,
    "context": [
        {
            "key": "headers", 
            "value": {
                "Content-Type": "application/octet-stream", 
                "Custom-Header-Name": "custom header content"
            }
        }, 
        {
            "key": "cookies", 
            "value": [
                {
                    "key": "NID", 
                    "value": "1234567890"
                },
                {
                    "key": "1P JAR",
                    "value": "0987654321"
                }]
        },
        {
            "key": "follow_redirects",
            "value": true
        },
        {
            "key": "http_method", "value": "get"
        },
        {
            "key": "content",
            "value": "YmFzZTY0RW5jb2RlZFBPU1RCb2R5"
        },
        {
            "key": "successful_status_codes",
            "value": [808, 909]
        }]
}
```

{% endcode %}
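
Because `context` is a list of key/value objects rather than a plain JSON object, a small conversion helper can keep client code readable. The sketch below is one possible approach; `to_context` is a hypothetical name, not part of the API.

```python
import json


def to_context(options: dict) -> list:
    """Convert a flat dict into the list of {"key", "value"} pairs used by `context`."""
    return [{"key": key, "value": value} for key, value in options.items()]


payload = {
    "source": "universal",
    "url": "https://example.com",
    "context": to_context({
        "follow_redirects": True,
        "successful_status_codes": [808, 909],
    }),
}
print(json.dumps(payload, indent=4))
```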


