Proxy Endpoint

通过 Oxylabs 网页爬虫 API Proxy Endpoint 发送和接收数据。通过简单的基于 URL 的集成直接访问目标页面。

如果您曾使用常规代理进行数据抓取，集成 Proxy Endpoint 传输方式会非常简单。您只需将我们的入口节点用作代理，使用 Scraper API 凭证进行授权，并忽略证书。在 cURL，它是 -k 或 --insecure。您的数据将在一个开放连接上到达您。

Proxy Endpoint 仅适用于基于 URL 的数据源，在这些数据源中提供完整的 URL。因此，它只接受少量额外的作业参数， 应以请求头的形式发送.

该产品并非设计为直接与自定义浏览器指令（例如 Chromium、PhantomJS、Splash 等）及其驱动（例如 Playwright、Selenium、Puppeteer 等）一起使用。

Endpoint

GET realtime.oxylabs.io:60000

输入

请参见下面的请求示例。

curl -k -x https://realtime.oxylabs.io:60000 \
-U 'USERNAME:PASSWORD' \
-H 'x-oxylabs-user-agent-type: desktop_chrome' \
-H 'x-oxylabs-geo-location: Germany' \
'https://www.example.com'

import requests
from pprint import pprint

# 在此处使用您的 SERP API 凭证。
USERNAME, PASSWORD = 'YOUR_USERNAME', 'YOUR_PASSWORD'

# 定义 proxy 字典。
proxies = {
  'http': f'http://{USERNAME}:{PASSWORD}@realtime.oxylabs.io:60000',
  'https': f'https://{USERNAME}:{PASSWORD}@realtime.oxylabs.io:60000'
}

# 若要设置特定的地理位置、User-Agent 或渲染 Javascript，
# 需要将参数作为请求头发送。
headers = {
    'x-oxylabs-user-agent-type': 'desktop_chrome',
    'x-oxylabs-geo-location': 'Germany',
    #'X-Oxylabs-Render': 'html', # 如果您希望在页面内渲染 JavaScript，请取消注释。
}

response = requests.request(
    'GET',
    'https://www.example.com',
    headers = headers, # 传入已定义的头。
    verify=False,  # 接受我们的证书。
    proxies=proxies,
)

# 将结果页面打印到标准输出。
pprint(response.text)

# 将返回的 HTML 保存到 'result.html' 文件。
with open('result.html', 'w') as f:
    f.write(response.text)

import fetch from 'node-fetch';
import { HttpsProxyAgent } from 'https-proxy-agent';

const username = 'YOUR_USERNAME';
const password = 'YOUR_PASSWORD';

const agent = new HttpsProxyAgent(
  `https://${username}:${password}@realtime.oxylabs.io:60000`
);

// 我们建议接受我们的证书，而不是允许不安全的（http）流量
process.env['NODE_TLS_REJECT_UNAUTHORIZED'] = 0;

const headers = {
  'x-oxylabs-user-agent-type': 'desktop_chrome',
  'x-oxylabs-geo-location': 'Germany',
}

const response = await fetch('https://www.example.com', {
  method: 'get',
  headers: headers,
  agent: agent,
});

console.log(await response.text());

<?php
$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, 'https://www.example.com/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_PROXY, 'https://realtime.oxylabs.io:60000');
curl_setopt($ch, CURLOPT_PROXYUSERPWD, 'YOUR_USERNAME' . ':' . 'YOUR_PASSWORD');
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
// 若要设置特定的地理位置、User-Agent 或渲染 Javascript，
// 需要将参数作为请求头发送。
curl_setopt_array($ch, array(
    CURLOPT_HTTPHEADER  => array(
        'x-oxylabs-user-agent-type: desktop_chrome',
        'x-oxylabs-geo-location: Germany',
        //'X-Oxylabs-Render: html', // 如果您希望在页面内渲染 JavaScript，请取消注释。
    )
));

$result = curl_exec($ch);
echo $result;

if (curl_errno($ch)) {
    echo 'Error:' . curl_error($ch);
}
curl_close ($ch);
?>

package main

import (
	"crypto/tls"
	"fmt"
	"io/ioutil"
	"net/http"
	"net/url"
)

func main() {
	const Username = "YOUR_USERNAME"
	const Password = "YOUR_PASSWORD"

	proxyUrl, _ := url.Parse(
		fmt.Sprintf(
			"https://%s:%[email protected]:60000",
			Username,
			Password,
		),
	)
	customTransport := &http.Transport{Proxy: http.ProxyURL(proxyUrl)}

	// 我们建议接受我们的证书，而不是允许不安全的（http）流量
	customTransport.TLSClientConfig = &tls.Config{InsecureSkipVerify: true}

	client := &http.Client{Transport: customTransport}
	request, _ := http.NewRequest("GET",
		"https://www.example.com",
		nil,
	)

	request.Header.Add("x-oxylabs-user-agent-type", "desktop_chrome")
	request.Header.Add("x-oxylabs-geo-location", "Germany")
	request.SetBasicAuth(Username, Password)
	response, _ := client.Do(request)

	responseText, _ := ioutil.ReadAll(response.Body)
	fmt.Println(string(responseText))
}

using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

namespace OxyApi
{
    class Program
    {
        static async Task Main(string[] args)
        {
            var webProxy = new WebProxy
            {
                Address = new Uri($"https://realtime.oxylabs.io:60000"),
                BypassProxyOnLocal = false,
                UseDefaultCredentials = false,

                Credentials = new NetworkCredential(
                userName: "YOUR_USERNAME",
                password: "YOUR_PASSWORD"
                )
            };

            var httpClientHandler = new HttpClientHandler
            {
                Proxy = webProxy,
            };

            // 我们建议接受我们的证书，而不是允许不安全的（http）流量
            httpClientHandler.ClientCertificateOptions = ClientCertificateOption.Manual;
            httpClientHandler.ServerCertificateCustomValidationCallback =
                (httpRequestMessage, cert, cetChain, policyErrors) =>
                {
                    return true;
                };


            var client = new HttpClient(handler: httpClientHandler, disposeHandler: true);

            client.DefaultRequestHeaders.Add("x-oxylabs-user-agent-type", "desktop_chrome");
            client.DefaultRequestHeaders.Add("x-oxylabs-geo-location", "Germany");

            Uri baseUri = new Uri("https://www.example.com");
            client.BaseAddress = baseUri;

            var requestMessage = new HttpRequestMessage(HttpMethod.Get, "");

            var response = await client.SendAsync(requestMessage);
            var contents = await response.Content.ReadAsStringAsync();

            Console.WriteLine(contents);
        }
    }
}

package org.example;

import org.apache.hc.client5.http.auth.AuthScope;
import org.apache.hc.client5.http.auth.CredentialsProvider;
import org.apache.hc.client5.http.classic.methods.HttpGet;
import org.apache.hc.client5.http.config.RequestConfig;
import org.apache.hc.client5.http.impl.auth.CredentialsProviderBuilder;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.client5.http.impl.io.PoolingHttpClientConnectionManagerBuilder;
import org.apache.hc.client5.http.ssl.NoopHostnameVerifier;
import org.apache.hc.client5.http.ssl.SSLConnectionSocketFactoryBuilder;
import org.apache.hc.client5.http.ssl.TrustAllStrategy;
import org.apache.hc.core5.http.HttpHost;
import org.apache.hc.core5.http.io.entity.EntityUtils;
import org.apache.hc.core5.http.message.StatusLine;
import org.apache.hc.core5.ssl.SSLContextBuilder;

import java.util.Arrays;
import java.util.Properties;


public class Main {

    public static void main(final String[] args)throws Exception {
        final CredentialsProvider credsProvider = CredentialsProviderBuilder.create()
                .add(new AuthScope("realtime.oxylabs.io", 60000), "USERNAME", "PASSWORD".toCharArray())
                .build();
        final HttpHost target = new HttpHost("https", "example.com", 443);
        final HttpHost proxy = new HttpHost("https", "realtime.oxylabs.io", 60000);
        try (final CloseableHttpClient httpclient = HttpClients.custom()
                .setDefaultCredentialsProvider(credsProvider)
                .setProxy(proxy)
                // 我们建议接受我们的证书，而不是允许不安全的（http）流量
                .setConnectionManager(PoolingHttpClientConnectionManagerBuilder.create()
                        .setSSLSocketFactory(SSLConnectionSocketFactoryBuilder.create()
                                .setSslContext(SSLContextBuilder.create()
                                        .loadTrustMaterial(TrustAllStrategy.INSTANCE)
                                        .build())
                                .setHostnameVerifier(NoopHostnameVerifier.INSTANCE)
                                .build())
                        .build())
                .build()) {

            final RequestConfig config = RequestConfig.custom()
                    .build();
            final HttpGet request = new HttpGet("/");
            request.addHeader("x-oxylabs-user-agent-type","desktop_chrome");
            request.addHeader("x-oxylabs-geo-location","Germany");
            request.setConfig(config);

            System.out.println("Executing request " + request.getMethod() + " " + request.getUri() +
                    " via " + proxy + " headers: " + Arrays.toString(request.getHeaders()));

            httpclient.execute(target, request, response -> {
                System.out.println("----------------------------------------");
                System.out.println(request + "->" + new StatusLine(response));
                EntityUtils.consume(response.getEntity());
                return null;
            });
        }
    }
}

输出

下面您会看到来自 https://example.com:

示例响应

<!doctype html>
<html>
<head>
    <title>Example Domain</title>

    <meta charset="utf-8" />
    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <style type="text/css">
    body {
        background-color: #f0f0f2;
        margin: 0;
        padding: 0;
        font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
        
    }
    div {
        width: 600px;
        margin: 5em auto;
        padding: 2em;
        background-color: #fdfdff;
        border-radius: 0.5em;
        box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);
    }
    a:link, a:visited {
        color: #38488f;
        text-decoration: none;
    }
    @media (max-width: 700px) {
        div {
            margin: 0 auto;
            width: auto;
        }
    }
    </style>    
</head>

<body>
<div>
    <h1>Example Domain</h1>
    <p>This domain is for use in illustrative examples in documents. You may use this
    domain in literature without prior coordination or asking for permission.</p>
    <p><a href="https://www.iana.org/domains/example">More information...</a></p>
</div>
</body>
</html>

接受的参数

在发出请求时，除了 URL，您还可以向我们发送一些作业参数，这些参数将在执行您的作业时使用。作业参数应在请求头中发送 - 请参见示例这里.

以下是您可以随 Proxy Endpoint 请求一起发送的作业参数列表：

参数

说明

x-oxylabs-user-agent-type

无法指明特定的 User-Agent，但您可以告知我们希望使用的 user-agent 类型。受支持的 User-Agent 类型列表可在这里.

x-oxylabs-geo-location

在某些情况下，您可能需要指示结果应适配的地理位置。此参数对应于 geo_location 参数，已在源级文档中单独描述。接受的值取决于您希望我们抓取的 URL。阅读更多这里.

x-oxylabs-render

JavaScript 执行。阅读更多这里.

最后更新于13天前

这有帮助吗？

下午好

hashtagEndpoint

hashtag输入

hashtag输出

hashtag接受的参数

Endpoint

输入

输出

接受的参数