Other Search Engines

Scrape other search engines with our universal source. It accepts URLs along with additional parameters. You can find the list of available parameters in the table below.

Overview

Below is a quick overview of all the available data source values we support with other targets.

| Source | Description | Structured data |
|---|---|---|
| universal | Submit any URL you like. | Use the Custom Parser feature to get structured data. |

Code examples

In the example below, we make a request to retrieve a result for the provided Baidu URL.

curl 'https://realtime.oxylabs.io/v1/queries' \
--user 'USERNAME:PASSWORD' \
-H 'Content-Type: application/json' \
-d '{
        "source": "universal",
        "url": "https://www.baidu.com/s?ie=utf-8&f=8&rsv_bp=1&rsv_idx=1&ch=&tn=baidu&bar=&wd=adidas"
    }'

The example above uses the Realtime integration method. If you would like to use some other integration method in your query (e.g. Push-Pull or Proxy Endpoint), refer to the integration methods section.

Query parameters

| Parameter | Description | Default Value |
|---|---|---|
| source | Data source. More info. | universal |
| url* | Direct URL (link) to the target page. | N/A |
| user_agent_type | Device type and browser. The full list can be found here. | desktop |
| geo_location | Geo-location of the proxy used to retrieve the data. The complete list of supported locations can be found here. | N/A |
| render | Enables JavaScript rendering. More info. | N/A |
| content_encoding | Add this parameter if you are downloading images. Learn more here. | base64 |
| context: content | Base64-encoded POST request body. Only useful if http_method is set to post. | N/A |
| context: cookies | Pass your own cookies. | N/A |
| context: follow_redirects | Set to true to make the scraper follow redirects. By default, redirects are followed up to a limit of 10 links, and the entire chain is treated as one scraping job. | N/A |
| context: headers | Pass your own headers. | N/A |
| context: http_method | Set to post to make a POST request to your target URL. | get |
| context: session_id | To reuse the same proxy across multiple requests, set this parameter to any string you like. We will assign a proxy to that ID and keep it for up to 10 minutes. After that, a request with the same session ID will be assigned a new proxy. | N/A |
| context: successful_status_codes | Define one or more custom HTTP response codes upon which we should consider the scrape successful and return the content to you. Useful if you want us to return a 503 error page or in other non-standard cases. | N/A |
| callback_url | URL to your callback endpoint. More info. | N/A |
| parse | true will return structured data, as long as you define parsing_instructions. | false |
| parsing_instructions | Define your own parsing and data transformation logic to be executed on an HTML scraping result. Read more: Parsing instructions examples. | N/A |

* - required parameter

If you observe low success rates or retrieve empty content, try adding the "render": "html" parameter to your request. More info about the render parameter can be found here.
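As an illustration of the context parameters above, the sketch below builds a payload for a POST job, Base64-encoding the request body as context: content requires. The target URL and form fields are placeholders, and the key/value shape of the context object is an assumption based on the parameter names in the table:

```python
import base64
import json

# Hypothetical form body for the POST request; fields are illustrative only.
post_body = "q=adidas&page=2"

payload = {
    "source": "universal",
    "url": "https://www.example.com/search",  # placeholder target URL
    "context": [
        # Rule from the table: http_method must be set to post for POST jobs.
        {"key": "http_method", "value": "post"},
        # context: content takes the Base64-encoded POST request body.
        {"key": "content", "value": base64.b64encode(post_body.encode()).decode()},
    ],
}

print(json.dumps(payload, indent=4))
```

The encoded body is decoded back to the original form string on our end, so any characters that would otherwise break JSON or URL syntax are safe to include.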

Forming URLs

Baidu

Job parameter assignment to URL:

https://<subdomain>.baidu.<domain>/s?ie=utf-8&wd=<query>&rn=<limit>&pn=<calculated_start_page>

When forming URLs, please follow these instructions:

  1. Encoding search terms: Search terms must be URL-encoded. For instance, spaces should be replaced with %20, which represents a space character in a URL.

  2. Calculating start page: The pn parameter corresponds to the number of search results to skip. Use the formula limit*start_page-limit (i.e. limit * (start_page - 1)) to calculate its value.

  3. Subdomain assignment: The subdomain value depends on the user agent type provided in the job. If the user agent type contains mobile, the subdomain value should be m. Otherwise, it should be www.

  4. Query parameter: Depending on the subdomain value (m or www), the query parameter for the query term should be adjusted accordingly (word for m and wd for www).
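The four rules above can be sketched as a small helper function. This is an illustrative sketch, not part of the API; the function name and signature are hypothetical, and only the standard library is used:

```python
from urllib.parse import quote


def build_baidu_url(query, limit, start_page, user_agent_type="desktop",
                    domain="com"):
    """Hypothetical helper that applies the four URL-forming rules above."""
    # Rule 3: mobile user agent types use the m subdomain, others use www.
    is_mobile = "mobile" in user_agent_type
    subdomain = "m" if is_mobile else "www"
    # Rule 4: the query term parameter is word on mobile and wd on desktop.
    query_param = "word" if is_mobile else "wd"
    # Rule 2: pn is the number of results to skip: limit * start_page - limit.
    pn = limit * start_page - limit
    # Rule 1: URL-encode the search term (spaces become %20).
    encoded_query = quote(query)
    return (f"https://{subdomain}.baidu.{domain}/s"
            f"?ie=utf-8&{query_param}={encoded_query}&rn={limit}&pn={pn}")


print(build_baidu_url("test", 10, 3, user_agent_type="mobile"))
# https://m.baidu.com/s?ie=utf-8&word=test&rn=10&pn=20
```

Note that the helper always emits pn, so a first-page desktop job produces pn=0 rather than omitting the parameter as in the desktop sample below; both forms request the first page.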

Sample Built URLs

For mobile:

https://m.baidu.com/s?ie=utf-8&word=test&rn=10&pn=20

For desktop:

https://www.baidu.cn/s?ie=utf-8&wd=test%20query&rn=13

Equivalent Job Examples

Decommissioned baidu_search source:

{
    "source": "baidu_search",
    "query": "test",
    "domain": "com",
    "limit": 5,
    "start_page": 3,
    "user_agent_type": "desktop"
}

Updated universal source:

{
    "source": "universal",
    "url": "https://www.baidu.com/s?ie=utf-8&wd=test&rn=5&pn=10",
    "user_agent_type": "desktop"
}

Yandex

Job parameter assignment to URL:

https://yandex.<domain>/search/?text=<query>&numdoc=<limit>&p=<start_page>&lr=<geo_location>

When forming URLs, please follow these instructions:

  1. Encoding search terms: Search terms must be URL encoded. For instance, spaces should be replaced with %20, which represents a space character in a URL.

  2. Start page adjustment: The value of the start_page has to be reduced by 1. For example, if the desired starting page is 3, then the value in the URL, which represents the page number, has to be 2.

  3. Localization: If the domain is either ru or tr, add the lr query parameter with the geo_location value. For other domains, pass the geo_location value via the rstr query parameter, prepending a - symbol to the value.

  4. Unsupported parameters: The pages parameter is no longer supported. Submit a separate job for each page by changing the page value (p) in the URL.
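As with Baidu, the rules above can be sketched as a small helper. The function name and signature are hypothetical and only the standard library is used:

```python
from urllib.parse import quote


def build_yandex_url(query, limit, start_page, domain="com", geo_location=None):
    """Hypothetical helper that applies the Yandex URL-forming rules above."""
    # Rule 1: URL-encode the search term (spaces become %20).
    # Rule 2: the page number in the URL is start_page reduced by 1.
    url = (f"https://yandex.{domain}/search/"
           f"?text={quote(query)}&numdoc={limit}&p={start_page - 1}")
    if geo_location is not None:
        if domain in ("ru", "tr"):
            # Rule 3: ru and tr domains take the geo_location value as lr.
            url += f"&lr={geo_location}"
        else:
            # Other domains take it as rstr with a - prepended to the value.
            url += f"&rstr=-{geo_location}"
    return url


print(build_yandex_url("test", 5, 1, domain="ru", geo_location=100))
# https://yandex.ru/search/?text=test&numdoc=5&p=0&lr=100
```

Per rule 4, paginating a query means calling the helper once per page with an incremented start_page and submitting each resulting URL as its own job.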

Built URL examples

https://yandex.ru/search/?text=test&numdoc=5&p=0&lr=100
https://yandex.com/search/?text=test%201&numdoc=10&p=2&rstr=-100

Equivalent job example

Decommissioned yandex_search source:

{
    "source": "yandex_search",
    "query": "test",
    "domain": "com",
    "limit": 5,
    "start_page": 3,
    "geo_location": 100,
    "results_language": "en"
}

Updated universal source:

{
    "source": "universal",
    "url": "https://yandex.com/search/?text=test&numdoc=5&p=2&rstr=-100&lang=en"
}

Parameter values

Geo_Location

Check the complete list of supported geo_location values here.

Here is an example:

"United Arab Emirates",
"Albania",
"Armenia",
"Angola",
"Argentina",
"Australia",
...
"Uruguay",
"Uzbekistan",
"Venezuela Bolivarian Republic of",
"Vietnam",
"South Africa",
"Zimbabwe"

HTTP_Method

The universal scraper supports two HTTP(S) methods: GET (default) and POST.

"GET",
"POST"
