SERP Scraper API can render JavaScript when scraping. This is necessary in some Google services, such as Travel.
Below is a quick overview of all the available data source values we support with Google.

Source | Description | Structured data |
---|---|---|
google | Submit any Google URL you like. | Depends on the URL. |
google_search | SERPs. | Yes. |
google_ads | SERPs, optimized for maximum ad rate. num=10 only. | Yes. |
google_hotels | Legacy Hotels service. | No. |
google_travel_hotels | Travel: Hotels service. | No. |
google_images | Reverse Image Search. | Yes. |
google_suggest | Autocomplete search term suggestions. | Yes. |
google_trends_explore | Trends. | Yes. |
You can jump to your preferred Google page type by selecting its name in the right-hand side menu. Each page contains the parameter table as well as code examples to help you get started with your query.
The google source is designed to retrieve content from various Google URLs. This means that instead of sending multiple parameters, you can provide us with a direct URL to the required Google page. We do not strip any parameters or alter your URLs in any other way.
This data source also supports parsed data (structured data in JSON format), as long as the URL submitted is for a Google SERP page. If we cannot confirm this is a SERP page request, we will return a failure message.
Parameter | Description | Default Value |
---|---|---|
source | google | |
url | Direct URL (link) to Google page | - |
user_agent_type | desktop | |
render | | |
callback_url | - | |
geo_location | The results will be adapted for geographical location. Using this parameter correctly is extremely important to get accurate data. For more information, read about our suggested geo_location parameter structures here. | - |
parse | true will return parsed data, as long as the URL submitted is for Google. | - |
- required parameter
In the example below, the API will retrieve a Google Scholar search page.
JSON
cURL
Python
PHP
HTTP
{
"source": "google",
"url": "https://scholar.google.com/scholar?hl=en&q=newton&btnG=&as_sdt=1%2C5&as_sdtp="
}
curl --user user:pass1 'https://realtime.oxylabs.io/v1/queries' -H "Content-Type: application/json" -d '{"source": "google", "url": "https://scholar.google.com/scholar?hl=en&q=newton&btnG=&as_sdt=1%2C5&as_sdtp="}'
import requests
from pprint import pprint
# Structure payload.
payload = {
'source': 'google',
'url': 'https://scholar.google.com/scholar?hl=en&q=newton&btnG=&as_sdt=1%2C5&as_sdtp='
}
# Get response.
response = requests.request(
'POST',
'https://realtime.oxylabs.io/v1/queries',
auth=('user', 'pass1'),
json=payload,
)
# Instead of response with job status and results url, this will return the
# JSON response with results.
pprint(response.json())
<?php
$params = [
'source' => 'google',
'url' => 'https://scholar.google.com/scholar?hl=en&q=newton&btnG=&as_sdt=1%2C5&as_sdtp='
];
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://realtime.oxylabs.io/v1/queries");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($params));
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_USERPWD, "user" . ":" . "pass1");
$headers = [];
$headers[] = "Content-Type: application/json";
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
$result = curl_exec($ch);
echo $result;
if (curl_errno($ch)) {
echo 'Error:' . curl_error($ch);
}
curl_close ($ch);
# URL has to be encoded to escape `&` and `=` characters:
# URL: https://scholar.google.com/scholar?hl=en&q=newton&btnG=&as_sdt=1%2C5&as_sdtp=
# Encoded URL: https%3A%2F%2Fscholar.google.com%2Fscholar%3Fhl%3Den%26q%3Dnewton%26btnG%3D%26as_sdt%3D1%252C5%26as_sdtp%3D
https://realtime.oxylabs.io/v1/queries?source=google&url=https%3A%2F%2Fscholar.google.com%2Fscholar%3Fhl%3Den%26q%3Dnewton%26btnG%3D%26as_sdt%3D1%252C5%26as_sdtp%3D&access_token=12345abcde
The example above uses the Realtime integration method. If you would like to use some other integration method in your query (e.g. Push-Pull or Proxy Endpoint), refer to the integration methods section.
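The HTTP variant of the example requires the target URL to be percent-encoded before it is placed in the url query parameter. A minimal Python sketch of that step, using only the standard library; the helper name encode_target_url is illustrative and not part of the API:

```python
from urllib.parse import quote


def encode_target_url(url: str) -> str:
    """Percent-encode a URL so it can be passed as a single query parameter value."""
    # safe='' leaves no reserved character unescaped, so ':', '/', '?', '&',
    # '=' and '%' are all encoded.
    return quote(url, safe='')


target = 'https://scholar.google.com/scholar?hl=en&q=newton&btnG=&as_sdt=1%2C5&as_sdtp='
encoded = encode_target_url(target)
print(encoded)
```

Note that the %2C already present in the Scholar URL becomes %252C, because the percent sign itself is escaped. This matches the encoded URL shown in the HTTP example.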
The google_search source is designed to retrieve Google Search results (SERPs).

Parameter | Description | Default Value |
---|---|---|
source | google_search | |
domain | Domain localization | com |
query | UTF-encoded keyword | - |
start_page | Starting page number | 1 |
pages | Number of pages to retrieve | 1 |
limit | Number of results to retrieve in each page | 10 |
locale | Accept-Language header value which changes your Google search page web interface language. More info. | - |
geo_location | The geographical location that the result should be adapted for. Using this parameter correctly is extremely important to get the right data. For more information, read about our suggested geo_location parameter structures here. | - |
user_agent_type | desktop | |
render | | |
callback_url | - | |
parse | - | |
context: filter | Setting the value of this param to 0 lets you see results that would otherwise be excluded due to similarity to other results. | 1 |
context: fpstate | Setting the fpstate value to aig will make Google load more apps. This parameter is only useful if used together with the render parameter. | - |
context: limit_per_page | If you want to scrape multiple pages with the same IP address, include a JSON array and specify the page numbers using the page key. You must also indicate the number of organic results on each page by adding a limit key. See example. | - |
context: nfpr | true will turn off spelling auto-correction | false |
context: results_language | - | |
context: safe_search | Safe search. Set to true to enable it. | false |
context: tbm | To-be-matched or tbm parameter. Accepted values are: app, blg, bks, dsc, isch, nws, pts, plcs, rcp, lcl | - |
context: tbs | tbs parameter. This parameter is like a container for more obscure Google parameters, like limiting/sorting results by date as well as other filters, some of which depend on the tbm parameter (e.g. tbs=app_os:1 is only available with tbm value app). More info here. | - |
- required parameter
In the example below, we make a request to get 2 results pages, number 11 and number 12, for the search term adidas on the google.nl domain. The SERP will be filtered to contain French-language results only.
JSON
cURL
Python
PHP
HTTP
{
"source": "google_search",
"domain": "nl",
"query": "adidas",
"start_page": 11,
"pages": 2,
"parse": true,
"context": [
{
"key": "results_language",
"value": "fr"
}]
}
curl --user user:pass1 'https://realtime.oxylabs.io/v1/queries' -H "Content-Type: application/json" -d '{"source": "google_search", "domain": "nl", "query": "adidas", "start_page": 11, "pages": 2, "parse": true, "context": [{"key": "results_language", "value": "fr"}]}'
import requests
from pprint import pprint
# Structure payload.
payload = {
'source': 'google_search',
'domain': 'nl',
'query': 'adidas',
'start_page': 11,
'pages': 2,
'context': [
{'key': 'results_language', 'value': 'fr'},
],
}
# Get response.
response = requests.request(
'POST',
'https://realtime.oxylabs.io/v1/queries',
auth=('user', 'pass1'),
json=payload,
)
# Print prettified response to stdout.
pprint(response.json())
<?php
$params = [
'source' => 'google_search',
'domain' => 'nl',
'query' => 'adidas',
'start_page' => 11,
'pages' => 2,
'context' => [
[
'key' => 'results_language',
'value' => 'fr'
]
]
];
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://realtime.oxylabs.io/v1/queries");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($params));
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_USERPWD, "user" . ":" . "pass1");
$headers = [];
$headers[] = "Content-Type: application/json";
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
$result = curl_exec($ch);
echo $result;
if (curl_errno($ch)) {
echo 'Error:' . curl_error($ch);
}
curl_close($ch);
https://realtime.oxylabs.io/v1/queries?source=google_search&domain=nl&query=adidas&start_page=11&pages=2&context[0][key]=results_language&context[0][value]=fr&access_token=12345abcde
The example above uses the Realtime integration method. If you would like to use some other integration method in your query (e.g. Push-Pull or Proxy Endpoint), refer to the integration methods section.
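In the HTTP variant of this request, each context entry is flattened into a pair of bracketed query parameters (context[0][key], context[0][value], and so on). The sketch below shows one way to derive that query string from the same payload dictionary used in the Python example. It assumes the bracketed layout shown in the HTTP example is the expected one; to_query_string is a hypothetical helper name, and 12345abcde is the placeholder token from the examples.

```python
from urllib.parse import urlencode


def to_query_string(payload: dict, access_token: str) -> str:
    """Flatten a payload dict into a query string for the HTTP (GET) method."""
    pairs = []
    for name, value in payload.items():
        if name == 'context':
            # Each context item becomes two parameters indexed by its position.
            for i, item in enumerate(value):
                pairs.append((f'context[{i}][key]', item['key']))
                pairs.append((f'context[{i}][value]', item['value']))
        else:
            pairs.append((name, value))
    pairs.append(('access_token', access_token))
    # Keep brackets and commas literal, as they appear in the HTTP examples.
    return urlencode(pairs, safe='[],')


payload = {
    'source': 'google_search',
    'domain': 'nl',
    'query': 'adidas',
    'start_page': 11,
    'pages': 2,
    'context': [
        {'key': 'results_language', 'value': 'fr'},
    ],
}
query_string = to_query_string(payload, '12345abcde')
print('https://realtime.oxylabs.io/v1/queries?' + query_string)
```

Spaces in values are encoded as +, so a query such as hotels in Paris would come out as hotels+in+Paris.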
The google_ads source is optimized to retrieve Google Search results pages (SERPs) with paid ads. This source will return only ten results per page, ensuring the highest chances of paid results showing up. Other than that, it supports the same parameters as regular Search.

Parameter | Description | Default Value |
---|---|---|
source | google_ads | |
domain | com | |
query | UTF-encoded keyword | - |
start_page | Starting page number | 1 |
pages | Number of pages to retrieve | 1 |
locale | Accept-Language header value which changes your Google search page web interface language. More info. | - |
geo_location | The geographical location that the result should be adapted for. Using this parameter correctly is extremely important to get the right data. For more information, read about our suggested geo_location parameter structures here. | - |
user_agent_type | desktop | |
render | - | |
callback_url | - | |
parse | - | |
context: nfpr | true will turn off spelling auto-correction. | false |
context: results_language | - | |
context: tbm | To-be-matched or tbm parameter. Accepted values are: app, blg, bks, dsc, isch, nws, pts, plcs, rcp, lcl | - |
context: tbs | tbs parameter. This parameter is like a container for more obscure Google parameters, like limiting/sorting results by date as well as other filters, some of which depend on the tbm parameter (e.g. tbs=app_os:1 is only available with tbm value app). More info here. | - |
- required parameter
In this example, we make a request to google.nl to retrieve search results for the keyword adidas.
JSON
cURL
Python
PHP
HTTP
{
"source": "google_ads",
"domain": "nl",
"query": "adidas",
"parse": true
}
curl --user user:pass1 'https://realtime.oxylabs.io/v1/queries' -H "Content-Type: application/json" -d '{
"source": "google_ads",
"domain": "nl",
"query": "adidas",
"parse": true
}'
import requests
from pprint import pprint
# Structure payload.
payload = {
'source': 'google_ads',
'domain': 'nl',
'query': 'adidas'
}
# Get response.
response = requests.request(
'POST',
'https://realtime.oxylabs.io/v1/queries',
auth=('user', 'pass1'),
json=payload,
)
# Print prettified response to stdout.
pprint(response.json())
<?php
$params = [
'source' => 'google_ads',
'domain' => 'nl',
'query' => 'adidas'
];
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://realtime.oxylabs.io/v1/queries");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($params));
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_USERPWD, "user" . ":" . "pass1");
$headers = [];
$headers[] = "Content-Type: application/json";
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
$result = curl_exec($ch);
echo $result;
if (curl_errno($ch)) {
echo 'Error:' . curl_error($ch);
}
curl_close ($ch);
https://realtime.oxylabs.io/v1/queries?source=google_ads&domain=nl&query=adidas&access_token=12345abcde
The example above uses the Realtime integration method. If you would like to use some other integration method in your query (e.g. Push-Pull or Proxy Endpoint), refer to the integration methods section.
The google_hotels data source is designed to retrieve Google Hotel search results.

Parameter | Description | Default Value |
---|---|---|
source | google_hotels | |
domain | Domain localization | com |
query | UTF-encoded keyword | - |
start_page | Starting page number | 1 |
pages | Number of pages to retrieve. | 1 |
limit | Number of results to retrieve in each page. | 10 |
locale | Accept-Language header value which changes your Google search page web interface language. More info. | - |
results_language | - | |
geo_location | The geographical location that the result should be adapted for. Using this parameter correctly is extremely important to get the right data. For more information, read about our suggested geo_location parameter structures here. | - |
user_agent_type | desktop | |
render | | |
callback_url | - | |
context: nfpr | true will turn off spelling auto-correction. | false |
context: hotel_occupancy | Number of guests. | 2 |
context: hotel_dates | Dates for staying at the hotel, from - to. Example: 2023-07-12,2023-07-13 | - |
- required parameter
With Google Hotels, you always need to send a keyword that contains the word 'hotels', for example, 'hotels in Los Angeles' or 'hotels in Paris, France'. Both 'hotel' and 'hotels' work. Google also supports local languages, so you can send 'Hotelli Helsingissä' for hotels in Helsinki or 'viešbučiai Vilnius' for hotels in Vilnius.
In this example, we make a request to retrieve the first 3 pages of hotel availability for 1 guest between 2023-10-01 and 2023-10-10 for hotels in Paris from google.com.
JSON
cURL
Python
PHP
HTTP
{
"source": "google_hotels",
"domain": "com",
"pages": 3,
"query": "hotels in Paris",
"context": [
{
"key": "hotel_occupancy",
"value": 1
},
{
"key": "hotel_dates",
"value": "2023-10-01,2023-10-10"
}]
}
curl --user user:pass1 'https://realtime.oxylabs.io/v1/queries' -H "Content-Type: application/json" -d '{"source": "google_hotels", "domain": "com", "pages": 3, "query": "hotels in Paris", "context": [{"key": "hotel_occupancy", "value": 1}, {"key": "hotel_dates", "value": "2023-10-01,2023-10-10"}]}'
import requests
from pprint import pprint
# Structure payload.
payload = {
'source': 'google_hotels',
'domain': 'com',
'query': 'hotels in Paris',
'pages': 3,
'context': [
{'key': 'hotel_occupancy', 'value': 1},
{'key': 'hotel_dates', 'value': '2023-10-01,2023-10-10'},
],
}
# Get response.
response = requests.request(
'POST',
'https://realtime.oxylabs.io/v1/queries',
auth=('user', 'pass1'),
json=payload,
)
# Print prettified response to stdout.
pprint(response.json())
<?php
$params = [
'source' => 'google_hotels',
'domain' => 'com',
'query' => 'hotels in Paris',
'pages' => 3,
'context' => [
[
'key' => 'hotel_occupancy',
'value' => 1,
],
[
'key' => 'hotel_dates',
'value' => '2023-10-01,2023-10-10',
]
]
];
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://realtime.oxylabs.io/v1/queries");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($params));
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_USERPWD, "user" . ":" . "pass1");
$headers = [];
$headers[] = "Content-Type: application/json";
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
$result = curl_exec($ch);
echo $result;
if (curl_errno($ch)) {
echo 'Error:' . curl_error($ch);
}
curl_close($ch);
https://realtime.oxylabs.io/v1/queries?source=google_hotels&domain=com&query=hotels+in+Paris&pages=3&context[0][key]=hotel_occupancy&context[0][value]=1&context[1][key]=hotel_dates&context[1][value]=2023-10-01,2023-10-10&access_token=12345abcde
The example above uses the Realtime integration method. If you would like to use some other integration method in your query (e.g. Push-Pull or Proxy Endpoint), refer to the integration methods section.
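The hotel_dates value is a single string with the check-in and check-out dates separated by a comma. The sketch below builds the context list from Python date objects instead of hand-written strings. The helper name hotel_context and its sanity checks (check-out after check-in, at least one guest) are assumptions made for illustration and are not documented API rules.

```python
from datetime import date


def hotel_context(check_in: date, check_out: date, guests: int = 2) -> list:
    """Build the context entries for hotel_occupancy and hotel_dates."""
    # Assumed sanity checks; the API itself may accept or reject other inputs.
    if check_out <= check_in:
        raise ValueError('check_out must be later than check_in')
    if guests < 1:
        raise ValueError('guests must be at least 1')
    return [
        {'key': 'hotel_occupancy', 'value': guests},
        # ISO dates joined by a comma, e.g. 2023-07-12,2023-07-13
        {'key': 'hotel_dates', 'value': f'{check_in.isoformat()},{check_out.isoformat()}'},
    ]


payload = {
    'source': 'google_hotels',
    'domain': 'com',
    'query': 'hotels in Paris',
    'pages': 3,
    'context': hotel_context(date(2023, 10, 1), date(2023, 10, 10), guests=1),
}
print(payload)
```

The resulting payload is the same as the one in the Python example and can be sent with the same requests call.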
The google_travel_hotels data source is designed to retrieve Google Travel service's hotel search results.

Parameter | Description | Default Value |
---|---|---|
source | google_travel_hotels | |
domain | Domain localization | com |
query | UTF-encoded keyword. "query": "hotels" will result in a list of hotels in the given geo_location; "query": "hotels in <Location>" will result in a list of hotels for <Location>. E.g., hotels in Paris will list hotels in Paris, no matter what geo_location is given. | - |
start_page | Starting page number | 1 |
locale | Accept-Language header value which changes your Google search page web interface language. More info. | - |
geo_location | The geographical location that the result should be adapted for. Using this parameter correctly is extremely important to get the right data. Please note that this source can accept a limited number of geo_location values - please check this section to see geo_location values that don't yield accurate results. | - |
user_agent_type | desktop | |
render | - | |
callback_url | - | |
context: hotel_occupancy | Number of guests | 2 |
context: hotel_classes | Filter results by number of hotel stars. You may specify one or more values between 2 and 5. Example: [3,4] | - |
context: hotel_dates | Dates for staying at the hotel, from - to. Example: 2023-07-12,2023-07-13 | - |
- required parameter
NOTE: "geo_location": "United States" and other wide-area locations are not supported. Use city-level geo_location, e.g., Seattle,Washington,United States.
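The hotel_classes filter in the parameter table accepts one or more star ratings between 2 and 5. A small sketch of a client-side check that builds the context entry and rejects values outside that range before the request is sent; hotel_classes_context is a hypothetical helper, and de-duplicating and sorting the values is a convenience chosen here, not an API requirement.

```python
def hotel_classes_context(classes) -> dict:
    """Return the hotel_classes context entry after checking the 2-5 range."""
    unique = sorted(set(classes))
    if not unique:
        raise ValueError('hotel_classes needs at least one value')
    # Collect every value that is not an integer in the documented range.
    invalid = [c for c in unique if not isinstance(c, int) or not 2 <= c <= 5]
    if invalid:
        raise ValueError(f'hotel_classes values must be integers from 2 to 5, got {invalid}')
    return {'key': 'hotel_classes', 'value': unique}


entry = hotel_classes_context([4, 2, 3])
print(entry)
```

The returned dictionary can be appended to the context list alongside hotel_occupancy and hotel_dates.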
In this example, we make a request to retrieve the 2nd results page for hotel availability for 2 guests between 2023-10-01 and 2023-10-10 for 2 to 4-star hotels in Paris from google.com.
JSON
cURL
Python
PHP
HTTP
{
"source": "google_travel_hotels",
"domain": "com",
"start_page": 2,
"query": "hotels in Paris",
"context": [
{
"key": "hotel_occupancy",
"value": 2
},
{
"key": "hotel_dates",
"value": "2023-10-01,2023-10-10"
},
{
"key": "hotel_classes",
"value": [2,3,4]
}]
}
curl --user user:pass1 'https://realtime.oxylabs.io/v1/queries' -H "Content-Type: application/json" -d '{"source": "google_travel_hotels", "domain": "com", "start_page": 2, "query": "hotels in Paris", "context": [{"key": "hotel_occupancy", "value": 2}, {"key": "hotel_dates", "value": "2023-10-01,2023-10-10"}, {"key": "hotel_classes", "value": [2,3,4]}]}'
import requests
from pprint import pprint
# Structure payload.
payload = {
'source': 'google_travel_hotels',
'domain': 'com',
'query': 'hotels in Paris',
'start_page': 2,
'context': [
{'key': 'hotel_occupancy', 'value': 2},
{'key': 'hotel_dates', 'value': '2023-10-01,2023-10-10'},
{'key': 'hotel_classes', 'value': [2,3,4]},
],
}
# Get response.
response = requests.request(
'POST',
'https://realtime.oxylabs.io/v1/queries',
auth=('user', 'pass1'),
json=payload,
)
# Print prettified response to stdout.
pprint(response.json())
<?php
$params = [
'source' => 'google_travel_hotels',
'domain' => 'com',
'query' => 'hotels in Paris',
'start_page' => 2,
'context' => [
[
'key' => 'hotel_occupancy',
'value' => 2,
],
[
'key' => 'hotel_dates',