Realtime
Realtime integration for the Web Scraper API by Oxylabs. Keep the HTTPS connection open from job submission until results or an error are returned, using JSON-formatted payloads.
Realtime is a synchronous integration method. It requires keeping the connection open until the job is finished successfully or returns an error.
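Because the call blocks until the job finishes, it can help to set an explicit client-side timeout and to fail loudly on HTTP errors. The following is a minimal sketch, not an official client: `submit_realtime` is a hypothetical helper, the 180-second timeout is an assumed value, and `post` is expected to be any callable with the same signature as `requests.post`.

```python
REALTIME_URL = "https://realtime.oxylabs.io/v1/queries"


def submit_realtime(post, payload, auth, timeout_s=180):
    """Send one blocking Realtime request and return the decoded JSON body.

    `post` is any callable with the same signature as `requests.post`;
    passing it in keeps the sketch easy to exercise without network access.
    `timeout_s` is an assumed value; tune it to your own jobs.
    """
    # The connection stays open here until results or an error come back.
    response = post(REALTIME_URL, auth=auth, json=payload, timeout=timeout_s)
    # Raise on HTTP 4xx/5xx so failures are not mistaken for results.
    response.raise_for_status()
    return response.json()
```

With the `requests` library installed, a call might look like `submit_realtime(requests.post, payload, ("YOUR_USERNAME", "YOUR_PASSWORD"))`.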
Job Submission
Endpoint
The Realtime API endpoint for job submission is:
POST https://realtime.oxylabs.io/v1/queries
Input
Provide the job parameters in a JSON payload as shown in the examples below. Python and PHP examples include comments for clarity.
curl --user "USERNAME:PASSWORD" \
    'https://realtime.oxylabs.io/v1/queries' \
    -H "Content-Type: application/json" \
    -d '{"source": "universal", "url": "https://example.com", "geo_location": "United States"}'

import requests
from pprint import pprint
# Structure payload.
payload = {
    "source": "universal",  # The source you choose, e.g. "universal"
    "url": "https://example.com",  # Check the docs of the specific source to see whether it takes "url" or "query"
    "geo_location": "United States",  # Some sources accept post codes and/or coordinates
    # "render": "html",  # Uncomment to render JavaScript on the page
    # "render": "png",  # Uncomment to take a screenshot of the scraped web page
    # "parse": True,  # Check which sources support parsed data
}
# Get response.
response = requests.request(
    'POST',
    'https://realtime.oxylabs.io/v1/queries',
    auth=('YOUR_USERNAME', 'YOUR_PASSWORD'),  # Your credentials go here
    json=payload,
)
# Instead of a response with the job status and a results URL, this returns
# the JSON response containing the results.
pprint(response.json())
Output
The Realtime API supports these result types in the output:
HTML: the raw HTML content scraped from the target web page;
JSON: structured data parsed from the HTML content;
PNG: a Base64-encoded screenshot of the rendered page in PNG format;
XHR: the XHR requests made while loading the page;
Markdown: the Markdown version of the web page.
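Since a PNG result arrives as Base64 text, it has to be decoded before it can be saved as an image. A minimal sketch using only the standard library follows; `decode_png` is a hypothetical helper, and where the Base64 string sits inside the response JSON is deliberately left out because it is not described here.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def decode_png(content_b64):
    """Decode a Base64-encoded screenshot and check that it looks like a PNG."""
    png_bytes = base64.b64decode(content_b64)
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not start with the PNG signature")
    return png_bytes
```

The returned bytes can then be written to disk with `open("screenshot.png", "wb")`.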
This table explains the default and other available result types based on the parameters included in the payload of the API request.
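Render parameter | Parse parameter | Default output | Available output
-----------------|-----------------|----------------|------------------
x                | x               | html           | html
html             | x               | html           | html
png              | x               | png            | html, png
x                | true            | json           | html, json
html             | true            | json           | html, json
png              | true            | png            | html, json, png
(x = the parameter is not included in the payload)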
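The combinations in the table can be transcribed into a small lookup, which is handy for checking what a given payload should return before sending it. This is a sketch only: `RESULT_TYPES` and `result_types` are hypothetical names, and reading "x" as "parameter omitted" (`None` for `render`, `False` for `parse`) is an assumption of this example.

```python
# Keys are (render, parse); None / False stand for "x" (parameter not set),
# which is this sketch's reading of the table.
RESULT_TYPES = {
    (None, False): ("html", ["html"]),
    ("html", False): ("html", ["html"]),
    ("png", False): ("png", ["html", "png"]),
    (None, True): ("json", ["html", "json"]),
    ("html", True): ("json", ["html", "json"]),
    ("png", True): ("png", ["html", "json", "png"]),
}


def result_types(payload):
    """Return (default_type, available_types) for a job payload dict."""
    key = (payload.get("render"), bool(payload.get("parse", False)))
    if key not in RESULT_TYPES:
        raise ValueError(f"combination not covered by the table: {key}")
    return RESULT_TYPES[key]
```

For instance, `result_types({"source": "universal", "render": "png", "parse": True})` looks up the last row of the table.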
Output example:

