Realtime

Realtime integration for the Web Scraper API by Oxylabs. Keep the HTTPS connection open from job submission until results or an error are returned, using JSON-formatted payloads.

Realtime is a synchronous integration method: the client must keep the connection open until the job either finishes successfully or returns an error.
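Because the call blocks until the job finishes, the client's own timeout has to cover the whole scrape. The following is a minimal Python sketch using only the standard library. The 180-second timeout, the helper names `build_request` and `send_blocking`, and the injectable `opener` argument are assumptions made for illustration and do not come from the Oxylabs documentation.

```python
import base64
import json
import urllib.error
import urllib.request

REALTIME_URL = "https://realtime.oxylabs.io/v1/queries"


def build_request(payload, username, password):
    """Build a POST request with a JSON body and HTTP Basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        REALTIME_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def send_blocking(request, timeout=180, opener=urllib.request.urlopen):
    """Send the request and block until results or an error come back.

    The timeout (in seconds) is an assumed value; pick one that fits your jobs.
    Returns (ok, data): the parsed JSON on success, or a short error string.
    """
    try:
        with opener(request, timeout=timeout) as response:
            return True, json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status.
        return False, f"HTTP {error.code}"
    except OSError as error:
        # URLError and socket timeouts are both OSError subclasses.
        return False, f"connection problem: {error}"
```

A call would look like `ok, data = send_blocking(build_request(payload, "YOUR_USERNAME", "YOUR_PASSWORD"))`. Passing a different `opener` makes the function easy to exercise without network access.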

Job Submission

Endpoint

The Realtime API endpoint for job submission is:

POST https://realtime.oxylabs.io/v1/queries

Input

Provide the job parameters in a JSON payload as shown in the examples below. Python and PHP examples include comments for clarity.

curl --user "USERNAME:PASSWORD" \
'https://realtime.oxylabs.io/v1/queries' \
-H "Content-Type: application/json" \
-d '{"source": "universal", "url": "https://example.com", "geo_location": "United States"}'

import requests
from pprint import pprint


# Structure payload.
payload = {
    "source": "universal",  # Source you choose, e.g. "universal"
    "url": "https://example.com",  # Check the docs of the specific source you're using to see if you should use "url" or "query"
    "geo_location": "United States",  # Some sources accept post codes and/or coordinates
    # "render": "html",  # Uncomment if you want to render JavaScript on the page
    # "render": "png",  # Uncomment if you want to take a screenshot of a scraped web page
    # "parse": True,  # Check what sources support parsed data
}

# Get response.
response = requests.request(
    'POST',
    'https://realtime.oxylabs.io/v1/queries',
    auth=('YOUR_USERNAME', 'YOUR_PASSWORD'), #Your credentials go here
    json=payload,
)

# Instead of a response with the job status and a results URL, this returns
# the JSON response with the results.
pprint(response.json())

<?php

$params = array(
    'source' => 'universal', //Source you choose e.g. "universal"
    'url' => 'https://example.com', // Check the docs of the specific source you're using to see if you should use "url" or "query"
    'geo_location' => 'United States', //Some sources accept zip-code or coordinates
    //'render' => 'html', // Uncomment if you want to render JavaScript within the page
    //'render' => 'png', // Uncomment if you want to take a screenshot of a scraped web page
    //'parse' => true, // Check what sources support parsed data
);

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, "https://realtime.oxylabs.io/v1/queries");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($params));
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_USERPWD, "YOUR_USERNAME" . ":" . "YOUR_PASSWORD"); //Your credentials go here

$headers = array();
$headers[] = "Content-Type: application/json";
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

$result = curl_exec($ch);
echo $result;

if (curl_errno($ch)) {
    echo 'Error:' . curl_error($ch);
}
curl_close($ch);

using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;

namespace OxyApi
{
    class Program
    {
        static async Task Main()
        {
            const string Username = "YOUR_USERNAME";
            const string Password = "YOUR_PASSWORD";

            var parameters = new Dictionary<string, string>()
            {
                { "source", "universal" },
                { "url", "https://example.com" },
                { "geo_location", "United States" },
            };


            var client = new HttpClient();

            Uri baseUri = new Uri("https://realtime.oxylabs.io");
            client.BaseAddress = baseUri;

            var requestMessage = new HttpRequestMessage(HttpMethod.Post, "/v1/queries");
            requestMessage.Content = JsonContent.Create(parameters);

            var authenticationString = $"{Username}:{Password}";
            var base64EncodedAuthenticationString = Convert.ToBase64String(System.Text.Encoding.UTF8.GetBytes(authenticationString));
            requestMessage.Headers.Add("Authorization", "Basic " + base64EncodedAuthenticationString);

            var response = await client.SendAsync(requestMessage);
            var contents = await response.Content.ReadAsStringAsync();

            Console.WriteLine(contents);
        }
    }
}

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	const Username = "YOUR_USERNAME"
	const Password = "YOUR_PASSWORD"

	payload := map[string]string{
		"source":       "universal",
		"url":          "https://example.com",
		"geo_location": "United States",
	}

	jsonValue, _ := json.Marshal(payload)

	client := &http.Client{}
	request, _ := http.NewRequest("POST",
		"https://realtime.oxylabs.io/v1/queries",
		bytes.NewBuffer(jsonValue),
	)

	request.SetBasicAuth(Username, Password)
	request.Header.Set("Content-Type", "application/json")

	// Stop early if the request fails; response is nil in that case.
	response, err := client.Do(request)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	defer response.Body.Close()

	responseText, _ := io.ReadAll(response.Body)
	fmt.Println(string(responseText))
}

package org.example;

import okhttp3.*;
import org.json.JSONObject;

public class Main implements Runnable {
    private static final String AUTHORIZATION_HEADER = "Authorization";
    public static final String USERNAME = "YOUR_USERNAME";
    public static final String PASSWORD = "YOUR_PASSWORD";

    public void run() {
        JSONObject jsonObject = new JSONObject();
        jsonObject.put("source", "universal");
        jsonObject.put("url", "https://example.com");
        jsonObject.put("geo_location", "United States");

        Authenticator authenticator = (route, response) -> {
            String credential = Credentials.basic(USERNAME, PASSWORD);

            return response
                    .request()
                    .newBuilder()
                    .header(AUTHORIZATION_HEADER, credential)
                    .build();
        };

        var client = new OkHttpClient.Builder()
                .authenticator(authenticator)
                .build();

        var mediaType = MediaType.parse("application/json; charset=utf-8");
        var body = RequestBody.create(jsonObject.toString(), mediaType);
        var request = new Request.Builder()
                .url("https://realtime.oxylabs.io/v1/queries")
                .post(body)
                .build();

        try (var response = client.newCall(request).execute()) {
            assert response.body() != null;
            System.out.println(response.body().string());
        } catch (Exception exception) {
            System.out.println("Error: " + exception.getMessage());
        }

        System.exit(0);
    }

    public static void main(String[] args) {
        new Thread(new Main()).start();
    }
}

import fetch from 'node-fetch';

const username = 'YOUR_USERNAME';
const password = 'YOUR_PASSWORD';
const body = {
  source: 'universal',
  url: 'https://example.com',
  geo_location: 'United States'
};
const response = await fetch('https://realtime.oxylabs.io/v1/queries', {
  method: 'post',
  body: JSON.stringify(body),
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Basic ' + Buffer.from(`${username}:${password}`).toString('base64'),
  }
});

console.log(await response.json());

Output

Realtime API supports these result types in the output:

HTML: the raw HTML content scraped from the target web page;
JSON: structured data parsed from the HTML content;
PNG: a Base64-encoded screenshot of the rendered page;
XHR: the XHR requests made while loading the page;
Markdown: the content of the web page in Markdown format.

You can also retrieve multiple result types in a single API response.
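Since the PNG result type is delivered as Base64 text, it has to be decoded before it can be viewed. The sketch below assumes the Base64 string is stored in the `content` field of the first entry in `results`; that placement, the function names `decode_screenshot` and `save_screenshot`, and the signature check are illustrative assumptions rather than documented behavior.

```python
import base64
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def decode_screenshot(response_json):
    """Return raw PNG bytes from a Realtime response.

    Assumption: the Base64 screenshot is stored in results[0]["content"].
    """
    encoded = response_json["results"][0]["content"]
    image = base64.b64decode(encoded)
    if not image.startswith(PNG_SIGNATURE):
        raise ValueError("decoded content does not look like a PNG image")
    return image


def save_screenshot(response_json, path):
    """Decode the screenshot and write it to `path`; returns the byte count."""
    return Path(path).write_bytes(decode_screenshot(response_json))
```

With a parsed response in `data`, `save_screenshot(data, "page.png")` would write the image to disk. The signature check is there to catch the case where the response actually holds HTML or JSON.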

This table explains the default and other available result types based on the parameters included in the payload of the API request.

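| Render parameter | Parse parameter | Default output | Available output |
|------------------|-----------------|----------------|------------------|
| html             | not set         | html           | html             |
| png              | not set         | png            | html, png        |
| not set          | true            | json           | html, json       |
| html             | true            | json           | html, json       |
| png              | true            | png            | html, json, png  |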

Realtime API always returns the default output. To get the other available outputs, use the Push-Pull integration method.
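The table can also be expressed as a lookup, which makes it easy to check before submitting a job whether Realtime will return the result type you need. This is a rough sketch: the rows are transcribed from the table above, while the names `OUTPUT_TABLE`, `realtime_output`, and `needs_push_pull` are hypothetical, and combinations the table does not list are reported as `None` rather than guessed.

```python
# (render, parse) -> (default output, available outputs), transcribed from the table.
OUTPUT_TABLE = {
    ("html", False): ("html", ("html",)),
    ("png", False): ("png", ("html", "png")),
    (None, True): ("json", ("html", "json")),
    ("html", True): ("json", ("html", "json")),
    ("png", True): ("png", ("html", "json", "png")),
}


def _lookup(payload):
    """Find the table row for a payload, or None if the combination is not listed."""
    key = (payload.get("render"), bool(payload.get("parse", False)))
    return OUTPUT_TABLE.get(key)


def realtime_output(payload):
    """Return the result type Realtime sends back, or None if not listed."""
    row = _lookup(payload)
    return row[0] if row else None


def needs_push_pull(payload, wanted):
    """True if `wanted` is available for this payload but is not the default."""
    row = _lookup(payload)
    if row is None:
        return False
    default, available = row
    return wanted in available and wanted != default
```

For example, `needs_push_pull({"render": "png", "parse": True}, "json")` is true: parsed JSON is available for that job, but Realtime itself would return the PNG.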

Output example:

{
  "results": [
    {
      "content": "<html>\nCONTENT\n</html>",
      "created_at": "2024-06-26 13:13:06",
      "updated_at": "2024-06-26 13:13:07",
      "id": null,
      "page": 1,
      "url": "https://www.example.com/",
      "job_id": "12345678900987654321",
      "status_code": 200
    }
  ]
}
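A response of this shape can be unpacked with a few lines of Python. The sketch below works on a sample body modeled on the output example (with the content shortened to one line). Treating a `status_code` of 200 as the marker of a usable result is an assumption, and `successful_pages` is a hypothetical helper name.

```python
import json

# Sample body shaped like the output example (content shortened).
SAMPLE_BODY = """
{
  "results": [
    {
      "content": "<html>CONTENT</html>",
      "created_at": "2024-06-26 13:13:06",
      "updated_at": "2024-06-26 13:13:07",
      "id": null,
      "page": 1,
      "url": "https://www.example.com/",
      "job_id": "12345678900987654321",
      "status_code": 200
    }
  ]
}
"""


def successful_pages(response_json):
    """Return {url: content} for entries whose status_code is 200.

    Assumption: 200 marks a usable result; other codes are skipped here.
    """
    return {
        entry["url"]: entry["content"]
        for entry in response_json.get("results", [])
        if entry.get("status_code") == 200
    }


pages = successful_pages(json.loads(SAMPLE_BODY))
print(pages)  # {'https://www.example.com/': '<html>CONTENT</html>'}
```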


