# Scheduler

[**Scheduler**](https://oxylabs.io/features/scheduler) is a **free feature** of Web Scraper API that lets you automate recurring scraping and parsing jobs by creating schedules.

Check out the video tutorial below to learn more about Scheduler and how it works.

{% embed url="<https://www.youtube.com/watch?v=HJLkFZ_9Z5w>" %}
Step-by-step guide on automating your recurring scraping jobs using Scheduler
{% endembed %}

We advise using Scheduler together with the [**Upload to Cloud Storage**](https://developers.oxylabs.io/scraping-solutions/web-scraper-api/features/result-processing-and-storage/cloud-storage) feature. This way, you can set up your schedule and receive regular data updates in your storage without having to fetch the results from our system yourself.

{% hint style="warning" %}
**IMPORTANT**: Scheduler is a powerful tool that can quickly raise your service bill. We advise testing it with a few job items and a limited number of repeats to ensure you get the correct data at the right intervals. Once that's established, you can stop the test schedule and create a new, scaled-up schedule.
{% endhint %}
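
To gauge how quickly a schedule adds up before scaling it, a rough estimate of the job volume can help. The sketch below is only an approximation under stated assumptions: the hypothetical helper `count_runs` understands just `*` and single integers in each of the five cron fields, requires every field to match, and may not reflect how Scheduler itself interprets more complex expressions.

```python
from datetime import datetime, timedelta


def field_matches(field: str, value: int) -> bool:
    """Match one cron field; only '*' and a single integer are supported here."""
    return field == "*" or int(field) == value


def count_runs(cron: str, start: datetime, end: datetime) -> int:
    """Count the minutes in [start, end] that match a simple five-field cron expression."""
    minute, hour, day, month, weekday = cron.split()
    current = start.replace(second=0, microsecond=0)
    if current < start:
        current += timedelta(minutes=1)
    runs = 0
    while current <= end:  # the end of the window is treated as inclusive
        cron_weekday = current.isoweekday() % 7  # cron convention: 0 = Sunday, 1 = Monday
        if (
            field_matches(minute, current.minute)
            and field_matches(hour, current.hour)
            and field_matches(day, current.day)
            and field_matches(month, current.month)
            and field_matches(weekday, cron_weekday)
        ):
            runs += 1
        current += timedelta(minutes=1)
    return runs


# Hypothetical test window: June 2022, with two job items per run.
runs = count_runs("0 3 * * 1", datetime(2022, 6, 1), datetime(2022, 6, 30, 23, 59))
items_per_run = 2
print(f"{runs} runs x {items_per_run} items = {runs * items_per_run} jobs")
```

Multiplying the number of runs by the number of items gives the number of jobs the schedule would submit over that window.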

## Quick Start

When creating a new schedule, follow the simple steps below.

1. Tell us **how often we should repeat the jobs** by submitting a cron schedule expression;
2. Give us **a bunch of job parameter sets** that we should execute at scheduled times;
3. Let us know **when to stop** by submitting an end time.

See [**Create a new schedule**](#create-a-new-schedule) for an example of submitting a new schedule.

{% hint style="info" %}
**NOTE**: You can also download and import [**this Postman collection**](https://files.gitbook.com/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FiwDdoZGfMbUe5cRL2417%2Fuploads%2FsipIxRVxZmZroadwSuqg%2FScheduler.postman_collection.json?alt=media\&token=dac92525-e5cc-43c2-a8eb-a32e8eba0483) to try out all our Scheduler endpoints. New to Postman? Learn more about this tool [**here**](https://developers.oxylabs.io/guides-for-scraper-apis/using-postman).
{% endhint %}

## Endpoints

Scheduler has several endpoints you can use to control the service:

* [**Create a new schedule**](#create-a-new-schedule)
* [**Get all schedules**](#get-all-schedules)
* [**Get runs information**](#get-runs-information)
* [**Get scheduled jobs**](#get-scheduled-jobs)
* [**Get schedule information**](#get-schedule-info)
* [**Deactivate or reactivate a schedule**](#change-schedule-state)

### Create a new schedule

#### Overview

Use this endpoint to initiate a new schedule.

* **Endpoint**: `https://data.oxylabs.io/v1/schedules`
* **Method**: `POST`
* **Authentication**: `Basic`
* **Request headers**: `Content-Type: application/json`

#### Input

<table><thead><tr><th width="150">Parameter</th><th width="435.3333333333333">Description</th><th>Default Value</th></tr></thead><tbody><tr><td> <mark style="background-color:green;"><strong><code>cron</code></strong></mark> </td><td>Cron schedule expression. It determines how often the submitted schedule will run. Read more <a href="https://crontab.guru/"><strong>here</strong></a> and <a href="https://docs.oracle.com/cd/E12058_01/doc/doc.1014/e12030/cron_expressions.htm"><strong>here</strong></a>.</td><td>-</td></tr><tr><td> <mark style="background-color:green;"><strong><code>items</code></strong></mark> </td><td>List of Scraper APIs job parameter sets that should be executed as part of the schedule.</td><td>-</td></tr><tr><td><mark style="background-color:green;"><strong><code>end_time</code></strong></mark> </td><td>The time at which the schedule should stop running. NB: the end time is inclusive.</td><td>-</td></tr></tbody></table>

Parameters highlighted in green are required.

{% hint style="info" %}
**NOTE**: For guidance on putting together job parameter sets for the **`items`** part of your Scheduler payload, refer to the documentation page of the particular scraper you would like to use (e.g. [**Google**](https://developers.oxylabs.io/scraping-solutions/web-scraper-api/targets/google), [**Amazon**](https://developers.oxylabs.io/scraping-solutions/web-scraper-api/targets/amazon), etc.).
{% endhint %}

The payload below makes Scheduler run two jobs at 03:00 every Monday until `end_time` (inclusive).

```json
{
  "cron": "0 3 * * 1",
  "items": [
    {"source": "universal", "url": "https://ip.oxylabs.io"},
    {"source": "google_search", "query": "stuff"}
  ],
  "end_time": "2032-12-21 12:34:45"
}
```
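
As a minimal sketch of sending this payload from Python, the snippet below builds the `POST` request using only the standard library. `build_create_schedule_request` is a hypothetical helper name, and `USERNAME` and `PASSWORD` are placeholders for your API credentials. The request is only constructed here; the lines that would send it are left commented out.

```python
import base64
import json
import urllib.request

SCHEDULES_URL = "https://data.oxylabs.io/v1/schedules"


def build_create_schedule_request(username, password, cron, items, end_time):
    """Build (but do not send) a POST request that creates a schedule."""
    payload = {"cron": cron, "items": items, "end_time": end_time}
    # Basic authentication: base64-encoded "username:password".
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        SCHEDULES_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


request = build_create_schedule_request(
    "USERNAME",
    "PASSWORD",
    cron="0 3 * * 1",
    items=[
        {"source": "universal", "url": "https://ip.oxylabs.io"},
        {"source": "google_search", "query": "stuff"},
    ],
    end_time="2032-12-21 12:34:45",
)
print(request.get_method(), request.full_url)

# To submit the schedule, uncomment the lines below:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```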

#### Output

The response below confirms that the schedule was created successfully.

```json
{
    "schedule_id": 4134906379157007223,
    "active": true,
    "items_count": 2,
    "cron": "0 3 * * 1",
    "end_time": "2032-12-21 12:34:45",
    "next_run_at": "2022-06-06 10:15:00"
}
```

### Get all schedules

#### Overview

Use this endpoint to get the list of all schedules associated with your user account.

* **Endpoint**: `https://data.oxylabs.io/v1/schedules`
* **Method**: `GET`
* **Authentication**: `Basic`

#### Output

This endpoint returns the list of all schedule IDs associated with the user account making the request.

See the sample response below.

```json
{
    "schedules": [
        1764178033254455101,
        2885262175311057587,
        3251365810325795747,
        4134906379157007223,
        4164931482277157062
    ]
}
```
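
The returned IDs can be plugged into the per-schedule endpoints described further down this page. The sketch below, with the hypothetical helper `schedule_info_urls`, parses a response body like the one above and builds the schedule information URL for each ID. Python integers have arbitrary precision, so the IDs are read exactly; in languages that parse JSON numbers as 64-bit floats, IDs of this size (above 2^53) may need extra care.

```python
import json

SCHEDULES_URL = "https://data.oxylabs.io/v1/schedules"


def schedule_info_urls(response_body: str) -> list[str]:
    """Turn a 'get all schedules' response body into per-schedule info URLs."""
    schedule_ids = json.loads(response_body)["schedules"]
    return [f"{SCHEDULES_URL}/{schedule_id}" for schedule_id in schedule_ids]


# Sample body with two of the IDs from the response above.
sample_body = '{"schedules": [1764178033254455101, 2885262175311057587]}'
for url in schedule_info_urls(sample_body):
    print(url)
```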

### Get runs information

#### Overview

Use this endpoint to list all runs of a schedule, including the metadata of each job and the success rate of each run.

* **Endpoint**: `https://data.oxylabs.io/v1/schedules/{id}/runs`
* **Method**: `GET`
* **Authentication**: `Basic`

#### Output

The payload below contains a sample `/runs` endpoint response.

```json
{
    "runs": [
        {
            "run_id": 25037485,
            "jobs": [
                {
                    "id": 7300439540206948353,
                    "create_status_code": 202,
                    "result_status": "done",
                    "created_at": "2025-02-26 09:00:21",
                    "result_created_at": "2025-02-26 09:00:23"
                },
                {
                    "id": 7300439540169188353,
                    "create_status_code": 202,
                    "result_status": "done",
                    "created_at": "2025-02-26 09:00:21",
                    "result_created_at": "2025-02-26 09:00:22"
                },
                {
                    "id": 7300439540198551553,
                    "create_status_code": 202,
                    "result_status": "done",
                    "created_at": "2025-02-26 09:00:21",
                    "result_created_at": "2025-02-26 09:00:23"
                }
            ],
            "success_rate": 1
        }
    ]
}
```

<table><thead><tr><th width="216">Key</th><th>Description</th><th>Type</th></tr></thead><tbody><tr><td><code>runs</code></td><td>A collection of run objects, each representing one execution of the schedule.</td><td>Array</td></tr><tr><td><code>runs</code>:<code>run_id</code></td><td>A unique identifier for the specific run instance.</td><td>Integer</td></tr><tr><td><code>runs</code>:<code>jobs</code></td><td>A collection of job objects that were executed as part of this run.</td><td>Array</td></tr><tr><td><code>runs</code>:<code>success_rate</code></td><td>The ratio of successful jobs to total jobs in this run (ranges from 0 to 1).</td><td>Number</td></tr><tr><td><code>runs</code>:<code>jobs</code>:<code>id</code></td><td>A unique Oxylabs identifier for the specific job.</td><td>Integer</td></tr><tr><td><code>runs</code>:<code>jobs</code>:<code>create_status_code</code></td><td>HTTP status code returned when the job was created, indicating the initial acceptance of the job request.</td><td>Integer</td></tr><tr><td><code>runs</code>:<code>jobs</code>:<code>result_status</code></td><td>The execution status of the job (e.g., "done", "faulted", "pending").</td><td>String</td></tr><tr><td><code>runs</code>:<code>jobs</code>:<code>created_at</code></td><td>Timestamp when the job was created.</td><td>String</td></tr><tr><td><code>runs</code>:<code>jobs</code>:<code>result_created_at</code></td><td>Timestamp when the job completed and produced a result.</td><td>String</td></tr></tbody></table>
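
As an illustration of how these fields might be consumed, the sketch below condenses a `/runs` response into one summary per run and collects the IDs of jobs whose `result_status` is not `done`. The helper name `summarize_runs` and the shape of the summary are assumptions made for this example, not part of the API.

```python
import json


def summarize_runs(response_body: str) -> list[dict]:
    """Return one summary dict per run in a /runs response body."""
    summaries = []
    for run in json.loads(response_body)["runs"]:
        jobs = run["jobs"]
        summaries.append(
            {
                "run_id": run["run_id"],
                "job_count": len(jobs),
                "success_rate": run["success_rate"],
                # IDs of jobs whose result_status is anything other than "done".
                "not_done_job_ids": [
                    job["id"] for job in jobs if job["result_status"] != "done"
                ],
            }
        )
    return summaries


# Sample body trimmed from the response above (two of the three jobs).
sample_body = json.dumps(
    {
        "runs": [
            {
                "run_id": 25037485,
                "jobs": [
                    {"id": 7300439540206948353, "create_status_code": 202, "result_status": "done"},
                    {"id": 7300439540169188353, "create_status_code": 202, "result_status": "done"},
                ],
                "success_rate": 1,
            }
        ]
    }
)
for summary in summarize_runs(sample_body):
    print(summary)
```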

### Get scheduled jobs

#### Overview

Use this endpoint to get the list of scraping jobs executed as a result of running a schedule.

* **Endpoint**: `https://data.oxylabs.io/v1/schedules/{id}/jobs`
* **Method**: `GET`
* **Authentication**: `Basic`

#### Output

The payload below contains a sample `/jobs` endpoint response.

```json
{
    "jobs": [
        7300439540206948353,
        7300439540169188353,
        7300439540198551553,
        ...
    ]
}
```

### Get schedule information

#### Overview

Use this endpoint to get information about a specific schedule.

* **Endpoint**: `https://data.oxylabs.io/v1/schedules/{id}`
* **Method**: `GET`
* **Authentication**: `Basic`

#### Output

The payload below contains a sample schedule info response.

<pre class="language-json"><code class="lang-json">{
    "schedule_id": 1764178033254455101,
    "active": true,
    "items_count": 3,
    "cron": "0 3 * * 1",
    "end_time": "2032-12-21 12:34:45",
    "next_run_at": "2022-06-06 10:18:00",
<strong>    "links": [
</strong>        {
            "rel": "runs",
            "href": "/v1/schedules/1764178033254455101/runs",
            "method": "GET"
        },
        {
            "rel": "jobs",
            "href": "/v1/schedules/1764178033254455101/jobs",
            "method": "GET"
        }
    ],
    "stats": {
        "total_job_count": 3,
        "job_create_outcomes": [
            {
                "status_code": 202,
                "job_count": 3,
                "ratio": 1
            }
        ],
        "job_result_outcomes": [
            {
                "status": "done",
                "job_count": 2,
                "ratio": 0.67
            },
            {
                "status": "faulted",
                "job_count": 1,
                "ratio": 0.33
            }
        ]
    }
}
</code></pre>

<table><thead><tr><th width="216">Key</th><th>Description</th><th>Type</th></tr></thead><tbody><tr><td><code>schedule_id</code></td><td>The unique ID of the schedule.</td><td>Integer</td></tr><tr><td><code>active</code></td><td>Whether the schedule is currently active.</td><td>Boolean</td></tr><tr><td><code>items_count</code></td><td>The number of items (jobs) in the schedule.</td><td>Integer</td></tr><tr><td><code>cron</code></td><td>The cron expression associated with the schedule.</td><td>String</td></tr><tr><td><code>end_time</code></td><td>The time at which the schedule will stop running.</td><td>String</td></tr><tr><td><code>next_run_at</code></td><td>The time at which the schedule will run next.</td><td>String</td></tr><tr><td><code>links</code></td><td>A collection of link objects that define available API endpoints related to a schedule resource.</td><td>Array</td></tr><tr><td><code>links</code>:<code>rel</code></td><td>The relationship identifier that explains the purpose of the link relative to the parent resource.</td><td>String</td></tr><tr><td><code>links</code>:<code>href</code></td><td>The URL path to the API endpoint. Represents the resource location that can be accessed.</td><td>String</td></tr><tr><td><code>links</code>:<code>method</code></td><td>The HTTP method to be used when accessing this endpoint.</td><td>String</td></tr><tr><td><code>stats</code></td><td>Contains job creation and job completion statistics.</td><td>JSON Object</td></tr><tr><td><code>stats</code>:<code>total_job_count</code></td><td>The number of items (jobs) in the schedule.</td><td>Integer</td></tr><tr><td><code>stats</code>:<code>job_create_outcomes</code></td><td>Contains job creation statistics.</td><td>JSON Array</td></tr><tr><td><code>stats</code>:<code>job_create_outcomes</code>:<code>status_code</code></td><td>The status code received in response to an attempt to execute the schedule (create a scraping/parsing job).</td><td>Integer</td></tr><tr><td><code>stats</code>:<code>job_create_outcomes</code>:<code>job_count</code></td><td>The number of job creation attempts that resulted in that particular status code.</td><td>Integer</td></tr><tr><td><code>stats</code>:<code>job_create_outcomes</code>:<code>ratio</code></td><td>The ratio between the number of job creation attempts that resulted in that particular status code and the total number of job creation attempts.</td><td>Float</td></tr><tr><td><code>stats</code>:<code>job_result_outcomes</code></td><td>Contains the outcome statistics of scraping/parsing jobs executed as part of the schedule.</td><td>JSON Array</td></tr><tr><td><code>stats</code>:<code>job_result_outcomes</code>:<code>status</code></td><td>The job status. Possible values: <code>pending</code> (the job is still being processed), <code>done</code> (the job has been completed successfully), <code>faulted</code> (the job has failed).</td><td>String</td></tr><tr><td><code>stats</code>:<code>job_result_outcomes</code>:<code>job_count</code></td><td>The number of jobs that resulted in that particular <code>status</code>.</td><td>Integer</td></tr><tr><td><code>stats</code>:<code>job_result_outcomes</code>:<code>ratio</code></td><td>The ratio between the number of jobs with that particular status and the total number of jobs created.</td><td>Float</td></tr></tbody></table>
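
The `links` and `stats` objects lend themselves to simple monitoring. The sketch below resolves each `href` against `https://data.oxylabs.io` (an assumption based on the endpoint URLs on this page) and looks up the share of jobs with a given status. `resolve_links` and `result_ratio` are hypothetical helper names.

```python
from urllib.parse import urljoin

API_ROOT = "https://data.oxylabs.io"  # assumed base for the relative hrefs


def resolve_links(schedule_info: dict) -> dict[str, str]:
    """Map each link's rel to an absolute URL."""
    return {
        link["rel"]: urljoin(API_ROOT, link["href"])
        for link in schedule_info.get("links", [])
    }


def result_ratio(schedule_info: dict, status: str) -> float:
    """Return the ratio of jobs with the given status, or 0.0 if none are listed."""
    for outcome in schedule_info["stats"]["job_result_outcomes"]:
        if outcome["status"] == status:
            return outcome["ratio"]
    return 0.0


# Sample data trimmed from the schedule information response above.
sample_info = {
    "schedule_id": 1764178033254455101,
    "links": [
        {"rel": "runs", "href": "/v1/schedules/1764178033254455101/runs", "method": "GET"},
        {"rel": "jobs", "href": "/v1/schedules/1764178033254455101/jobs", "method": "GET"},
    ],
    "stats": {
        "total_job_count": 3,
        "job_result_outcomes": [
            {"status": "done", "job_count": 2, "ratio": 0.67},
            {"status": "faulted", "job_count": 1, "ratio": 0.33},
        ],
    },
}
print(resolve_links(sample_info)["runs"])
print("faulted ratio:", result_ratio(sample_info, "faulted"))
```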

### Deactivate or reactivate a schedule

#### Overview

Use this endpoint to activate or deactivate a particular schedule.

* **Endpoint**: `https://data.oxylabs.io/v1/schedules/{id}/state`
* **Method**: `PUT`
* **Authentication**: `Basic`

#### Input

Set `active` to `false` to stop the execution of a particular schedule.

Set `active` to `true` to reactivate a previously stopped schedule.

```json
{
  "active": false
}
```

#### Output

```json
null
```

The standard response is an empty response body with a `202` status code.
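
A minimal Python sketch of this call, using only the standard library, might look as follows. `build_state_request` is a hypothetical helper, `USERNAME` and `PASSWORD` are placeholders, and the `Content-Type` header is an assumption carried over from the schedule creation endpoint. The request is built but not sent.

```python
import base64
import json
import urllib.request


def build_state_request(username, password, schedule_id, active):
    """Build (but do not send) a PUT request that changes a schedule's state."""
    url = f"https://data.oxylabs.io/v1/schedules/{schedule_id}/state"
    # Basic authentication: base64-encoded "username:password".
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps({"active": active}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",  # assumed, as for schedule creation
            "Authorization": f"Basic {token}",
        },
        method="PUT",
    )


# Deactivate the schedule from the example above.
request = build_state_request("USERNAME", "PASSWORD", 1764178033254455101, active=False)
print(request.get_method(), request.full_url, request.data.decode())

# To send it, uncomment the lines below; a 202 status is expected:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```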

## API response codes

For API response codes, refer to the [**API**](https://developers.oxylabs.io/scraping-solutions/response-codes#api) section.

