# Scheduler

[**Scheduler**](https://oxylabs.io/features/scheduler) is a **free feature** of Web Scraper API that lets you automate recurring scraping and parsing jobs by creating schedules.

Check out the video tutorial below to learn more about Scheduler and how it works.

{% embed url="<https://www.youtube.com/watch?v=HJLkFZ_9Z5w>" %}
Step-by-step guide on automating your recurring scraping jobs using Scheduler
{% endembed %}

We advise using Scheduler together with the [**Upload to Cloud Storage**](https://developers.oxylabs.io/scraping-solutions/web-scraper-api/features/result-processing-and-storage/cloud-storage) feature. This way, you can set up your schedule and receive regular data updates in your storage without having to fetch results from our system.

{% hint style="warning" %}
**IMPORTANT**: Scheduler is a powerful tool that can quickly raise your service bill. We advise testing it with a few job items and a limited number of repeats to ensure you get the correct data at the right intervals. Once that's established, you can stop the test schedule and create a new, scaled-up schedule.
{% endhint %}

## Quick Start

When creating a new schedule, follow the simple steps below.

1. Tell us **how often we should repeat the jobs** by submitting a cron schedule expression;
2. Give us **a list of job parameter sets** that we should execute at scheduled times;
3. Let us know **when to stop** by submitting an end time.

See [**Create a new schedule**](#create-a-new-schedule) for an example of submitting a new schedule.
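The three steps above map onto the three fields of a schedule payload. The following is a minimal sketch of assembling one in Python. The helper `build_schedule_payload` is hypothetical, and the `end_time` layout (`YYYY-MM-DD HH:MM:SS`) is an assumption inferred from the sample payload on this page.

```python
from datetime import datetime

# Assumption: end_time uses the "YYYY-MM-DD HH:MM:SS" layout seen in the sample payload.
END_TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def build_schedule_payload(cron, items, end_time):
    """Assemble a Scheduler payload from the three Quick Start inputs (hypothetical helper)."""
    if not cron.strip():
        raise ValueError("cron expression must not be empty")
    if not items:
        raise ValueError("at least one job parameter set is required")
    return {
        "cron": cron,                                    # step 1: how often to repeat
        "items": list(items),                            # step 2: which jobs to run
        "end_time": end_time.strftime(END_TIME_FORMAT),  # step 3: when to stop
    }


# Start small, as advised above: one item and a known end time.
payload = build_schedule_payload(
    cron="0 3 * * 1",
    items=[{"source": "universal", "url": "https://ip.oxylabs.io"}],
    end_time=datetime(2032, 12, 21, 12, 34, 45),
)
```

Building the payload in one place makes it easy to test a schedule with a single item first and scale up the `items` list later.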

{% hint style="info" %}
**NOTE**: You can also download and import [**this Postman collection**](https://files.gitbook.com/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FiwDdoZGfMbUe5cRL2417%2Fuploads%2FsipIxRVxZmZroadwSuqg%2FScheduler.postman_collection.json?alt=media\&token=dac92525-e5cc-43c2-a8eb-a32e8eba0483) to try out all our Scheduler endpoints. New to Postman? Learn more about this tool [**here**](https://developers.oxylabs.io/guides-for-scraper-apis/using-postman).
{% endhint %}

## Endpoints

Scheduler has several endpoints you can use to control the service:

* [**Create a new schedule**](#create-a-new-schedule)
* [**Get all schedules**](#get-all-schedules)
* [**Get runs information**](#get-runs-information)
* [**Get scheduled jobs**](#get-scheduled-jobs)
* [**Get schedule information**](#get-schedule-information)
* [**Deactivate or reactivate a schedule**](#deactivate-or-reactivate-a-schedule)

### Create a new schedule

#### Overview

Use this endpoint to initiate a new schedule.

* **Endpoint**: `https://data.oxylabs.io/v1/schedules`
* **Method**: `POST`
* **Authentication**: `Basic`
* **Request headers**: `Content-Type: application/json`

#### Input

<table><thead><tr><th width="150">Parameter</th><th width="435.3333333333333">Description</th><th>Default Value</th></tr></thead><tbody><tr><td> <mark style="background-color:green;"><strong><code>cron</code></strong></mark> </td><td>Cron schedule expression. It determines how often the submitted schedule will run. Read more <a href="https://crontab.guru/"><strong>here</strong></a> and <a href="https://docs.oracle.com/cd/E12058_01/doc/doc.1014/e12030/cron_expressions.htm"><strong>here</strong></a>.</td><td>-</td></tr><tr><td> <mark style="background-color:green;"><strong><code>items</code></strong></mark> </td><td>List of Scraper APIs job parameter sets that should be executed as part of the schedule.</td><td>-</td></tr><tr><td><mark style="background-color:green;"><strong><code>end_time</code></strong></mark> </td><td>The time at which the schedule should stop running. NB: the end time is inclusive.</td><td>-</td></tr></tbody></table>

Parameters highlighted in green are required.

{% hint style="info" %}
**NOTE**: For guidance on putting together job parameter sets for the **`items`** part of your Scheduler payload, refer to the documentation page of the particular scraper you would like to use (e.g. [**Google**](https://developers.oxylabs.io/scraping-solutions/web-scraper-api/targets/google), [**Amazon**](https://developers.oxylabs.io/scraping-solutions/web-scraper-api/targets/amazon), etc.).
{% endhint %}

The payload below makes Scheduler run two jobs at 03:00 every Monday until `end_time` (inclusive).

```json
{
  "cron": "0 3 * * 1",
  "items": [
    {"source": "universal", "url": "https://ip.oxylabs.io"},
    {"source": "google_search", "query": "stuff"}
  ],
  "end_time": "2032-12-21 12:34:45"
}
```
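As an illustration, the request described above can be prepared with Python's standard library. This is a minimal sketch rather than an official client: `build_create_request` and `basic_auth_header` are hypothetical helper names, and `USERNAME` and `PASSWORD` are placeholders for your API credentials. The request is only built, not sent, so running the snippet creates no schedule.

```python
import base64
import json
import urllib.request

SCHEDULES_URL = "https://data.oxylabs.io/v1/schedules"


def basic_auth_header(username, password):
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def build_create_request(username, password, payload):
    """Prepare (but do not send) the POST request that creates a schedule."""
    return urllib.request.Request(
        SCHEDULES_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": basic_auth_header(username, password),
        },
        method="POST",
    )


payload = {
    "cron": "0 3 * * 1",  # minute 0, hour 3, any day of month, any month, Monday
    "items": [
        {"source": "universal", "url": "https://ip.oxylabs.io"},
        {"source": "google_search", "query": "stuff"},
    ],
    "end_time": "2032-12-21 12:34:45",
}
request = build_create_request("USERNAME", "PASSWORD", payload)

# Sending is left commented out so the sketch has no side effects:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```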

#### Output

The response below confirms that the schedule was created successfully.

```json
{
    "schedule_id": 4134906379157007223,
    "active": true,
    "items_count": 2,
    "cron": "0 3 * * 1",
    "end_time": "2032-12-21 12:34:45",
    "next_run_at": "2022-06-06 10:15:00"
}
```

### Get all schedules

#### Overview

Use this endpoint to get the list of all schedules associated with your user account.

* **Endpoint**: `https://data.oxylabs.io/v1/schedules`
* **Method**: `GET`
* **Authentication**: `Basic`

#### Output

The response contains the IDs of all schedules associated with the user account making the request.

See the sample response below.

```json
{
    "schedules": [
        1764178033254455101,
        2885262175311057587,
        3251365810325795747,
        4134906379157007223,
        4164931482277157062
    ]
}
```
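A typical next step is to look up each schedule individually. The sketch below, which uses a hypothetical helper name `schedule_urls`, parses a response like the one above and derives the per-schedule URL documented under "Get schedule information". Note that the IDs exceed 2^53, so they should be kept as integers or strings and never converted to floating point; Python's `json` module parses them exactly.

```python
import json

SCHEDULES_URL = "https://data.oxylabs.io/v1/schedules"


def schedule_urls(list_response):
    """Map each schedule ID in a list response to its schedule info URL."""
    return [f"{SCHEDULES_URL}/{schedule_id}" for schedule_id in list_response["schedules"]]


# A shortened copy of the sample response, parsed from raw JSON text.
raw = '{"schedules": [1764178033254455101, 2885262175311057587]}'
urls = schedule_urls(json.loads(raw))
for url in urls:
    print(url)
```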

### Get runs information

#### Overview

Use this endpoint to list all runs of a schedule, including the metadata of each job and the success rate of each run.

* **Endpoint**: `https://data.oxylabs.io/v1/schedules/{id}/runs`
* **Method**: `GET`
* **Authentication**: `Basic`

#### Output

The payload below contains a sample `/runs` endpoint response.

```json
{
    "runs": [
        {
            "run_id": 25037485,
            "jobs": [
                {
                    "id": 7300439540206948353,
                    "create_status_code": 202,
                    "result_status": "done",
                    "created_at": "2025-02-26 09:00:21",
                    "result_created_at": "2025-02-26 09:00:23"
                },
                {
                    "id": 7300439540169188353,
                    "create_status_code": 202,
                    "result_status": "done",
                    "created_at": "2025-02-26 09:00:21",
                    "result_created_at": "2025-02-26 09:00:22"
                },
                {
                    "id": 7300439540198551553,
                    "create_status_code": 202,
                    "result_status": "done",
                    "created_at": "2025-02-26 09:00:21",
                    "result_created_at": "2025-02-26 09:00:23"
                }
            ],
            "success_rate": 1
        }
    ]
}
```

<table><thead><tr><th width="216">Key</th><th>Description</th><th>Type</th></tr></thead><tbody><tr><td><code>runs</code></td><td>A collection of run objects that represent execution instances of a scheduled task or workflow.</td><td>Array</td></tr><tr><td><code>runs</code>:<code>run_id</code></td><td>A unique identifier for the specific run instance.</td><td>Integer</td></tr><tr><td><code>runs</code>:<code>jobs</code></td><td>A collection of job objects that were executed as part of this run.</td><td>Array</td></tr><tr><td><code>runs</code>:<code>success_rate</code></td><td>The ratio of successful jobs to total jobs in this run (ranges from 0 to 1).</td><td>Number</td></tr><tr><td><code>runs</code>:<code>jobs</code>:<code>id</code></td><td>A unique Oxylabs identifier for the specific job.</td><td>Integer</td></tr><tr><td><code>runs</code>:<code>jobs</code>:<code>create_status_code</code></td><td>HTTP status code returned when the job was created, indicating the initial acceptance of the job request.</td><td>Integer</td></tr><tr><td><code>runs</code>:<code>jobs</code>:<code>result_status</code></td><td>The execution status of the job (e.g., <code>done</code>, <code>faulted</code>, <code>pending</code>).</td><td>String</td></tr><tr><td><code>runs</code>:<code>jobs</code>:<code>created_at</code></td><td>Timestamp of when the job was created.</td><td>String</td></tr><tr><td><code>runs</code>:<code>jobs</code>:<code>result_created_at</code></td><td>Timestamp of when the job completed and produced a result.</td><td>String</td></tr></tbody></table>
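To show how the fields in a `/runs` response relate to each other, the sketch below recomputes a run's success rate from its jobs and lists the jobs that did not finish. It assumes that a job counts as successful when its `result_status` is `done`; the function names are hypothetical.

```python
def run_success_rate(run, success_status="done"):
    """Share of jobs in a run whose result_status equals success_status."""
    jobs = run["jobs"]
    if not jobs:
        return 0.0
    successful = sum(1 for job in jobs if job["result_status"] == success_status)
    return successful / len(jobs)


def unfinished_job_ids(run, success_status="done"):
    """IDs of jobs in a run that have not (or not yet) completed successfully."""
    return [job["id"] for job in run["jobs"] if job["result_status"] != success_status]


# A shortened copy of the sample response above.
response = {
    "runs": [
        {
            "run_id": 25037485,
            "jobs": [
                {"id": 7300439540206948353, "create_status_code": 202, "result_status": "done"},
                {"id": 7300439540169188353, "create_status_code": 202, "result_status": "done"},
                {"id": 7300439540198551553, "create_status_code": 202, "result_status": "done"},
            ],
            "success_rate": 1,
        }
    ]
}

for run in response["runs"]:
    print(run["run_id"], run_success_rate(run), unfinished_job_ids(run))
```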

### Get scheduled jobs

#### Overview

Use this endpoint to get the list of scraping jobs executed as a result of running a schedule.

* **Endpoint**: `https://data.oxylabs.io/v1/schedules/{id}/jobs`
* **Method**: `GET`
* **Authentication**: `Basic`

#### Output

The payload below contains a sample `/jobs` endpoint response.

```json
{
    "jobs": [
        7300439540206948353,
        7300439540169188353,
        7300439540198551553,
        ...
    ]
}
```

### Get schedule information

#### Overview

Use this endpoint to get information about a specific schedule.

* **Endpoint**: `https://data.oxylabs.io/v1/schedules/{id}`
* **Method**: `GET`
* **Authentication**: `Basic`

#### Output

The payload below contains a sample schedule info response.

<pre class="language-json"><code class="lang-json">{
    "schedule_id": 1764178033254455101,
    "active": true,
    "items_count": 3,
    "cron": "0 3 * * 1",
    "end_time": "2032-12-21 12:34:45",
    "next_run_at": "2022-06-06 10:18:00",
<strong>    "links": [
</strong>        {
            "rel": "runs",
            "href": "/v1/schedules/1764178033254455101/runs",
            "method": "GET"
        },
        {
            "rel": "jobs",
            "href": "/v1/schedules/1764178033254455101/jobs",
            "method": "GET"
        }
    ],
    "stats": {
        "total_job_count": 3,
        "job_create_outcomes": [
            {
                "status_code": 202,
                "job_count": 3,
                "ratio": 1
            }
        ],
        "job_result_outcomes": [
            {
                "status": "done",
                "job_count": 2,
                "ratio": 0.67
            },
            {
                "status": "faulted",
                "job_count": 1,
                "ratio": 0.33
            }
        ]
    }
}
</code></pre>

<table><thead><tr><th width="216">Key</th><th>Description</th><th>Type</th></tr></thead><tbody><tr><td><code>schedule_id</code></td><td>The unique ID of the schedule.</td><td>Integer</td></tr><tr><td><code>active</code></td><td>Whether the schedule is currently active.</td><td>Boolean</td></tr><tr><td><code>items_count</code></td><td>The number of items (jobs) in the schedule.</td><td>Integer</td></tr><tr><td><code>cron</code></td><td>The cron expression associated with the schedule.</td><td>String</td></tr><tr><td><code>end_time</code></td><td>The time at which the schedule will stop being repeated.</td><td>String</td></tr><tr><td><code>next_run_at</code></td><td>The time at which the schedule will run next.</td><td>String</td></tr><tr><td><code>links</code></td><td>A collection of link objects that define available API endpoints related to a schedule resource.</td><td>Array</td></tr><tr><td><code>links</code>:<code>rel</code></td><td>The relationship identifier that explains the purpose of the link relative to the parent resource.</td><td>String</td></tr><tr><td><code>links</code>:<code>href</code></td><td>The URL path to the API endpoint. Represents the resource location that can be accessed.</td><td>String</td></tr><tr><td><code>links</code>:<code>method</code></td><td>The HTTP method to be used when accessing this endpoint.</td><td>String</td></tr><tr><td><code>stats</code></td><td>Contains job creation and job completion statistics.</td><td>JSON Object</td></tr><tr><td><code>stats</code>:<code>total_job_count</code></td><td>The number of items (jobs) in the schedule.</td><td>Integer</td></tr><tr><td><code>stats</code>:<code>job_create_outcomes</code></td><td>Contains job creation statistics.</td><td>JSON Array</td></tr><tr><td><code>stats</code>:<code>job_create_outcomes</code>:<code>status_code</code></td><td>The status code received in response to an attempt to execute the schedule (create a scraping/parsing job).</td><td>Integer</td></tr><tr><td><code>stats</code>:<code>job_create_outcomes</code>:<code>job_count</code></td><td>The number of job creation attempts that resulted in that particular status code.</td><td>Integer</td></tr><tr><td><code>stats</code>:<code>job_create_outcomes</code>:<code>ratio</code></td><td>The ratio between the number of job creation attempts that resulted in that particular status code and the total number of job creation attempts.</td><td>Float</td></tr><tr><td><code>stats</code>:<code>job_result_outcomes</code></td><td>Contains the outcome statistics of scraping/parsing jobs executed as part of the schedule.</td><td>JSON Array</td></tr><tr><td><code>stats</code>:<code>job_result_outcomes</code>:<code>status</code></td><td>The job status. Possible values: <code>pending</code> (the job is still being processed), <code>done</code> (the job has been completed successfully), <code>faulted</code> (the job has failed).</td><td>String</td></tr><tr><td><code>stats</code>:<code>job_result_outcomes</code>:<code>job_count</code></td><td>The number of jobs that resulted in that particular <code>status</code>.</td><td>Integer</td></tr><tr><td><code>stats</code>:<code>job_result_outcomes</code>:<code>ratio</code></td><td>The ratio between the number of jobs with that particular status and the total number of jobs created.</td><td>Float</td></tr></tbody></table>
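The `links` and `ratio` fields described above can be worked through in a few lines. This is a minimal sketch under two assumptions: that `href` paths are relative to `https://data.oxylabs.io` (the host used by the endpoints on this page), and that ratios are `job_count` divided by `total_job_count`, rounded to two decimals as in the sample. The helper names are hypothetical.

```python
from urllib.parse import urljoin

# Assumption: link paths resolve against the host used by the documented endpoints.
API_BASE = "https://data.oxylabs.io"


def link_urls(schedule):
    """Map each link's rel to an absolute URL."""
    return {link["rel"]: urljoin(API_BASE, link["href"]) for link in schedule["links"]}


def result_ratios(schedule):
    """Recompute per-status ratios from job counts, rounded to two decimals."""
    stats = schedule["stats"]
    total = stats["total_job_count"]
    if total == 0:
        return {}
    return {
        outcome["status"]: round(outcome["job_count"] / total, 2)
        for outcome in stats["job_result_outcomes"]
    }


# A shortened copy of the sample response above.
schedule = {
    "schedule_id": 1764178033254455101,
    "links": [
        {"rel": "runs", "href": "/v1/schedules/1764178033254455101/runs", "method": "GET"},
        {"rel": "jobs", "href": "/v1/schedules/1764178033254455101/jobs", "method": "GET"},
    ],
    "stats": {
        "total_job_count": 3,
        "job_result_outcomes": [
            {"status": "done", "job_count": 2, "ratio": 0.67},
            {"status": "faulted", "job_count": 1, "ratio": 0.33},
        ],
    },
}

print(link_urls(schedule))
print(result_ratios(schedule))
```

With the sample data, two of three jobs are `done` (2 / 3 ≈ 0.67) and one is `faulted` (1 / 3 ≈ 0.33), which matches the `ratio` values in the response.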

### Deactivate or reactivate a schedule

#### Overview

Use this endpoint to activate or deactivate a particular schedule.

* **Endpoint**: `https://data.oxylabs.io/v1/schedules/{id}/state`
* **Method**: `PUT`
* **Authentication**: `Basic`

#### Input

Set `active` to `false` to stop the execution of a particular schedule, or to `true` to reactivate a previously stopped schedule.

```json
{
  "active": false
}
```

#### Output

```json
null
```

The standard response is an empty response body with a `202` status code.
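For illustration, the state change can be prepared as follows. This is a minimal sketch with a hypothetical helper `build_state_request`; `USERNAME` and `PASSWORD` are placeholders, and sending `Content-Type: application/json` is an assumption carried over from the create endpoint. The request is only built, not sent.

```python
import base64
import json
import urllib.request


def build_state_request(username, password, schedule_id, active):
    """Prepare (but do not send) the PUT request that changes a schedule's state."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"https://data.oxylabs.io/v1/schedules/{schedule_id}/state",
        data=json.dumps({"active": bool(active)}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",  # assumption, see lead-in
            "Authorization": f"Basic {token}",
        },
        method="PUT",
    )


# Deactivate the schedule used in the samples on this page.
request = build_state_request("USERNAME", "PASSWORD", 1764178033254455101, active=False)

# Sending is left commented out so the sketch has no side effects.
# A 202 status with an empty body indicates success, as described above:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```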

## API response codes

For API response codes, refer to the [**API**](https://developers.oxylabs.io/scraping-solutions/response-codes#api) section.
