# Cloud Storage

Scraper API job results are stored in our storage. You can retrieve them by sending a `GET` request to the `/results` endpoint.

As an alternative, we can upload the results to your cloud storage. This way, you don't have to make extra requests to fetch results – everything goes directly to your storage bucket.

{% hint style="info" %}
Cloud storage integration works only with the [**Push-Pull**](https://developers.oxylabs.io/scraping-solutions/web-scraper-api/integration-methods/push-pull) integration method.
{% endhint %}

Currently, we support these cloud storage services:

* [**Google Cloud Storage**](#google-cloud-storage)
* [**Amazon S3**](#amazon-s3)
* [**Alibaba Cloud Object Storage Service (OSS)**](#alibaba-cloud-object-storage-service-oss)
* [**BytePlus Torch Object Storage (TOS)**](#byteplus-tos)
* [**Other S3-compatible storage**](#other-s3-compatible-storage)

If you'd like to use a different type of storage, please contact your account manager to discuss the feature delivery timeline.

The upload path looks like this: `YOUR_BUCKET_NAME/job_ID.json`. You'll find the job ID in the response that you receive from us after submitting a job.
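As a minimal sketch of the path format above, the helper below derives the expected object path from a bucket name and a job ID. The function name and the sample values are hypothetical, and the sketch assumes you have already read the job ID from the job submission response.

```python
def expected_upload_path(bucket_name: str, job_id: str) -> str:
    """Return the object path in the form YOUR_BUCKET_NAME/job_ID.json."""
    # Trim stray slashes so the bucket name and file name join cleanly.
    return f"{bucket_name.strip('/')}/{job_id}.json"


# Hypothetical bucket name and job ID, for illustration only.
print(expected_upload_path("my-bucket", "1234567890"))
```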

#### Input

<table><thead><tr><th>Parameter</th><th width="187.33333333333331">Description</th><th>Valid values</th></tr></thead><tbody><tr><td><code>storage_type</code></td><td>Your cloud storage type.</td><td><p><code>gcs</code> (Google Cloud Storage);</p><p><code>s3</code> (AWS S3); <code>tos</code> (BytePlus TOS); </p><p><code>s3_compatible</code> (any S3-compatible storage).</p></td></tr><tr><td><code>storage_url</code></td><td>Your cloud storage bucket name / URL.</td><td><ul><li>Any <code>s3</code> or <code>gcs</code> bucket name;</li><li>Any <code>tos</code> or <code>s3_compatible</code> storage URL.</li></ul></td></tr></tbody></table>

## **Google Cloud Storage**

The payload below makes Web Scraper API scrape `https://example.com` and put the result on a Google Cloud Storage bucket.

```json
{
    "source": "universal",
    "query": "https://example.com",
    "storage_type": "gcs",
    "storage_url": "bucket_name/path"
}
```
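The following Python sketch shows one way to wrap the payload above in an HTTP request. It only builds the request and does not send it. The endpoint URL and the use of HTTP Basic authentication are assumptions not stated on this page, so please confirm both in the Push-Pull documentation. `USERNAME` and `PASSWORD` are placeholders.

```python
import base64
import json
import urllib.request

# Assumption: the Push-Pull job submission endpoint. It is not stated on this
# page; confirm it in the Push-Pull documentation before use.
JOBS_ENDPOINT = "https://data.oxylabs.io/v1/queries"


def build_job_request(username: str, password: str,
                      storage_type: str, storage_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the payload above."""
    payload = {
        "source": "universal",
        "query": "https://example.com",
        "storage_type": storage_type,
        "storage_url": storage_url,
    }
    # Assumption: the API accepts HTTP Basic authentication.
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        JOBS_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


request = build_job_request("USERNAME", "PASSWORD", "gcs", "bucket_name/path")
# To actually submit the job, pass the request to urllib.request.urlopen().
```

The same helper works for the other storage types on this page: only the `storage_type` and `storage_url` arguments change.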

To get your job results uploaded to your Google Cloud Storage bucket, please set up special permissions for our service as shown below:

{% stepper %}
{% step %}
**Create a custom role**

<div align="left"><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FpMD51jtIU5SRROKn44uU%2Fgcs-1.png?alt=media&#x26;token=11a7b4bd-6065-4cd9-a54b-f78ba0950207" alt=""></div>
{% endstep %}

{% step %}
**Add `storage.objects.create` permission**

<div align="left"><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FAGQ77gWpG52CENqdPFH6%2Fgcs-2.png?alt=media&#x26;token=dd9c9cdb-e314-4d7f-b44c-d24ade92ba03" alt=""></div>
{% endstep %}

{% step %}
**Assign it to Oxylabs**

In the **New members** field, enter the following **Oxylabs service account email**:

```
oxyserps-storage@oxyserps-storage.iam.gserviceaccount.com
```

<div align="left"><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FSPPnxlcOgf9M7oNgFa6p%2Fgcs-3.png?alt=media&#x26;token=c37c98fb-6465-4137-a1d6-6d13883036df" alt=""></div>
{% endstep %}
{% endstepper %}

## **Amazon S3**

The payload below makes Web Scraper API scrape `https://example.com` and put the result on an Amazon S3 bucket.

```json
{
    "source": "universal",
    "query": "https://example.com",
    "storage_type": "s3",
    "storage_url": "bucket_name/path"
}
```

To get your job results uploaded to your Amazon S3 bucket, please set up access permissions for our service. To do that, go to [**https://s3.console.aws.amazon.com/**](https://s3.console.aws.amazon.com/) → **`S3`** → **`Storage`** → **`Bucket Name`** (if you don't have one, create a new one) → **`Permissions`** → **`Bucket Policy`**.

<div align="left"><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2Fys4s7QoGvRxiqUF1HLMJ%2Fs3_bucket_policy.png?alt=media&#x26;token=bb48a995-8623-4358-bee6-b5e358874591" alt=""></div>

You can find the bucket policy attached below or in the code sample area.

{% file src="<https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FDk6z5zm41f4Z0YMlHjmp%2Fs3_bucket_policy.json?alt=media&token=831c33a5-86c3-4627-88d9-b6dbc9bd3002>" %}
s3 bucket policy
{% endfile %}

Replace `YOUR_BUCKET_NAME` in the policy with the name of your bucket. This policy allows us to write to your bucket, grant you access to the uploaded files, and determine the bucket's location.

```json
{
    "Version": "2012-10-17",
    "Id": "Policy1577442634787",
    "Statement": [
        {
            "Sid": "Stmt1577442633719",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::324311890426:user/oxylabs.s3.uploader"
            },
            "Action": "s3:GetBucketLocation",
            "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME"
        },
        {
            "Sid": "Stmt1577442633719",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::324311890426:user/oxylabs.s3.uploader"
            },
            "Action": [
                "s3:PutObject",
                "s3:PutObjectAcl"
            ],
            "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME/*"
        }
    ]
}
```
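If you manage several buckets, you may prefer to generate the policy rather than edit it by hand. The sketch below rebuilds the policy above as a Python dictionary and substitutes the bucket name. The function name and the sample bucket name are hypothetical; every other value is copied from the policy above.

```python
import json

# Principal copied from the policy above.
OXYLABS_UPLOADER_ARN = "arn:aws:iam::324311890426:user/oxylabs.s3.uploader"


def build_bucket_policy(bucket_name: str) -> dict:
    """Return the bucket policy shown above for the given bucket name."""
    if not bucket_name:
        raise ValueError("bucket_name must not be empty")
    principal = {"AWS": OXYLABS_UPLOADER_ARN}
    return {
        "Version": "2012-10-17",
        "Id": "Policy1577442634787",
        "Statement": [
            {
                # Lets the uploader look up the bucket's location.
                "Sid": "Stmt1577442633719",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:GetBucketLocation",
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {
                # Lets the uploader write objects and set their ACLs.
                "Sid": "Stmt1577442633719",
                "Effect": "Allow",
                "Principal": principal,
                "Action": ["s3:PutObject", "s3:PutObjectAcl"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


# Hypothetical bucket name, for illustration only.
policy = build_bucket_policy("my-example-bucket")
print(json.dumps(policy, indent=4))
```

The printed JSON can be pasted into the **Bucket Policy** editor.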

## Alibaba Cloud Object Storage Service (OSS)

The payload below makes Web Scraper API scrape `https://example.com` and put the result on an Alibaba Cloud OSS bucket.

```json
{
    "source": "universal",
    "query": "https://example.com",
    "storage_type": "s3_compatible",
    "storage_url": "https://ACCESS_KEY_ID:ACCESS_KEY_SECRET@BUCKET_NAME.oss-REGION.aliyuncs.com/FOLDER_NAME"
}
```

### Forming the Storage URL <a href="#forming-the-storage-url" id="forming-the-storage-url"></a>

Storage URL format:

```http
https://ACCESS_KEY_ID:ACCESS_KEY_SECRET@BUCKET_NAME.oss-REGION.aliyuncs.com/FOLDER_NAME
```

{% hint style="warning" %}
Currently, **we cannot upload to the root bucket**. Please provide a specific folder name for your uploads.
{% endhint %}
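The format and the folder requirement above can be sketched as a small helper. This is an illustration only: the function name and the sample values (including the `eu-central-1` region) are hypothetical, and the credentials are inserted as-is, mirroring the format shown above.

```python
def build_oss_storage_url(access_key_id: str, access_key_secret: str,
                          bucket_name: str, region: str, folder_name: str) -> str:
    """Assemble a storage_url for Alibaba Cloud OSS in the format shown above."""
    folder = folder_name.strip("/")
    if not folder:
        # Uploads to the bucket root are not supported, so a folder is required.
        raise ValueError("folder_name is required: uploads to the bucket root are not supported")
    return (
        f"https://{access_key_id}:{access_key_secret}"
        f"@{bucket_name}.oss-{region}.aliyuncs.com/{folder}"
    )


# Hypothetical values, for illustration only.
print(build_oss_storage_url("KEY_ID", "KEY_SECRET", "my-bucket", "eu-central-1", "results"))
```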

Here’s where you'll find the `BUCKET_NAME` and `oss-REGION` of your bucket:

<figure><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2F9u06pWqKmDdieoCxhA7z%2Fimage.png?alt=media&#x26;token=5e11fd2c-1ed6-4f64-b757-5b33e332f452" alt=""><figcaption></figcaption></figure>

### Creating the Access Key and Secret <a href="#creating-the-access-key-and-secret" id="creating-the-access-key-and-secret"></a>

To use the S3-compatible interface with Alibaba OSS, you must create the `ACCESS_KEY_ID` and `ACCESS_KEY_SECRET` as shown below. For more information, see [How to use Amazon S3 SDKs to access OSS](https://www.alibabacloud.com/help/en/oss/developer-reference/use-amazon-s3-sdks-to-access-oss?spm=a2c63.p38356.0.i1).

{% stepper %}
{% step %}
Go to the **AccessKey Account Menu**

<div align="left"><figure><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2F8PTfsezldRgtJxg1BIFP%2Fimage.png?alt=media&#x26;token=b6d5a855-af2e-45a8-9089-f271f460d675" alt="" width="422"><figcaption></figcaption></figure></div>
{% endstep %}

{% step %}
Log on to the **RAM console**

Access the [RAM console](https://ram.console.aliyun.com/) by using an **Alibaba Cloud account** or a **RAM user** who has administrative rights.
{% endstep %}

{% step %}
Go to **Identities** → **Users** in the left-side navigation pane
{% endstep %}

{% step %}
Select **Create User** and use the **RAM User AccessKey:**

<div align="left"><figure><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FEttslEfZDeCk3xj2JM2v%2Fimage.png?alt=media&#x26;token=47c0a6ea-1c45-410d-a3ba-9eaf386d021f" alt=""><figcaption></figcaption></figure></div>

<div align="left"><figure><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FFPmvTj6XvjPRWEgBal9v%2Fimage.png?alt=media&#x26;token=75012ded-20e2-4b0b-9cf6-9f97c0ecd695" alt=""><figcaption></figcaption></figure></div>
{% endstep %}

{% step %}
**Grant permissions to the RAM user**

The newly created RAM user has no permissions. You must grant **AliyunOSSFullAccess** permissions to the RAM user. Then, the RAM user can access the required Alibaba Cloud resources. For more information, see [Grant permissions to RAM users](https://www.alibabacloud.com/help/en/ram/user-guide/grant-permissions-to-the-ram-user#task-187800).

<div align="left"><figure><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2F5QWg84VYUgbasaDa2JVH%2Fimage.png?alt=media&#x26;token=ea29159b-2b43-4dec-bf97-5d11ecd1e21e" alt=""><figcaption></figcaption></figure></div>
{% endstep %}

{% step %}
Get your **AccessKey ID** and **AccessKey Secret**

Once the permissions are granted, return to the **Authentication** section and, in the **Access Key** section, select **Create AccessKey**. Choose to create an Access Key for a **Third-Party service**. You'll then see an `ACCESS_KEY_ID` and `ACCESS_KEY_SECRET` that you can use in your requests.
{% endstep %}
{% endstepper %}

### Alibaba OSS Rate Limits <a href="#alibaba-oss-ratelimits" id="alibaba-oss-ratelimits"></a>

When doing concurrent uploads to Alibaba OSS, it's possible to hit their account/bucket rate limits, and the uploads will start timing out with the following error:

<div align="left"><figure><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FkTaFNVnlGImMKuAMGXso%2Fimage.png?alt=media&#x26;token=85b4ca21-c9dc-4489-bc91-da7e9ffa22e7" alt="" width="563"><figcaption></figcaption></figure></div>

In this case, please contact Alibaba OSS support to increase your OSS rate limits.

## BytePlus TOS

You can upload scraped results directly to a BytePlus Torch Object Storage (TOS) bucket.

For a successful connection, you will need:

* A correctly configured TOS bucket.
* Your access key and secret key.
* An S3-compatible endpoint.

You can find a list of all available S3 endpoints in the official BytePlus [documentation](https://docs.byteplus.com/en/docs/tos/docs-region-and-endpoint).

### Example

The following payload will scrape `https://example.com` and upload the results to your TOS bucket.

```json
{
    "source": "universal",
    "query": "https://example.com",
    "storage_type": "tos",
    "storage_url": "https://access_key:secret_key@endpoint/bucket_name/path"
}
```

### Parameters <a href="#docs-internal-guid-dbfc45c7-7fff-9754-b71a-bacb24e2ac54" id="docs-internal-guid-dbfc45c7-7fff-9754-b71a-bacb24e2ac54"></a>

<table><thead><tr><th width="166">Parameter</th><th width="165">Available values</th><th>Description</th></tr></thead><tbody><tr><td><code>storage_type</code></td><td><code>tos</code></td><td>Specifies BytePlus TOS as the storage provider.</td></tr><tr><td><code>storage_url</code></td><td>URL string</td><td>Authenticated URL to your TOS bucket (see format below).</td></tr></tbody></table>

### Storage URL Format

The `storage_url` must be constructed using your TOS credentials and bucket details.

```bash
https://access_key:secret_key@endpoint/bucket_name/path
```

| Component     | Description                                                            |
| ------------- | ---------------------------------------------------------------------- |
| `access_key`  | Your BytePlus access key ID.                                           |
| `secret_key`  | Your BytePlus secret access key.                                       |
| `endpoint`    | The region-specific endpoint (e.g., `tos-cn-hongkong.bytepluses.com`). |
| `bucket_name` | Destination bucket name.                                               |
| `path`        | *(Optional)* Folder path inside the bucket.                            |

{% hint style="warning" %}
If your Access Key or Secret Key contains special characters (such as `/`, `+`, or `=`), they **must be URL-encoded** before constructing the string.
{% endhint %}
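A minimal sketch of the URL construction, including the URL-encoding step, might look like this in Python. The function name and the sample credentials are hypothetical; `urllib.parse.quote` with `safe=""` is used so that `/` is percent-encoded along with `+` and `=`.

```python
from urllib.parse import quote


def build_tos_storage_url(access_key: str, secret_key: str, endpoint: str,
                          bucket_name: str, path: str = "") -> str:
    """Assemble a storage_url for BytePlus TOS with URL-encoded credentials."""
    # By default quote() leaves "/" untouched; safe="" encodes it as well.
    encoded_key = quote(access_key, safe="")
    encoded_secret = quote(secret_key, safe="")
    url = f"https://{encoded_key}:{encoded_secret}@{endpoint}/{bucket_name}"
    # The folder path is optional, so append it only when one is given.
    folder = path.strip("/")
    if folder:
        url += f"/{folder}"
    return url


# Hypothetical credentials containing characters that need encoding.
print(build_tos_storage_url("AKID", "ab/c+d=", "tos-cn-hongkong.bytepluses.com",
                            "my-bucket", "results"))
```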

### Output File Naming

Oxylabs automatically generates filenames for the uploaded objects based on the job details:

* **HTML/Content:** `{query_id}_{timestamp}.html`
* **Parsed Data:** `{query_id}_results.json`

Files will be accessible in your bucket at: `tos://{bucket_name}/{path}/{filename}`

## Other S3-compatible storage

If you'd like to get your results delivered to an S3-compatible storage location, you'll have to include your bucket's `ACCESS_KEY:SECRET` auth string in the `storage_url` value in the payload:

```json
{
    "source": "universal",
    "url": "https://example.com",
    "storage_type": "s3_compatible",
    "storage_url": "https://ACCESS_KEY:SECRET@s3.oxylabs.io/my-videos"
}
```
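Because this `storage_url` embeds your credentials, you may want to mask them before writing the value to logs. The helper below is a hypothetical sketch of one way to do that with Python's standard library; it is not part of the API.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_storage_url(storage_url: str) -> str:
    """Return storage_url with any embedded credentials masked."""
    parts = urlsplit(storage_url)
    if parts.username is None:
        # Plain bucket names or URLs without credentials are returned unchanged.
        return storage_url
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    masked_netloc = f"***:***@{host}"
    return urlunsplit((parts.scheme, masked_netloc, parts.path, parts.query, parts.fragment))


print(redact_storage_url("https://ACCESS_KEY:SECRET@s3.oxylabs.io/my-videos"))
```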


