# Cloud Storage

Web Scraper API job results are stored in our storage. You can retrieve them by sending a `GET` request to the `/results` endpoint.

As an alternative, we can upload the results to your cloud storage. This way, you don't have to make extra requests to fetch results – everything goes directly to your storage bucket.

{% hint style="info" %}
Cloud storage integration works only with the [**Push-Pull**](https://developers.oxylabs.io/scraping-solutions/web-scraper-api/integration-methods/push-pull) integration method.
{% endhint %}

Currently, we support these cloud storage services:

* [**Google Cloud Storage**](#google-cloud-storage)
* [**Amazon S3**](#amazon-s3)
* [**Alibaba Cloud Object Storage Service (OSS)**](#alibaba-cloud-object-storage-service-oss)
* [**BytePlus Torch Object Storage (TOS)**](#byteplus-tos)
* [**Other S3-compatible storage**](#other-s3-compatible-storage)

If you'd like to use a different type of storage, please contact your account manager to discuss the feature delivery timeline.

The upload path looks like this: `YOUR_BUCKET_NAME/job_ID.json`. You'll find the job ID in the response that you receive from us after submitting a job.
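As a minimal sketch of the path rule above, the helper below derives the object path you would expect to see in your bucket once a job finishes. The function name `expected_upload_path` and the sample bucket name and job ID are hypothetical and used only for illustration.

```python
def expected_upload_path(bucket_name: str, job_id: str) -> str:
    """Return the object path in the form YOUR_BUCKET_NAME/job_ID.json."""
    if not bucket_name or not job_id:
        raise ValueError("bucket_name and job_id are both required")
    # Strip stray slashes so the two parts join with exactly one separator.
    return f"{bucket_name.strip('/')}/{job_id}.json"


# Placeholder values, not a real bucket or job ID.
print(expected_upload_path("my-bucket", "1234567890"))
```

This can be handy when you poll your bucket for a specific object instead of listing everything in it.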

#### Input

<table><thead><tr><th>Parameter</th><th width="187.33333333333331">Description</th><th>Valid values</th></tr></thead><tbody><tr><td><code>storage_type</code></td><td>Your cloud storage type.</td><td><p><code>gcs</code> (Google Cloud Storage);</p><p><code>s3</code> (AWS S3);</p><p><code>tos</code> (BytePlus TOS);</p><p><code>s3_compatible</code> (any S3-compatible storage).</p></td></tr><tr><td><code>storage_url</code></td><td>Your cloud storage bucket name / URL.</td><td><ul><li>Any <code>s3</code> or <code>gcs</code> bucket name;</li><li>Any <code>tos</code> or <code>s3_compatible</code> storage URL.</li></ul></td></tr></tbody></table>
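The two parameters in the table can be attached to any job payload. The sketch below shows one way to do that in Python while rejecting an unknown `storage_type` early. The helper name `with_storage` is an assumption made for this example and is not part of the API.

```python
# Values taken from the "Valid values" column of the table above.
VALID_STORAGE_TYPES = {"gcs", "s3", "tos", "s3_compatible"}


def with_storage(payload: dict, storage_type: str, storage_url: str) -> dict:
    """Return a copy of payload with the cloud storage parameters added."""
    if storage_type not in VALID_STORAGE_TYPES:
        raise ValueError(f"unsupported storage_type: {storage_type!r}")
    if not storage_url:
        raise ValueError("storage_url must not be empty")
    # Build a new dict so the caller's payload is left untouched.
    return {**payload, "storage_type": storage_type, "storage_url": storage_url}


job = with_storage(
    {"source": "universal", "query": "https://example.com"},
    storage_type="gcs",
    storage_url="bucket_name/path",
)
print(job)
```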

## **Google Cloud Storage**

The payload below makes Web Scraper API scrape `https://example.com` and put the result on a Google Cloud Storage bucket.

```json
{
    "source": "universal",
    "query": "https://example.com",
    "storage_type": "gcs",
    "storage_url": "bucket_name/path"
}
```
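The payload above is submitted like any other Push-Pull job. The sketch below only builds the HTTP request with the standard library and does not send it. The endpoint `https://data.oxylabs.io/v1/queries`, the use of HTTP Basic authentication, and the helper name `build_submit_request` are assumptions that this page does not state, so check them against the Push-Pull documentation before relying on them.

```python
import base64
import json
import urllib.request

# Assumption: Push-Pull submission endpoint; verify in the Push-Pull docs.
SUBMIT_URL = "https://data.oxylabs.io/v1/queries"


def build_submit_request(payload: dict, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the job payload."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        SUBMIT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


payload = {
    "source": "universal",
    "query": "https://example.com",
    "storage_type": "gcs",
    "storage_url": "bucket_name/path",
}
request = build_submit_request(payload, "YOUR_USERNAME", "YOUR_PASSWORD")
print(request.get_method(), request.full_url)
# To actually submit the job, pass the request to urllib.request.urlopen(request).
```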

To get your job results uploaded to your Google Cloud Storage bucket, please set up special permissions for our service as shown below:

{% stepper %}
{% step %}
**Create a custom role**

<div align="left"><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FpMD51jtIU5SRROKn44uU%2Fgcs-1.png?alt=media&#x26;token=11a7b4bd-6065-4cd9-a54b-f78ba0950207" alt=""></div>
{% endstep %}

{% step %}
**Add `storage.objects.create` permission**

<div align="left"><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FAGQ77gWpG52CENqdPFH6%2Fgcs-2.png?alt=media&#x26;token=dd9c9cdb-e314-4d7f-b44c-d24ade92ba03" alt=""></div>
{% endstep %}

{% step %}
**Assign it to Oxylabs**

In the **New members** field, enter the following **Oxylabs service account email**:

```
oxyserps-storage@oxyserps-storage.iam.gserviceaccount.com
```

<div align="left"><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FSPPnxlcOgf9M7oNgFa6p%2Fgcs-3.png?alt=media&#x26;token=c37c98fb-6465-4137-a1d6-6d13883036df" alt=""></div>
{% endstep %}
{% endstepper %}

## **Amazon S3**

The payload below makes Web Scraper API scrape `https://example.com` and put the result on an Amazon S3 bucket.

```json
{
    "source": "universal",
    "query": "https://example.com",
    "storage_type": "s3",
    "storage_url": "bucket_name/path"
}
```

To get your job results uploaded to your Amazon S3 bucket, please set up access permissions for our service. To do that, go to [**https://s3.console.aws.amazon.com/**](https://s3.console.aws.amazon.com/) → **`S3`** → **`Storage`** → **`Bucket Name`** (if you don't have one, create a new one) → **`Permissions`** → **`Bucket Policy`**.

<div align="left"><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2Fys4s7QoGvRxiqUF1HLMJ%2Fs3_bucket_policy.png?alt=media&#x26;token=bb48a995-8623-4358-bee6-b5e358874591" alt=""></div>

You can find the bucket policy attached below or in the code sample area.

{% file src="<https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FDk6z5zm41f4Z0YMlHjmp%2Fs3_bucket_policy.json?alt=media&token=831c33a5-86c3-4627-88d9-b6dbc9bd3002>" %}
S3 bucket policy
{% endfile %}

Don't forget to replace `YOUR_BUCKET_NAME` with the name of your bucket. This policy allows us to write to your bucket, give you access to the uploaded files, and determine the bucket's location.

```json
{
    "Version": "2012-10-17",
    "Id": "Policy1577442634787",
    "Statement": [
        {
            "Sid": "Stmt1577442633719",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::324311890426:user/oxylabs.s3.uploader"
            },
            "Action": "s3:GetBucketLocation",
            "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME"
        },
        {
            "Sid": "Stmt1577442633719",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::324311890426:user/oxylabs.s3.uploader"
            },
            "Action": [
                "s3:PutObject",
                "s3:PutObjectAcl"
            ],
            "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME/*"
        }
    ]
}
```
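If you manage several buckets, you may prefer to generate the policy instead of editing it by hand. The following sketch fills the bucket name into the same two statements. The function name `build_bucket_policy` is hypothetical, and the optional `Id` and `Sid` fields are left out here for brevity; the principal ARN and actions are copied from the policy above.

```python
import json

# Principal copied from the bucket policy shown above.
OXYLABS_UPLOADER_ARN = "arn:aws:iam::324311890426:user/oxylabs.s3.uploader"


def build_bucket_policy(bucket_name: str) -> dict:
    """Return the bucket policy with YOUR_BUCKET_NAME replaced by bucket_name."""
    if not bucket_name:
        raise ValueError("bucket_name is required")
    principal = {"AWS": OXYLABS_UPLOADER_ARN}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Lets the uploader determine the bucket's location.
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:GetBucketLocation",
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {
                # Lets the uploader write objects and set their ACLs.
                "Effect": "Allow",
                "Principal": principal,
                "Action": ["s3:PutObject", "s3:PutObjectAcl"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


print(json.dumps(build_bucket_policy("my-bucket"), indent=4))
```

The printed JSON can be pasted into the **Bucket Policy** editor.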

## Alibaba Cloud Object Storage Service (OSS)

The payload below makes Web Scraper API scrape `https://example.com` and put the result on an Alibaba Cloud OSS bucket.

```json
{
    "source": "universal",
    "query": "https://example.com",
    "storage_type": "s3_compatible",
    "storage_url": "https://ACCESS_KEY_ID:ACCESS_KEY_SECRET@BUCKET_NAME.oss-REGION.aliyuncs.com/FOLDER_NAME"
}
```

### Forming the Storage URL <a href="#forming-the-storage-url" id="forming-the-storage-url"></a>

Storage URL format:

```http
https://ACCESS_KEY_ID:ACCESS_KEY_SECRET@BUCKET_NAME.oss-REGION.aliyuncs.com/FOLDER_NAME
```

{% hint style="warning" %}
Currently, **we cannot upload to the root bucket**. Please provide a specific folder name for your uploads.
{% endhint %}

Here’s where you'll find the `BUCKET_NAME` and `oss-REGION` of your bucket:

<figure><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2F9u06pWqKmDdieoCxhA7z%2Fimage.png?alt=media&#x26;token=5e11fd2c-1ed6-4f64-b757-5b33e332f452" alt=""><figcaption></figcaption></figure>
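The storage URL format above can be assembled programmatically. This is a minimal sketch: the helper name `build_oss_storage_url` and the sample values are hypothetical, and percent-encoding the credentials is an assumption based on general URL rules rather than something this section states. The folder check reflects the restriction on uploading to the bucket root.

```python
from urllib.parse import quote


def build_oss_storage_url(
    access_key_id: str,
    access_key_secret: str,
    bucket_name: str,
    region: str,
    folder_name: str,
) -> str:
    """Build an Alibaba OSS storage_url in the documented format."""
    folder = folder_name.strip("/")
    if not folder:
        # Uploads to the bucket root are not supported, so a folder is mandatory.
        raise ValueError("folder_name is required")
    # Assumption: encode credentials so reserved characters cannot break the URL.
    key = quote(access_key_id, safe="")
    secret = quote(access_key_secret, safe="")
    return f"https://{key}:{secret}@{bucket_name}.oss-{region}.aliyuncs.com/{folder}"


print(
    build_oss_storage_url(
        "ACCESS_KEY_ID", "ACCESS_KEY_SECRET", "my-bucket", "eu-central-1", "results"
    )
)
```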

### Creating the Access Key and Secret <a href="#creating-the-access-key-and-secret" id="creating-the-access-key-and-secret"></a>

To use the S3-compatible interface with Alibaba OSS, you must create the `ACCESS_KEY_ID` and `ACCESS_KEY_SECRET` as shown below. For more information, see [How to use Amazon S3 SDKs to access OSS](https://www.alibabacloud.com/help/en/oss/developer-reference/use-amazon-s3-sdks-to-access-oss?spm=a2c63.p38356.0.i1).

{% stepper %}
{% step %}
Go to the **AccessKey Account Menu**

<div align="left"><figure><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2F8PTfsezldRgtJxg1BIFP%2Fimage.png?alt=media&#x26;token=b6d5a855-af2e-45a8-9089-f271f460d675" alt="" width="422"><figcaption></figcaption></figure></div>
{% endstep %}

{% step %}
Log on to the **RAM console**

Access the [RAM console](https://ram.console.aliyun.com/) by using an **Alibaba Cloud account** or a **RAM user** who has administrative rights.
{% endstep %}

{% step %}
Go to **Identities** → **Users** in the left-side navigation pane
{% endstep %}

{% step %}
Select **Create User** and use the **RAM User AccessKey:**

<div align="left"><figure><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FEttslEfZDeCk3xj2JM2v%2Fimage.png?alt=media&#x26;token=47c0a6ea-1c45-410d-a3ba-9eaf386d021f" alt=""><figcaption></figcaption></figure></div>

<div align="left"><figure><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FFPmvTj6XvjPRWEgBal9v%2Fimage.png?alt=media&#x26;token=75012ded-20e2-4b0b-9cf6-9f97c0ecd695" alt=""><figcaption></figcaption></figure></div>
{% endstep %}

{% step %}
**Grant permissions to the RAM user**

The newly created RAM user has no permissions. You must grant **AliyunOSSFullAccess** permissions to the RAM user. Then, the RAM user can access the required Alibaba Cloud resources. For more information, see [Grant permissions to RAM users](https://www.alibabacloud.com/help/en/ram/user-guide/grant-permissions-to-the-ram-user#task-187800).

<div align="left"><figure><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2F5QWg84VYUgbasaDa2JVH%2Fimage.png?alt=media&#x26;token=ea29159b-2b43-4dec-bf97-5d11ecd1e21e" alt=""><figcaption></figcaption></figure></div>
{% endstep %}

{% step %}
Get your **AccessKey ID** and **AccessKey Secret**

Once the permissions are granted, return to the **Authentication** section and, in the **Access Key** section, select **Create AccessKey**. Choose to create an Access Key for a **Third-Party service**. You'll then see an `ACCESS_KEY_ID` and `ACCESS_KEY_SECRET`, which you can use in your requests.
{% endstep %}
{% endstepper %}

### Alibaba OSS Rate Limits <a href="#alibaba-oss-ratelimits" id="alibaba-oss-ratelimits"></a>

When doing concurrent uploads to Alibaba OSS, it's possible to hit their account/bucket rate limits, and the uploads will start timing out with the following error:

<div align="left"><figure><img src="https://63892162-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FzrXw45naRpCZ0Ku9AjY1%2Fuploads%2FkTaFNVnlGImMKuAMGXso%2Fimage.png?alt=media&#x26;token=85b4ca21-c9dc-4489-bc91-da7e9ffa22e7" alt="" width="563"><figcaption></figcaption></figure></div>

In this case, please contact Alibaba OSS support to increase your OSS rate limits.

## BytePlus TOS

You can upload scraped results directly to a BytePlus Torch Object Storage (TOS) bucket. Please note that you must have your **bucket set up correctly** and have both your **access key** and **secret key** available for cloud storage access.

The example payload below makes Web Scraper API scrape `https://example.com` and put the result on a BytePlus TOS bucket.

```json
{
    "source": "universal",
    "query": "https://example.com",
    "storage_type": "tos",
    "storage_url": "https://access_key:secret_key@endpoint/bucket_name/path"
}
```

### Parameters <a href="#docs-internal-guid-dbfc45c7-7fff-9754-b71a-bacb24e2ac54" id="docs-internal-guid-dbfc45c7-7fff-9754-b71a-bacb24e2ac54"></a>

<table><thead><tr><th width="166">Parameter</th><th width="165">Value</th><th>Description</th></tr></thead><tbody><tr><td><code>storage_type</code></td><td>tos</td><td>Specifies BytePlus TOS as the storage provider.</td></tr><tr><td><code>storage_url</code></td><td>String (URL)</td><td>Authenticated URL to your TOS bucket (see format below).</td></tr></tbody></table>

### Storage URL Format

The `storage_url` must be constructed using your TOS credentials and bucket details.

```bash
https://access_key:secret_key@endpoint/bucket_name/path
```

| Component     | Description                                                            |
| ------------- | ---------------------------------------------------------------------- |
| `access_key`  | Your BytePlus access key ID.                                           |
| `secret_key`  | Your BytePlus secret access key.                                       |
| `endpoint`    | The region-specific endpoint (e.g., `tos-cn-hongkong.bytepluses.com`). |
| `bucket_name` | Destination bucket name.                                               |
| `path`        | *(Optional)* Bucket's specific folder path.                            |

{% hint style="warning" %}
If your Access Key or Secret Key contains special characters (such as `/`, `+`, or `=`), they **must be URL-encoded** before constructing the string.
{% endhint %}
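To illustrate the encoding requirement, the sketch below builds a TOS `storage_url` and percent-encodes both keys with Python's `urllib.parse.quote`. The helper name `build_tos_storage_url` and the sample credentials are made up for this example.

```python
from urllib.parse import quote


def build_tos_storage_url(
    access_key: str,
    secret_key: str,
    endpoint: str,
    bucket_name: str,
    path: str = "",
) -> str:
    """Build https://access_key:secret_key@endpoint/bucket_name/path."""
    # safe="" makes quote() encode "/" as well, along with "+" and "=".
    key = quote(access_key, safe="")
    secret = quote(secret_key, safe="")
    url = f"https://{key}:{secret}@{endpoint}/{bucket_name}"
    path = path.strip("/")
    # The path component is optional, so append it only when present.
    return f"{url}/{path}" if path else url


# Made-up credentials containing the special characters named above.
print(
    build_tos_storage_url(
        "AK/1", "s+k=", "tos-cn-hongkong.bytepluses.com", "my-bucket", "results"
    )
)
```

Passing `safe=""` matters here because `quote` leaves `/` unencoded by default.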

### Output File Naming

Oxylabs automatically generates filenames for the uploaded objects based on the job details:

* **HTML/Content:** `{query_id}_{timestamp}.html`
* **Parsed Data:** `{query_id}_results.json`

Files will be accessible in your bucket at: `tos://{bucket_name}/{path}/{filename}`
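The naming rules above can be turned into small helpers for locating uploaded objects. This is only a sketch: the function names are hypothetical, and because the timestamp format is not specified here, the HTML check matches on the `{query_id}_` prefix and `.html` suffix instead of reconstructing the full filename.

```python
def parsed_results_uri(bucket_name: str, path: str, query_id: str) -> str:
    """Return the tos:// URI expected for a job's parsed results."""
    filename = f"{query_id}_results.json"
    # Skip the path segment when no folder path was configured.
    parts = [bucket_name, path.strip("/"), filename]
    return "tos://" + "/".join(part for part in parts if part)


def is_html_result(filename: str, query_id: str) -> bool:
    """Check whether filename looks like {query_id}_{timestamp}.html."""
    return filename.startswith(f"{query_id}_") and filename.endswith(".html")


print(parsed_results_uri("my-bucket", "results", "1234567890"))
print(is_html_result("1234567890_20240101.html", "1234567890"))
```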

## Other S3-compatible storage

If you'd like to get your results delivered to an S3-compatible storage location, you'll have to include your bucket's `ACCESS_KEY:SECRET` auth string in the `storage_url` value in the payload:

```json
{
    "source": "universal",
    "url": "https://example.com",
    "storage_type": "s3_compatible",
    "storage_url": "https://ACCESS_KEY:SECRET@s3.oxylabs.io/my-videos"
}
```
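Because the `storage_url` for S3-compatible storage embeds your credentials, it is worth masking them before the payload is written to logs. The helper below is a minimal sketch using `urllib.parse`; the name `redact_storage_url` is an assumption made for this example and is not part of the API.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_storage_url(storage_url: str) -> str:
    """Return storage_url with any embedded credentials replaced by ***."""
    parts = urlsplit(storage_url)
    if parts.username is None and parts.password is None:
        # Plain bucket names such as "bucket_name/path" carry no credentials.
        return storage_url
    # Note: urlsplit() reports the hostname in lowercase.
    host = parts.hostname or ""
    if parts.port:
        host = f"{host}:{parts.port}"
    netloc = f"***:***@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(redact_storage_url("https://ACCESS_KEY:SECRET@s3.oxylabs.io/my-videos"))
```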
