# Octoparse

[**Octoparse**](https://www.octoparse.com/) is a data extraction tool. It lets you scrape public data without coding and handles most scraping challenges through automatic IP rotation and extended session times.

To integrate Octoparse with Oxylabs [**Dedicated ISP Proxies**](https://oxylabs.io/products/dedicated-isp-proxies) purchased through self-service, follow these steps:

**Step 1.** [**Download**](https://www.octoparse.com/download/mac), install, and open Octoparse.

**Step 2.** Click the **+New** button in the top-left corner, then select **Custom Task**.

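**Step 3.** Enter the URL of the web page you want to extract data from in the **URL input box**, then click the **Save** button. We'll use the [**Oxylabs scraping sandbox**](https://sandbox.oxylabs.io/products/category/pc) as an example.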

**Step 4.** Once the selected URL has loaded, go to **Task Settings** and select **Anti-blocking**.

**Step 5.** Check **Access websites via proxies**, enable **Use my own proxies**, then click **Configure**.

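**Step 6.** Clicking the **Configure** button opens a pop-up window. Specify the proxy details in the following format: `IP/host:port:username:password`. For **Dedicated ISP Proxies**, for example, you can use:

* **IP/host**: `disp.oxylabs.io`
* **Port**: `8001`
* **Username**: `user-USERNAME`
* **Password**: `PASSWORD`

{% hint style="warning" %}
**Note:** Don't forget to add the `user-` prefix to your username.
{% endhint %}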
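As a rough illustration of the entry format in Step 6, the sketch below assembles the `IP/host:port:username:password` string and adds the `user-` prefix when it is missing. The helper name `build_proxy_entry` is hypothetical and not part of Octoparse or Oxylabs; `USERNAME` and `PASSWORD` are placeholders for your own credentials.

```python
def build_proxy_entry(host: str, port: int, username: str, password: str) -> str:
    """Return a proxy entry in the IP/host:port:username:password format.

    Hypothetical helper for illustration only.
    """
    # The guide requires the "user-" prefix; add it only if it is missing.
    if not username.startswith("user-"):
        username = f"user-{username}"
    return f"{host}:{port}:{username}:{password}"


# Placeholder credentials; replace them with your own.
entry = build_proxy_entry("disp.oxylabs.io", 8001, "USERNAME", "PASSWORD")
print(entry)  # disp.oxylabs.io:8001:user-USERNAME:PASSWORD
```

The resulting string is what you would paste into the configuration window.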

{% hint style="success" %}
The port number determines which IP address from your [proxy list](/products/cn/dai-li/dedicated-isp-proxies/self-service/proxy-list.md) is used. Use port `8000` for automatic [proxy IP rotation](/products/cn/dai-li/dedicated-isp-proxies/self-service/proxy-rotation.md).
{% endhint %}

**Step 7.** Set the **Switch** interval depending on whether you're using a static IP or the Proxy Rotator.
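The choice between a static IP and rotation comes down to the port you enter. The following sketch encodes only what this guide states: port `8000` enables automatic rotation, and `8001` is the example static port from Step 6. The function `choose_port` is a hypothetical helper, and which other ports map to which IPs depends on your own proxy list.

```python
ROTATION_PORT = 8000        # automatic proxy IP rotation, per the hint above
EXAMPLE_STATIC_PORT = 8001  # example port from Step 6; yours may differ


def choose_port(use_rotation: bool, static_port: int = EXAMPLE_STATIC_PORT) -> int:
    """Pick the port to enter in Octoparse (hypothetical helper)."""
    if use_rotation:
        return ROTATION_PORT
    # A static setup must not accidentally point at the rotation port.
    if static_port == ROTATION_PORT:
        raise ValueError("port 8000 is reserved for rotation; pick a port from your proxy list")
    return static_port


print(choose_port(use_rotation=True))   # 8000
print(choose_port(use_rotation=False))  # 8001
```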

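**Step 8.** Save your changes by clicking the **Confirm** button, then click **Save**. The proxy is now set up.

### How to start scraping with Octoparse

**Step 1.** Select the target element you want to scrape (video game titles). To extract all elements of the same category, choose **Select all similar elements** and specify **Text**.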

**Step 2.** Set up pagination to scrape multiple pages. The site uses numbered pagination, so you need to select the **Next page button**.

**Step 3.** To automate pagination, select the exact button in the page layout that opens the next page: **Next**.

**Step 4.** Finish the scraping setup and press **▶Run**.

**Step 5.** Choose **Run on your device** and use **Standard Mode** to save the data as a file on your computer.

**Step 6.** Let the scraping process run until it finishes. It ends when it reaches the final product page or when you stop it manually.
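The stopping condition in Step 6 can be pictured as a simple loop. This is a conceptual sketch only, not how Octoparse is implemented: the `PAGES` data, the `scrape_all` function, and the `should_stop` callback are all hypothetical stand-ins for the site's pages, the run, and a manual stop.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical in-memory stand-in for a paginated site: each page lists
# some titles and the number of the next page (None on the last page).
PAGES: Dict[int, dict] = {
    1: {"titles": ["Game A", "Game B"], "next": 2},
    2: {"titles": ["Game C"], "next": 3},
    3: {"titles": ["Game D"], "next": None},
}


def scrape_all(
    pages: Dict[int, dict],
    start: int = 1,
    should_stop: Callable[[], bool] = lambda: False,
) -> List[str]:
    """Collect titles page by page until the last page or a manual stop."""
    collected: List[str] = []
    current: Optional[int] = start
    while current is not None and not should_stop():
        page = pages[current]
        collected.extend(page["titles"])
        current = page["next"]  # None means there is no next-page button
    return collected


print(scrape_all(PAGES))  # ['Game A', 'Game B', 'Game C', 'Game D']
```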

**Step 7.** Export the collected data and choose a file format.

Here's the final result in a spreadsheet.
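If you want to work with the exported file programmatically, a minimal sketch follows. It assumes you chose CSV as the file format and that the export has a column named `Title`; both are assumptions, so check the header row of your own file. The sample data is made up so the snippet runs on its own.

```python
import csv
import io
from typing import List, TextIO

# Made-up sample standing in for an exported file; the "Title" column
# name is an assumption and may differ in your export.
SAMPLE_EXPORT = "Title\nGame A\nGame B\n"


def read_titles(csv_file: TextIO, column: str = "Title") -> List[str]:
    """Return the non-empty values of one column from a CSV export."""
    reader = csv.DictReader(csv_file)
    return [row[column].strip() for row in reader if row.get(column)]


titles = read_titles(io.StringIO(SAMPLE_EXPORT))
print(titles)  # ['Game A', 'Game B']
```

To read a real export, pass an open file instead, for example `open("export.csv", newline="", encoding="utf-8")`, where `export.csv` is a placeholder file name.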

That's it: you're all set and can start focusing on your web scraping tasks with Octoparse.