Octoparse is a data extraction tool. It allows you to scrape public data without coding and bypass most anti-scraping mechanisms by enabling automatic IP rotation and extended session time.
To integrate Octoparse with Oxylabs Dedicated Datacenter Proxies, follow the simple steps below or watch this video tutorial:
Step 2. Create a new task by clicking the +New button in the top-left corner and choosing Custom Task.
Step 3. Type the URL of the webpage you intend to extract data from in the URL Input field and click the Save button. We'll use books.toscrape.com as an example.
Step 4. After your selected URL loads, click the top-right Settings button.
Step 5. Scroll down to the Anti-blocking Settings.
Step 6. Check the Access websites via proxies box. The Use my own proxies option and the Configure button will then appear.
Step 7. Click the Configure button to open a pop-up window. Octoparse only accepts proxies in the IP:PORT format. For Dedicated Datacenter Proxies, enter these details:
IP: a specific IP address (
In the case of Dedicated Datacenter Proxies, you will need to choose an IP address from the purchased list. Please refer to our documentation for more details.
If you’re using a
If you’re using whitelisted IPs:
Look at the example below.
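Since Octoparse expects each proxy as a single IP:PORT entry, it can help to prepare and validate the entries before pasting them in. The snippet below is a minimal sketch of that idea: the IP addresses come from a range reserved for documentation, and the port is an arbitrary placeholder, so replace both with the values from your own purchased list. The `format_proxy` helper is hypothetical and not part of Octoparse or Oxylabs tooling.

```python
import ipaddress


def format_proxy(ip: str, port: int) -> str:
    """Return a proxy entry in IP:PORT form after validating both parts."""
    # Raises ValueError if the string is not a valid IPv4 address.
    ipaddress.IPv4Address(ip)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return f"{ip}:{port}"


# Placeholder values (assumptions): 192.0.2.0/24 is reserved for
# documentation, and 8080 is not the port of any real proxy service.
purchased_ips = ["192.0.2.10", "192.0.2.11"]
entries = [format_proxy(ip, 8080) for ip in purchased_ips]

# One entry per line, ready to paste into the proxy list.
print("\n".join(entries))
```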
Step 8. Set up the Switch interval depending on whether you use a rotating or sticky session type.
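To see how the switch interval relates to rotating and sticky sessions, consider the rough model below. It is only an illustration of the concept, not Octoparse's internal logic: the `proxy_at` function and the sample values are assumptions. A short interval behaves like a rotating session because the proxy changes often, while a long interval behaves like a sticky session because the same proxy stays in use.

```python
def proxy_at(proxies, switch_interval_s, elapsed_s):
    """Return the proxy in use after `elapsed_s` seconds of a run.

    Proxies are used in list order, each for `switch_interval_s` seconds,
    wrapping around to the start when the list is exhausted.
    """
    if switch_interval_s <= 0:
        raise ValueError("switch interval must be positive")
    slot = int(elapsed_s // switch_interval_s)
    return proxies[slot % len(proxies)]


# Placeholder proxy entries (documentation-range IPs, arbitrary port).
proxies = ["192.0.2.10:8080", "192.0.2.11:8080", "192.0.2.12:8080"]

# Short interval: the proxy changes every 10 seconds (rotating-like).
rotating = [proxy_at(proxies, 10, t) for t in (0, 10, 20, 30)]

# Long interval: the same proxy is kept for 10 minutes (sticky-like).
sticky = [proxy_at(proxies, 600, t) for t in (0, 10, 20, 30)]

print(rotating)
print(sticky)
```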
Step 9. Save changes by clicking the Confirm button.
Step 10. To ensure the Octoparse integration was successful, check if there is a checkmark next to the Configure button in the Anti-blocking settings section.
Step 11. Click the Save button to return to the main screen of the page you’re scraping.
You've successfully set up Oxylabs' proxies with Octoparse.
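If you want to confirm that a proxy responds outside of Octoparse, a small script can send one request through it. The sketch below uses only the Python standard library and assumes a whitelisted IP, so no username or password is sent. The function names are hypothetical, and the proxy entry in the usage comment is a placeholder.

```python
import urllib.request


def build_proxy_opener(proxy: str) -> urllib.request.OpenerDirector:
    """Build an opener that routes HTTP and HTTPS traffic through `proxy`.

    `proxy` is an IP:PORT string. No credentials are attached, which
    assumes the machine's IP has been whitelisted for the proxy.
    """
    handler = urllib.request.ProxyHandler(
        {"http": f"http://{proxy}", "https": f"http://{proxy}"}
    )
    return urllib.request.build_opener(handler)


def fetch_status(proxy: str, url: str, timeout: float = 10.0) -> int:
    """Request `url` through the proxy and return the HTTP status code."""
    with build_proxy_opener(proxy).open(url, timeout=timeout) as response:
        return response.status


# Example call (not executed here, placeholder proxy entry):
# fetch_status("192.0.2.10:8080", "https://books.toscrape.com")
```

A status of 200 suggests the proxy is reachable and the target page loads through it. A timeout or connection error usually points to a wrong IP, a wrong port, or a missing whitelist entry.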
Below are some additional steps on how to start scraping:
Step 12. Click on the lightbulb, which will expand and give you choices on whether to paginate or add a page scroll.
Step 13. After you’ve made your choice, click on the Create Workflow button.
Step 14. You can now select the page element you’d like to extract data from. In our case, we’ll choose Mystery. Click on it and select Extract text of the selected element.
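For readers curious about what "extract text of the selected element" amounts to, the sketch below does the equivalent in code with Python's built-in HTML parser. The HTML snippet is a simplified stand-in written for this example, not the actual markup of books.toscrape.com, and the `LinkTextExtractor` class is hypothetical.

```python
from html.parser import HTMLParser


class LinkTextExtractor(HTMLParser):
    """Collect the visible text of every <a> element in a document."""

    def __init__(self):
        super().__init__()
        self._in_link = False
        self._parts = []
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_link = True
            self._parts = []

    def handle_data(self, data):
        # Only keep text that appears inside a link.
        if self._in_link:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            self.texts.append("".join(self._parts).strip())
            self._in_link = False


# Simplified sample markup (an assumption, not the real page source).
SAMPLE_HTML = """
<ul>
  <li><a href="#travel">
      Travel
  </a></li>
  <li><a href="#mystery">
      Mystery
  </a></li>
</ul>
"""

parser = LinkTextExtractor()
parser.feed(SAMPLE_HTML)
print(parser.texts)
```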
Step 15. Afterward, you’ll be presented with the pop-up below. At the top-right, click Save and then Run.
Step 16. A pop-up will appear with multiple choices. Choose whichever is most relevant for you (some are paid options) and continue. For example, we’ll pick Run on your device and Standard mode.
Step 17. A new page will open where the scraping process will begin. You can pause and resume it whenever you want.
Step 18. Since this is merely an example, we’ll stop here. Confirm to stop the run.
Step 19. Here, some statistics will be shown for your scraping task. You can choose to export data later or now.
Step 20. If you select Export Data, a final pop-up will appear, allowing you to choose a format for exporting the data.
Step 21. Pick whichever format is relevant for you.
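The choice of export format mostly depends on what will consume the data next. As a rough illustration, the sketch below writes the same rows as CSV and as JSON using the Python standard library. The sample rows and helper names are made up for this example, and CSV and JSON are used here as assumptions rather than as a list of the formats Octoparse offers.

```python
import csv
import io
import json

# Made-up sample rows standing in for scraped results.
rows = [{"category": "Mystery"}, {"category": "Travel"}]


def to_csv(records):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_json(records):
    """Serialize a list of dicts to indented JSON text."""
    return json.dumps(records, indent=2)


print(to_csv(rows))
print(to_json(rows))
```

CSV tends to suit spreadsheets and flat tables, while JSON keeps nested or typed values intact for further processing in code.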
That’s it – you are set up and ready to focus on your web scraping tasks with Octoparse.