
LangChain

Use the LangChain framework alongside the Oxylabs Web Scraper API to pull web data and feed it into LLM workflows: collect, process, and analyze in one pipeline.

The LangChain integration with the Oxylabs Web Scraper API enables you to collect and process web data through an LLM (Large Language Model) in the same workflow.

Overview

LangChain is a framework for building apps that use LLMs alongside tools, APIs, and web data. It supports both Python and JavaScript. Use it with the Oxylabs Web Scraper API to:

  • Scrape structured data without handling CAPTCHAs, IP blocks, or JS rendering

  • Process results with an LLM in the same pipeline

  • Build end-to-end workflows from extraction to AI-powered output

Getting started

Sign up for a free trial or purchase the product in the Oxylabs dashboard to create your API user credentials (USERNAME and PASSWORD).

This guide uses the Python programming language. Install the required libraries using pip:

pip install -qU langchain-oxylabs langchain-openai langgraph requests python-dotenv

Environment setup

Create a .env file in your project directory with your Oxylabs API user and OpenAI credentials:

OXYLABS_USERNAME=your-username
OXYLABS_PASSWORD=your-password
OPENAI_API_KEY=your-openai-key

Load these environment variables in your Python script:
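A minimal sketch using python-dotenv, which was installed above; the variable names simply mirror the .env keys:

import os

from dotenv import load_dotenv

# Read the .env file and make its values available via os.environ
load_dotenv()

OXYLABS_USERNAME = os.getenv("OXYLABS_USERNAME")
OXYLABS_PASSWORD = os.getenv("OXYLABS_PASSWORD")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")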

Integration methods

There are two primary ways to integrate Oxylabs Web Scraper API with LangChain:

Using langchain-oxylabs package

For Google search queries, use the dedicated langchain-oxylabs package, which provides a ready-to-use integration:
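A minimal sketch, assuming the OxylabsSearchRun tool and OxylabsSearchAPIWrapper classes exposed by langchain-oxylabs, combined with an OpenAI model through a LangGraph ReAct agent; the model name and prompt are placeholders:

import os

from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_oxylabs import OxylabsSearchAPIWrapper, OxylabsSearchRun
from langgraph.prebuilt import create_react_agent

load_dotenv()

# The wrapper is expected to pick up OXYLABS_USERNAME and OXYLABS_PASSWORD
# from the environment; credentials can also be passed explicitly.
search_tool = OxylabsSearchRun(wrapper=OxylabsSearchAPIWrapper())

# Let the LLM call the Google search tool and summarize the results
llm = ChatOpenAI(model="gpt-4o-mini")
agent = create_react_agent(llm, [search_tool])

response = agent.invoke(
    {"messages": [("human", "What are the latest developments in web scraping?")]}
)
print(response["messages"][-1].content)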

Using the Web Scraper API

To access websites beyond Google search, you can send requests directly to the Web Scraper API:
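A minimal sketch of a scrape_website helper that posts a job to the Web Scraper API realtime endpoint using the universal source; the helper name and the sandbox URL below are illustrative:

import os

import requests
from dotenv import load_dotenv

load_dotenv()


def scrape_website(url: str, source: str = "universal", **extra_params) -> dict:
    """Send a single scraping job to the Web Scraper API and return the JSON response."""
    payload = {"source": source, "url": url, **extra_params}
    response = requests.post(
        "https://realtime.oxylabs.io/v1/queries",
        auth=(os.getenv("OXYLABS_USERNAME"), os.getenv("OXYLABS_PASSWORD")),
        json=payload,
        timeout=180,
    )
    response.raise_for_status()
    return response.json()


# The scraped page content is returned under results[0]["content"]
data = scrape_website("https://sandbox.oxylabs.io/products")
print(data["results"][0]["content"][:500])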

Target-specific scrapers

Oxylabs provides specialized scrapers for various popular websites. Here are some examples of available sources:

Website | Source parameter | Required parameters
Google | google_search | query
Amazon | amazon_search | query, domain (optional)
Walmart | walmart_search | query
Target | target_search | query
Kroger | kroger_search | query, store_id
Staples | staples_search | query

To use a specific scraper, modify the payload in the scrape_website function:
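For example, switching the scrape_website payload to a query-based source such as amazon_search could look like the sketch below; the query, domain value, and parse flag are illustrative:

def scrape_amazon(query: str, domain: str = "com") -> dict:
    """Run an Amazon search job instead of scraping a raw URL."""
    payload = {
        "source": "amazon_search",  # target-specific scraper from the table above
        "query": query,             # search phrase instead of a URL
        "domain": domain,           # optional marketplace domain, e.g. "com", "de"
        "parse": True,              # request structured JSON results
    }
    response = requests.post(
        "https://realtime.oxylabs.io/v1/queries",
        auth=(os.getenv("OXYLABS_USERNAME"), os.getenv("OXYLABS_PASSWORD")),
        json=payload,
        timeout=180,
    )
    response.raise_for_status()
    return response.json()


results = scrape_amazon("wireless headphones")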

Advanced configuration

Handling dynamic content

The Web Scraper API can handle JavaScript rendering by adding the render parameter:
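For example, passing render="html" through the illustrative scrape_website helper above asks the API to execute JavaScript and return the rendered page; the URL is a placeholder:

# Execute JavaScript on the page before the HTML is returned
data = scrape_website("https://sandbox.oxylabs.io/products", render="html")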

Setting user agent type

You can specify different user agents to simulate different devices:
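For example, the user_agent_type parameter accepts values such as "desktop" and "mobile"; the sketch below reuses the scrape_website helper:

# Request the page as a mobile browser would see it
data = scrape_website("https://sandbox.oxylabs.io/products", user_agent_type="mobile")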

Using target-specific parameters

Many target-specific scrapers support additional parameters:
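For example, google_search accepts pagination and localization parameters in addition to query; the values below are illustrative:

payload = {
    "source": "google_search",
    "query": "best wireless headphones",
    "geo_location": "New York,New York,United States",  # localize the results
    "parse": True,       # return structured JSON instead of raw HTML
    "start_page": 1,
    "pages": 2,          # fetch the first two result pages
}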

Error handling

Implement proper error handling for production applications:
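A minimal sketch that wraps the scrape_website helper with retries, treating client errors as fatal and server errors and timeouts as retryable:

from typing import Optional

import requests


def scrape_with_retries(url: str, max_retries: int = 3) -> Optional[dict]:
    """Call scrape_website with basic retry logic and error reporting."""
    for attempt in range(1, max_retries + 1):
        try:
            return scrape_website(url)
        except requests.exceptions.HTTPError as error:
            # 4xx responses (bad credentials, malformed payload) will not succeed on retry
            if error.response is not None and error.response.status_code < 500:
                print(f"Request rejected: {error}")
                return None
            print(f"Server error on attempt {attempt}: {error}")
        except requests.exceptions.Timeout:
            print(f"Request timed out on attempt {attempt}")
        except requests.exceptions.RequestException as error:
            print(f"Network error on attempt {attempt}: {error}")
    return None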
