Documentation has been updated: see help center and changelog in one place.
⭐Explore
LogoLogo
Oxylabs dashboardContact usProduct
English
  • Documentation
  • Help center
  • Changelog
  • Overview
  • PROXIES
    • Integration Guides
      • Get IP Address for Integrations
      • Residential Proxies Guides
        • AdsPower
        • Android
        • ClonBrowser
        • Dolphin Anty
        • FoxyProxy
        • Ghost Browser
        • GoLogin
        • Helium Scraper
        • Incogniton
        • iOS
        • Kameleo
        • Lalicat Browser
        • MacOS
        • MoreLogin
        • MuLogin
        • Multilogin
        • Nstbrowser
        • Octoparse
        • Oxy® Proxy Extension for Chrome
        • ParseHub
        • Playwright
        • Proxifier
        • Puppeteer
        • Selenium
        • SEO Neo
        • SessionBox
        • Shadowrocket
        • Super Proxy
        • SwitchyOmega
        • Ubuntu
        • VMLogin
        • WebHarvy
        • Hidemyacc
      • ISP Proxies Guides
        • AdsPower
        • Android
        • Dolphin Anty
        • FoxyProxy
        • GoLogin
        • Incogniton
        • iOS
        • Lalicat Browser
        • MacOS
        • MoreLogin
        • MuLogin
        • Multilogin
        • Nstbrowser
        • Octoparse
        • Oxy® Proxy Extension for Chrome
        • Proxifier
        • SEO Neo
        • Shadowrocket
        • Sphere
        • Super Proxy
        • SwitchyOmega
        • Ubuntu
        • Hidemyacc
      • Mobile Proxies Guides
        • AdsPower
        • Android
        • ClonBrowser
        • Dolphin Anty
        • Ghost Browser
        • GoLogin
        • Helium Scraper
        • Incogniton
        • iOS
        • Kameleo
        • Lalicat Browser
        • MacOS
        • MoreLogin
        • MuLogin
        • Multilogin
        • Nstbrowser
        • Octoparse
        • Oxy® Proxy Extension for Chrome
        • ParseHub
        • Playwright
        • Proxifier
        • Puppeteer
        • Selenium
        • SEO Neo
        • SessionBox
        • Shadowrocket
        • SwitchyOmega
        • Ubuntu
        • VMLogin
        • WebHarvy
      • Dedicated Datacenter Proxies Guides
        • Enterprise
          • Dolphin Anty
          • FoxyProxy
          • GoLogin
          • Lalicat Browser
          • MoreLogin
          • MuLogin
          • Nstbrowser
          • Octoparse
          • Oxy® Proxy Extension for Chrome
          • Proxifier
          • SEO Neo
          • Shadowrocket
          • Sphere
          • Super Proxy
          • SwitchyOmega
          • Ubuntu
          • Hidemyacc
        • Self-Service
          • Android
          • Dolphin Anty
          • FoxyProxy
          • GoLogin
          • iOS
          • Lalicat Browser
          • MacOS
          • MoreLogin
          • MuLogin
          • Nstbrowser
          • Octoparse
          • Oxy® Proxy Extension for Chrome
          • Proxifier
          • SEO Neo
          • Shadowrocket
          • Sphere
          • Super Proxy
          • SwitchyOmega
          • Ubuntu
          • Hidemyacc
      • Datacenter Proxies Guides
        • AdsPower
        • Android
        • Dolphin Anty
        • FoxyProxy
        • GoLogin
        • iOS
        • Lalicat Browser
        • MacOS
        • MoreLogin
        • MuLogin
        • Nstbrowser
        • Octoparse
        • Oxy® Proxy Extension for Chrome
        • Proxifier
        • SEO Neo
        • Shadowrocket
        • Super Proxy
        • SwitchyOmega
        • Ubuntu
        • Hidemyacc
    • Residential Proxies
      • Getting Started
      • Making Requests
        • Entry Node for China
      • Location Settings
        • Country
        • City
        • State
        • Continent
        • ZIP/Postal code
        • Coordinates
        • ASN Targeting
      • Session Control
        • Sticky Proxy Entry Nodes
      • Protocols
      • Whitelisting IPs
        • Requests with Whitelisted IPs
      • Endpoint Generator
      • Restricted Targets
      • Public API
      • Response Codes
    • ISP Proxies
      • Making Requests
      • Proxy List
      • Proxy Rotation
      • Location Settings
      • Protocols
      • Whitelisting IPs
      • Response Codes
      • Restricted Targets
      • Fair usage policy
    • Mobile Proxies
      • Getting Started
      • Making Requests
        • Entry Node for China
      • Location Settings
        • Country
        • City
        • State
        • Continent
        • Coordinates
        • ASN Targeting
      • Session Control
        • Sticky Proxy Entry Nodes
      • Protocols
      • Whitelisting IPs
      • Endpoint Generator
      • Restricted Targets
      • Public API
      • Response Codes
    • Datacenter Proxies
      • Proxy List
      • IP Control
      • Select Country
      • Protocols
      • Whitelisting
      • Response Codes
      • Restricted Targets
      • Fair usage policy
      • Free Datacenter IPs
    • Dedicated Datacenter Proxies
      • Enterprise
        • Getting Started
        • Proxy List
        • Making Requests
        • Protocols
        • Whitelisting IPs
          • Dashboard
          • RESTful
            • Getting Whitelisted IPs List
            • Adding a Whitelisted IP
            • Removing a Whitelisted IP
            • Saving Changes (5min Cooldown)
        • Datacenter Proxy API
        • Proxy Rotator - Optional
        • Response Codes
      • Self-Service
        • Getting Started
        • Making Requests
        • Proxy List
        • Proxy Rotation
        • Location Settings
        • Protocols
        • Whitelisting IPs
        • Response Codes
        • Restricted Targets
        • Fair usage policy
    • Dedicated ISP Proxies
      • Getting Started
      • Proxy List
      • Making Requests
      • Protocols
      • Whitelisting IPs (RESTful)
        • Getting Whitelisted IPs List
        • Adding a Whitelisted IP
        • Removing a Whitelisted IP
        • Saving Changes (5min Cooldown)
      • Proxy API
      • Proxy Rotator - Optional
      • Response Codes
  • Advanced proxy solutions
    • Web Unblocker
      • Getting Started
      • Making Requests
        • Session
        • Geo-location
        • Headers & Cookies
        • Custom status code
        • POST requests
      • Headless Browser
        • JavaScript rendering
        • Browser instructions (Beta)
          • List of instructions
      • Sample Response
      • Response Codes
      • Rate Limits
      • Migration Guides
        • From Bright Data Web Unlocker
      • Usage Statistics
      • Billing Information
  • VIDEO DATA
    • High-Bandwidth Proxies
      • YouTube Downloader (yt_dlp) integration
  • Video Data API
  • Scraping Solutions
    • Web Scraper API
      • Integration Methods
        • Realtime
        • Push-Pull
        • Proxy Endpoint
      • Features
        • Localization
          • Proxy Location
          • SERP Localization
          • E-Commerce Localization
          • Domain, Locale, Results Language
        • JS Rendering & Browser Control
          • JavaScript Rendering
          • Browser Instructions
            • List of instructions
          • Capturing network requests (Fetch/XHR)
        • Result Processing & Storage
          • Dedicated Parsers
          • Custom Parser
            • Getting started
            • Parsing instruction examples
            • List of functions
              • Function examples
          • Download Images
          • Cloud Storage
        • HTTP Context & Job Management
          • Headers, Cookies, Method
          • User Agent Type
          • Client Notes
        • Scheduler
      • Solutions for AI Workflows
        • Model Context Protocol (MCP)
        • LangChain
        • LlamaIndex
      • Targets
        • Google
          • Search
            • Web Search
            • AI Overviews
            • Image Search
            • News Search
            • Local Search
            • Reverse Image Search
            • Google Suggest
          • Ads Max
          • Shopping
            • Shopping Product
            • Shopping Search
            • Shopping Pricing
          • Trends: Explore
          • Travel: Hotels
          • Lens
          • URL
        • Amazon
          • Product
          • Search
          • Pricing
          • Sellers
          • Best Sellers
          • Reviews
          • Questions & Answers
          • URL
        • YouTube
          • YouTube Scraping Guide for AI
          • YouTube Search
          • YouTube Video Trainability
          • YouTube Metadata
          • YouTube Downloader
          • YouTube Transcript
        • Generic Target
        • Walmart
          • Search
          • Product
        • Ebay
        • Etsy
          • Search
          • Product
        • Bing
          • Search
          • URL
        • ChatGPT
        • North American E-Commerce
          • Best Buy
            • Search
            • Product
          • Kroger
            • Product
            • Search
            • URL
          • Lowe's
            • Search
            • Product
            • URL
          • Target
            • Search
            • Product
            • Category
          • Bed Bath & Beyond
          • Costco
          • Menards
          • Petco
          • Staples
          • Grainger
          • Instacart
        • European E-Commerce
          • Allegro
            • Search
            • Product
          • Idealo
          • Mediamarkt
          • Cdiscount
        • Asian E-Commerce
          • Alibaba
          • Aliexpress
          • Lazada
          • Rakuten
          • Tokopedia
          • Flipkart
          • Avnet
          • Indiamart
        • Latin American E-Commerce
          • Mercado Livre
          • Magazine Luiza
          • Falabella
          • Dcard
      • Restricted Targets
      • Response Codes
      • Usage and Billing
        • Usage Statistics
        • Traffic and Billing
        • Rate Limits
    • OxyCopilot
    • Unblocking Browser
      • Chrome
      • Firefox
      • Restricted Targets
      • Integration with MCP
      • Troubleshooting Guide
  • Dashboard
    • Teams
    • Billing Information
      • Accessing Billing Information
      • Managing Payment Methods
      • Updating Billing Information
      • Canceling a Subscription
    • IP Replacement
  • Guides for Scraper APIs
    • Python SDK
    • Go SDK
    • Forming Requests
    • Forming URLs
    • Using Postman
  • Useful links
    • Oxylabs Dashboard
    • Release Notes
    • Network status
    • Open Source Tools
      • Oxy Parser
      • Oxy Mouse
      • Web Scraper API Scheduler
    • Discord Community
    • GitHub
    • Scraping Experts
  • SUPPORT
    • FAQ
    • Have a Question?
Powered by GitBook
On this page

Was this helpful?

  1. Scraping Solutions
  2. Web Scraper API
  3. Features
  4. Result Processing & Storage

Custom Parser

Custom Parser is a free Scraper API feature that lets you define your own parsing and data processing logic that is executed on a raw scraping result.

You can use CSS and XPath selectors to select an object in the HTML DOM.

To use Custom Parser, just pass a JSON object with instructions while submitting a job:

If you are using XPath selectors:

{
  "source": "universal_ecommerce",
  "url": "https://example.com",
  "parse": true,
  "parsing_instructions": {
      "title": {
          "_fns": [
              {
                  "_fn": "xpath_one",
                  "_args": ["//h1/text()"]
              }
          ]
      }
  }
}

You can conveniently use the text() function with XPath, which extracts the text value of the selected node.

If you are using CSS selectors, you’ll have to string two functions together: the first one will select the h1 element, while the second one will extract its text:

{
    "source": "universal_ecommerce",
    "url": "https://example.com",
    "parse": true,
    "parsing_instructions": {
        "title": {
            "_fns": [
                {"_fn": "css_one", "_args": ["body > div:nth-child(1) > h1"]},
                {"_fn": "element_text"}
            ]
        }
    }
}

The result will look like this:

{
    "results": [
        {
            "content": {
                "title": "Example Domain",
                "parse_status_code": 12000
            },
            "created_at": "2023-03-23 14:47:49",
            "updated_at": "2023-03-23 14:47:58",
            "page": 1,
            "url": "https://example.com",
            "job_id": "7044681146457663489",
            "status_code": 200
        }
    ]
}

Check out the list of available parsing and data transformation functions here and some instruction examples here.

Getting the HTML content of a parsed job

You may retrieve the raw HTML result by adding ?type=raw to the end of the result retrieval URL. Read more here.

PreviousDedicated ParsersNextGetting started

Last updated 8 months ago

Was this helpful?