Learn about the header you can include in your requests to get fully rendered data, which you can store in an HTML file or as a PNG screenshot.
If the page you want to scrape relies on JavaScript to load its data into the DOM, you can include the "X-Oxylabs-Render: html" header with your requests instead of setting up and running a headless browser yourself. Every request sent with this header is fully rendered, and the result is returned as HTML or as a PNG screenshot, depending on the header value.
JavaScript rendering increases the time it takes to scrape a page. When using it, set the client-side timeout to 180 seconds.
To keep traffic consumption as low as possible, our system does not load unnecessary assets during page rendering.
import requests

# Use your Web Unblocker credentials here.
USERNAME, PASSWORD = 'YOUR_USERNAME', 'YOUR_PASSWORD'

# Define proxy dict.
proxies = {
    'http': f'http://{USERNAME}:{PASSWORD}@unblock.oxylabs.io:60000',
    'https': f'https://{USERNAME}:{PASSWORD}@unblock.oxylabs.io:60000',
}

headers = {
    'X-Oxylabs-Render': 'html'
}

response = requests.get(
    'https://ip.oxylabs.io/location',
    verify=False,  # It is required to ignore certificate
    proxies=proxies,
    headers=headers,
)

# Print result page to stdout
print(response.text)

# Save returned HTML to result.html file
with open('result.html', 'w') as f:
    f.write(response.text)
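The example above does not pass a timeout, so the client may wait indefinitely. A minimal sketch of applying the recommended 180-second client-side timeout might look like the following. The helper names render_request_kwargs and fetch_rendered and the constant RENDER_TIMEOUT_SECONDS are hypothetical and not part of any Oxylabs library; the proxy address, header, and verify=False are taken from the example above.

```python
RENDER_TIMEOUT_SECONDS = 180  # client-side timeout recommended for rendered requests


def render_request_kwargs(username, password, render='html',
                          timeout=RENDER_TIMEOUT_SECONDS):
    """Build keyword arguments for requests.get() (hypothetical helper)."""
    proxy_auth = f'{username}:{password}@unblock.oxylabs.io:60000'
    return {
        'proxies': {
            'http': f'http://{proxy_auth}',
            'https': f'https://{proxy_auth}',
        },
        'headers': {'X-Oxylabs-Render': render},
        'verify': False,  # ignore the certificate, as in the example above
        'timeout': timeout,  # seconds; applies to connecting and to each read
    }


def fetch_rendered(url, username, password, render='html'):
    """Fetch a rendered page; return None if the client-side timeout is hit."""
    # Third-party dependency, imported here so the helper above has none.
    import requests

    try:
        return requests.get(url, **render_request_kwargs(username, password, render))
    except requests.exceptions.Timeout:
        return None
```

Note that in the requests library the timeout value limits how long the client waits to connect and to receive data between reads; it is not a hard cap on the total duration of the request.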
Scraping a website HTML
In this example, we will render the YouTube home page and scrape its content. Without JavaScript rendering, the YouTube home page fetched through Web Unblocker normally looks like this:
YouTube page example without JavaScript rendering
Adding the "X-Oxylabs-Render: html" header, as shown in the examples below, enables JavaScript rendering and returns the HTML of the rendered page:
The HTML file opened in a browser should look like this:
Getting a screenshot of a fully rendered page
To get a PNG screenshot instead of the page HTML, send the "X-Oxylabs-Render: png" header.
The response will contain the raw bytes of an image, which can be saved as a PNG file and opened, as in the example below:
YouTube page example as a screenshot in PNG format
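Because the response body is raw image bytes, it has to be written in binary mode rather than as text. The following is a minimal sketch of saving such a body to disk; save_screenshot and PNG_SIGNATURE are hypothetical names introduced for illustration, and the signature check is an optional safeguard, not something the service requires.

```python
PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'  # first 8 bytes of every PNG file


def save_screenshot(content, path='result.png'):
    """Write raw response bytes to `path` if they look like a PNG (hypothetical helper)."""
    if not content.startswith(PNG_SIGNATURE):
        raise ValueError('response body is not a PNG image')
    with open(path, 'wb') as f:  # binary mode: the body is bytes, not text
        f.write(content)
    return path
```

With the requests library, pass response.content (bytes) to this helper rather than response.text, for example save_screenshot(response.content, 'result.png').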
Forcing rendering on specific pages
Some page types on specific domains can only be scraped successfully with rendering because of their dynamic content. Our system automatically enforces rendering for these pages, even if it is not explicitly requested by the user.
Please note that rendered jobs consume more traffic compared to non-rendered jobs.
We want our users to be fully aware of this when scraping the following pages:
import requests

# Use your Web Unblocker credentials here.
USERNAME, PASSWORD = 'YOUR_USERNAME', 'YOUR_PASSWORD'

# Define proxy dict.
proxies = {
    'http': f'http://{USERNAME}:{PASSWORD}@unblock.oxylabs.io:60000',
    'https': f'https://{USERNAME}:{PASSWORD}@unblock.oxylabs.io:60000',
}

headers = {
    'X-Oxylabs-Render': 'html'
}

response = requests.get(
    'https://youtube.com',
    verify=False,  # It is required to ignore certificate
    proxies=proxies,
    headers=headers,
)

# Print result page to stdout
print(response.text)

# Save returned HTML to result.html file
with open('result.html', 'w') as f:
    f.write(response.text)