How to Scrape With Headless Firefox
Discover how to use headless Firefox with Selenium, Playwright, and Puppeteer for web scraping, including practical examples for each library.
Headless browser screenshots are a useful debugging and data collection tool when web scraping. In Selenium for Python, the driver's save_screenshot() method captures the page as shown in the current browser window, and an element's screenshot() method captures just that element:
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get("https://httpbin.dev/html")
# To capture the page as rendered in the browser window,
# we can save directly to a given filename
driver.save_screenshot('screenshot.png')
# or retrieve the image as Python objects
screenshot_png_bytes = driver.get_screenshot_as_png()
screenshot_base64_string = driver.get_screenshot_as_base64()
# To capture a specific element, find the element first and then screenshot it:
element = driver.find_element(By.CSS_SELECTOR, 'p')
element.screenshot('just-the-paragraph.png')
# quit() closes every window and ends the WebDriver session
driver.quit()
Note that when scraping dynamic pages, the screenshot command might run before the page has fully loaded and so miss important details. To handle that, refer to How to wait for page to load in Selenium?
For more, see web scraping with Selenium and Python.