How to Scrape With Headless Firefox
Discover how to use headless Firefox with Selenium, Playwright, and Puppeteer for web scraping, including practical examples for each library.
Selenium is a popular web browser automation library used for web scraping. To control a browser, however, Selenium needs a separate browser-specific executable called a driver. For example, to run the Firefox web browser, Selenium needs geckodriver to be installed. Without it, Selenium raises a generic WebDriverException:
selenium.common.exceptions.WebDriverException: Message: 'geckodriver' executable needs to be in PATH.
This can also mean that geckodriver is installed but Selenium can't find it. To fix this, add the directory that contains geckodriver to the PATH environment variable:
$ export PATH=$PATH:/location/where/geckodriver/is/
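Before launching Selenium, it can help to confirm that geckodriver is actually discoverable. The following is a minimal sketch that uses only the Python standard library; `find_geckodriver` and its `extra_dir` parameter are hypothetical names introduced for illustration, not part of Selenium:

```python
import os
import shutil


def find_geckodriver(extra_dir=None):
    """Return the full path to geckodriver, or None if it cannot be found.

    extra_dir is a hypothetical extra directory to search in addition to PATH.
    """
    # Collect the non-empty search locations: the current PATH and the optional directory.
    locations = [d for d in (os.environ.get("PATH", ""), extra_dir) if d]
    search_path = os.pathsep.join(locations)
    # shutil.which looks for an executable file with this name in each directory.
    return shutil.which("geckodriver", path=search_path)


if __name__ == "__main__":
    location = find_geckodriver()
    if location:
        print(f"geckodriver found at {location}")
    else:
        print("geckodriver is not on PATH")
```

If this reports that geckodriver is not on PATH right after running the export command, the exported directory is likely not the one that holds the executable, or the variable was set in a different shell session.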
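Alternatively, we can specify the driver location directly in the Selenium initialization code. In Selenium 4 the path is passed through a Service object, because the older executable_path argument of webdriver.Firefox was deprecated and later removed:
from selenium import webdriver
from selenium.webdriver.firefox.service import Service
# Point Selenium at the geckodriver executable explicitly instead of relying on PATH
service = Service(executable_path=r'your\path\geckodriver.exe')
driver = webdriver.Firefox(service=service)
driver.get('https://scrapfly.io/')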
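For scraping, Firefox is usually run headless, that is, without a visible window. The following is a minimal sketch of how that could look, assuming Selenium 4 with Firefox and geckodriver installed; `firefox_arguments` and `start_firefox` are hypothetical helper names, not part of Selenium:

```python
def firefox_arguments(headless=True):
    """Return the command-line arguments to pass to Firefox."""
    # Firefox's -headless flag starts the browser without opening a window.
    return ["-headless"] if headless else []


def start_firefox(driver_path=None, headless=True):
    """Start Firefox through Selenium 4 and return the driver.

    driver_path is optional; when omitted, Selenium locates geckodriver itself
    (for example through PATH).
    """
    # Imported lazily so firefox_arguments can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.firefox.service import Service

    options = Options()
    for argument in firefox_arguments(headless):
        options.add_argument(argument)

    # Only pass an explicit path when one was provided.
    service = Service(executable_path=driver_path) if driver_path else Service()
    return webdriver.Firefox(service=service, options=options)


# Example usage (needs Firefox, geckodriver, and Selenium 4 installed):
# driver = start_firefox()
# try:
#     driver.get("https://scrapfly.io/")
#     print(driver.title)
# finally:
#     driver.quit()
```

Calling `driver.quit()` in a `finally` block matters more in headless mode, since there is no window to remind us that a browser process is still running.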