How to Scrape With Headless Firefox
Discover how to use headless Firefox with Selenium, Playwright, and Puppeteer for web scraping, including practical examples for each library.
When web scraping, we often need to persist session state, such as browser cookies, and restore it later. In Selenium, cookies can be saved with the driver.get_cookies()
method and restored with the driver.add_cookie()
method:
import json
from pathlib import Path
from selenium import webdriver
from selenium.webdriver.firefox.options import Options

# Launch Firefox in headless mode:
options = Options()
options.add_argument("-headless")
driver = webdriver.Firefox(options=options)
driver.get("https://www.google.com")

# Save cookies to a JSON file:
Path("cookies.json").write_text(
    json.dumps(driver.get_cookies(), indent=2)
)

# Restore cookies from the JSON file.
# Note: add_cookie() only accepts cookies for the domain currently
# loaded, so navigate to the site before adding its cookies.
for cookie in json.loads(Path("cookies.json").read_text()):
    driver.add_cookie(cookie)
driver.quit()
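Saved cookies can expire between sessions, and re-adding stale ones is wasted effort. A small stdlib-only helper can filter them out before restoring; this is a sketch assuming Selenium-style cookie dicts, where persistent cookies carry an optional "expiry" Unix timestamp and session cookies omit it:

```python
import time

def fresh_cookies(cookies, now=None):
    """Return only cookies that have not expired.

    Selenium cookie dicts may include an "expiry" key holding a
    Unix timestamp; session cookies omit it and are always kept.
    """
    now = time.time() if now is None else now
    return [c for c in cookies if c.get("expiry", now + 1) > now]

# Example: one expired, one valid, one session cookie
cookies = [
    {"name": "old", "value": "1", "expiry": 100},
    {"name": "ok", "value": "2", "expiry": 2_000_000_000},
    {"name": "session", "value": "3"},
]
print([c["name"] for c in fresh_cookies(cookies)])  # → ['ok', 'session']
```

Run the loaded cookie list through fresh_cookies() before the add_cookie loop so only live cookies are restored.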