What is a Headless Browser? Top 5 Headless Browser Tools
A quick overview of the emerging browser automation tools - what exactly are they and how are they used in web scraping?
When web scraping, we often need to save the connection state, such as browser cookies, and resume it later. In Selenium, cookies can be saved and restored with the driver.get_cookies() and driver.add_cookie() methods:
import json
from pathlib import Path

from selenium import webdriver

driver = webdriver.Chrome()
driver.get("http://www.google.com")

# Save the current session's cookies to a JSON file:
Path("cookies.json").write_text(
    json.dumps(driver.get_cookies(), indent=2)
)

# Restore cookies from the JSON file
# (the driver must already be on the cookies' domain):
for cookie in json.loads(Path("cookies.json").read_text()):
    driver.add_cookie(cookie)

driver.quit()
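To resume the session in a later run, the cookies must be loaded into a fresh driver after it has navigated to the same domain, since Selenium only accepts cookies for the page currently open. Below is a minimal sketch of that pattern; the load_valid_cookies helper is our own illustration (not part of Selenium) that drops cookies whose expiry timestamp has already passed, since adding stale cookies is pointless:

```python
import json
import time
from pathlib import Path


def load_valid_cookies(path):
    """Read saved cookies and drop any that have already expired.

    Selenium cookie dicts carry an optional "expiry" key holding a
    Unix timestamp; session cookies without it are kept as-is.
    """
    cookies = json.loads(Path(path).read_text())
    now = time.time()
    return [c for c in cookies if c.get("expiry", now + 1) > now]


# In a later session (sketch - requires a Chrome driver to run):
# driver = webdriver.Chrome()
# driver.get("http://www.google.com")  # must visit the cookies' domain first
# for cookie in load_valid_cookies("cookies.json"):
#     driver.add_cookie(cookie)
# driver.refresh()  # reload so the restored cookies take effect
```

Refreshing the page after adding the cookies ensures the site sees the restored session on its next request.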
This knowledgebase is provided by Scrapfly — a web scraping API that allows you to scrape any website without getting blocked and implements dozens of other web scraping conveniences. Check us out 👇