To test our Playwright web scrapers, we might want to use local files instead of public websites. Just like real web browsers, Playwright can load local files using the file:// URL protocol. Here's an example in Python:
from playwright.sync_api import sync_playwright

with sync_playwright() as pw:
    browser = pw.chromium.launch(headless=False)
    context = browser.new_context(viewport={"width": 1920, "height": 1080})
    page = context.new_page()
    # open a local file (note: an absolute path is required,
    # and the URL needs three slashes: file:// plus the leading / of the path)
    page.goto("file:///home/user/projects/test.html")  # linux
    # page.goto("file:///C:/Users/projects/test.html")  # windows
    print(page.content())
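Hand-written file:// URLs are easy to get wrong (slash count, drive letters, spaces in paths), so it can help to let Python's standard `pathlib` build the URL. The sketch below is one possible approach, not part of the original example: `local_file_url` and `print_local_page` are hypothetical helper names, and the path `test.html` is a placeholder.

```python
from pathlib import Path


def local_file_url(path: str) -> str:
    """Turn a filesystem path (relative or absolute) into a file:// URL."""
    # resolve() makes the path absolute; as_uri() prepends the file:// scheme
    # and percent-encodes characters such as spaces.
    return Path(path).resolve().as_uri()


def print_local_page(path: str) -> None:
    """Load a local HTML file in Playwright and print its rendered content."""
    # Imported here so local_file_url() stays usable without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as pw:
        browser = pw.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(local_file_url(path))
        print(page.content())
        browser.close()
```

Calling `print_local_page("test.html")` from the directory that contains the file should then work on both Linux and Windows, since `as_uri()` produces the platform-appropriate URL form.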