What is a Headless Browser? Top 5 Headless Browser Tools
Quick overview of new emerging tech of browser automation - what exactly are these tools and how are they used in web scraping?
To test our Puppeteer web scrapers we might want o use local files instead of public websites. Just like real web browsers Puppeteer can load local files using the file://
URL protocol. Here's an example in Python:
from playwright import sync_playwright
with sync_playwright() as pw:
browser = pw.chromium.launch(headless=False)
context = browser.new_context(viewport={"width": 1920, "height": 1080})
page = context.new_page()
# open a local file (note: absolute path needs to be used)
page.goto("file://home/user/projects/test.html"); # linux
page.goto("file://C:/Users/projects/test.html"); # windows
print(page.content())
This knowledgebase is provided by Scrapfly — a web scraping API that allows you to scrape any website without getting blocked and implements a dozens of other web scraping conveniences. Check us out 👇