How to run Playwright in Jupyter notebooks?

Playwright is a popular web browser automation library for Python that can be run in Jupyter notebooks for quick web scraping scripts. However, because a Jupyter notebook already runs its own asyncio event loop, the synchronous Playwright client cannot be started there:

# in Jupyter:
from playwright.sync_api import sync_playwright
playwright = sync_playwright().start()

"""
Error: It looks like you are using Playwright Sync API inside the asyncio loop.
Please use the Async API instead.
"""

To use Playwright in Jupyter notebooks, use the asynchronous client explicitly. This works because Jupyter cells support top-level "await":

# in Jupyter
from playwright.async_api import async_playwright

pw = await async_playwright().start()
browser = await pw.chromium.launch(headless=False)
page = await browser.new_page()

# note all methods are async (use the "await" keyword)
await page.goto("http://scrapfly.io/")

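# to stop the browser, define the cleanup as a coroutine (it uses "await");
# it can also be run manually with "await shutdown_playwright()"
async def shutdown_playwright():
    await browser.close()
    await pw.stop()

# atexit handlers must be plain functions, so register a wrapper (not the call result)
# best-effort: assumes the kernel's event loop is stopped but not closed at exit
import asyncio
import atexit
loop = asyncio.get_running_loop()
atexit.register(lambda: loop.run_until_complete(shutdown_playwright()))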
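For short scraping tasks, an alternative to starting and stopping Playwright by hand is the `async with` form of `async_playwright()`, which stops Playwright when the block exits. The sketch below assumes Chromium is installed for Playwright; the helper name `fetch_title` is hypothetical and not part of the original answer:

```python
import asyncio


async def fetch_title(url: str) -> str:
    """Open `url` in headless Chromium and return the page title."""
    # Imported here so the helper can be defined before Playwright is needed.
    from playwright.async_api import async_playwright

    # The context manager stops Playwright on exit, even if an error occurs.
    async with async_playwright() as pw:
        browser = await pw.chromium.launch(headless=True)
        try:
            page = await browser.new_page()
            await page.goto(url)
            return await page.title()
        finally:
            # Close the browser explicitly before Playwright itself shuts down.
            await browser.close()
```

In a notebook cell this would be called as `title = await fetch_title("http://scrapfly.io/")`. The trade-off is that the browser does not stay open between cells, so this form suits one-off fetches better than interactive exploration.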