How to run Playwright in Jupyter notebooks?

by scrapecrow Dec 19, 2022

The Playwright package is a popular web browser automation tool in Python that can be run in Jupyter notebooks for quick web scraping scripts. However, since Jupyter notebooks run their own asyncio event loop, we cannot start the synchronous Playwright client:

# in Jupyter:

from playwright.sync_api import sync_playwright
playwright = sync_playwright().start()

"""
Error: It looks like you are using Playwright Sync API inside the asyncio loop.
Please use the Async API instead.
"""

The above error is raised because an event loop is already running in the notebook. To use Playwright in Jupyter notebooks, we should explicitly use the asynchronous Playwright client, as in the following code:

# in Jupyter:

import nest_asyncio
import asyncio
from playwright.async_api import async_playwright
import atexit

# Allow nested event loops
nest_asyncio.apply()

async def main():
    pw = await async_playwright().start()
    browser = await pw.chromium.launch(headless=True)
    page = await browser.new_page()

    # All methods are async (use the "await" keyword)
    await page.goto("https://web-scraping.dev")
    src = await page.content()
    print(src)

    # Function to close browser and stop Playwright
    async def shutdown_playwright():
        await browser.close()
        await pw.stop()

    # Register shutdown hook for when the program exits
    atexit.register(lambda: asyncio.run(shutdown_playwright()))

# Run the async main function
await main()  # Use await directly instead of asyncio.run()

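Here, we use Playwright's async API and wrap it in a main function, which we await directly on the notebook's already running event loop. The nest_asyncio.apply() call patches asyncio to allow nested event loops, so that calls like asyncio.run() (used here in the shutdown hook) can be made even while a loop is already running. Note that the above snippet also works in Google Colab, since it is built on the same notebook concept as Jupyter.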
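
As an alternative to the atexit hook, cleanup can be tied to the scope of a single function. The following is a hedged sketch, not the approach used above: it assumes Playwright's `async with async_playwright()` context manager and a hypothetical helper named `fetch_html`, and it has the same requirements as the snippet above (Playwright and its Chromium browser installed):

```python
async def fetch_html(url: str) -> str:
    """Open a page in headless Chromium and return its HTML (illustrative helper)."""
    # Imported here so that defining the helper does not load Playwright yet
    from playwright.async_api import async_playwright

    # The context manager stops Playwright when the block exits
    async with async_playwright() as pw:
        browser = await pw.chromium.launch(headless=True)
        try:
            page = await browser.new_page()
            await page.goto(url)
            return await page.content()
        finally:
            # Close the browser even if navigation raises an error
            await browser.close()
```

In a notebook cell this could be called as `html = await fetch_html("https://web-scraping.dev")`. The trade-off is that the browser is launched and closed on every call, whereas the snippet above keeps it open until the program exits.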

For further details on web scraping with Playwright, refer to our dedicated guide.

Web Scraping with Playwright and Python

Playwright is the new, big browser automation toolkit. Can it be used for web scraping? In this introductory article, we'll take a look at how we can use Playwright and Python to scrape dynamic websites.
