How to wait for page to load in Playwright?

by scrapecrow Nov 03, 2022

When scraping dynamic web pages with Playwright and Python, we need to wait for the page to fully load before retrieving the page source for HTML parsing. Let's explore several waiting methods to ensure a full web page load!

Selectors

To make Playwright wait for a page to load, we can use Playwright's wait_for_selector method. It waits for a specific element to appear on the web page before proceeding:

from playwright.sync_api import sync_playwright

with sync_playwright() as pw:
    browser = pw.chromium.launch(headless=False)
    context = browser.new_context(viewport={"width": 1920, "height": 1080})
    page = context.new_page()

    # go to the target URL
    page.goto("https://web-scraping.dev/products")
    # wait for the element to appear on the page:
    page.wait_for_selector("div.products")
    # get the fully rendered HTML
    print(page.content())

Above, we start by creating a new browser context, navigate to the target web page, and wait for an element matching the CSS selector div.products to become visible. Once it appears, we print the fully loaded page HTML.

Fixed Timeouts

The second waiting method is wait_for_timeout. Unlike the previous method, this approach doesn't locate elements on the document. Instead, it instructs the browser to wait for a fixed amount of time:

page.goto("https://web-scraping.dev/products")
page.wait_for_timeout(5000)

Here, wait_for_timeout pauses the script for 5,000 milliseconds (5 seconds) before the remaining actions execute.
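Fixed timeouts either waste time (the page was ready earlier) or wait too little (the page is still loading). To see why selector-based waits are usually preferable, here's a minimal, hypothetical pure-Python sketch of the polling idea behind them; this is an illustration, not Playwright's actual implementation:

```python
import time

def wait_until(condition, timeout=5.0, interval=0.1):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.
    This mimics how a selector wait polls the page: it returns as soon as the
    condition holds, while a fixed timeout always sleeps the full duration."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = condition()
        if result:
            return result
        time.sleep(interval)
    raise TimeoutError(f"condition not met within {timeout} seconds")

# Example: simulate an "element" that appears after roughly 0.3 seconds
appeared_at = time.monotonic() + 0.3
element = wait_until(lambda: time.monotonic() >= appeared_at)
```

With a 5-second fixed timeout the script above would always sleep 5 seconds; the polling wait returns after about 0.3 seconds, and raises a TimeoutError only if the condition never becomes true.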

Rendering State

The last waiting method is wait_for_load_state, which waits for the document to reach a given load state:

  • domcontentloaded: Waits for the initial DOM to be parsed, without waiting for stylesheets, images, and other static assets to finish loading.
  • networkidle: Waits until there have been no network connections for at least 500 milliseconds.
  • load: Waits for the HTML document and its static assets to be fully loaded.

Here's how to use the wait_for_load_state to let Playwright wait for page to load through different states:

page.goto("https://web-scraping.dev/products")
# wait for one of the load states:
page.wait_for_load_state("domcontentloaded")
# or: page.wait_for_load_state("networkidle")
# or: page.wait_for_load_state("load")

For further details on web scraping with Playwright, refer to our dedicated guide.

Web Scraping with Playwright and Python

Playwright is the new, big browser automation toolkit - can it be used for web scraping? In this introduction article, we'll take a look at how we can use Playwright and Python to scrape dynamic websites.

