How to scroll to the bottom of the page with Playwright?

by scrapecrow Jun 30, 2023

When web scraping, it's common to encounter infinite scroll pages. These web pages require scrolling to the end of the page to load more content.

In this guide, we'll explore how to scroll to the bottom of the page with Playwright using three distinct approaches for both Python and NodeJS clients.

Using JavaScript

To scroll to the bottom of the page with Playwright, we can call the window.scrollTo(x, y) JavaScript function. Passing the document's full height as y scrolls vertically until the very bottom of the page is reached.

Here's how to use Playwright to scroll through infinite scroll pages. We'll scrape web-scraping.dev/testimonials, which loads more data as the page is scrolled:

Python

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    context = browser.new_context()
    page = context.new_page()
    # navigate to the website
    page.goto("https://web-scraping.dev/testimonials/")

    # scroll to the bottom:
    _prev_height = -1
    _max_scrolls = 100
    _scroll_count = 0
    while _scroll_count < _max_scrolls:
        # Execute JavaScript to scroll to the bottom of the page
        page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
        # Wait for new content to load (change this value as needed)
        page.wait_for_timeout(1000) # wait for 1000 milliseconds
        # Check whether the scroll height changed - means more pages are there
        new_height = page.evaluate("document.body.scrollHeight")
        if new_height == _prev_height:
            break
        _prev_height = new_height
        _scroll_count += 1

    # Now we can collect all loaded data on the document:
    results = []
    for element in page.locator(".testimonial").element_handles():
        text = element.query_selector(".text").inner_html()
        results.append(text)
    print(f"scraped: {len(results)} results!")

NodeJS

const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch({ headless: false });
  const context = await browser.newContext();
  const page = await context.newPage();
  // navigate to the website
  await page.goto('https://web-scraping.dev/testimonials/');

  // Scroll to the bottom:
  let prevHeight = -1;
  const maxScrolls = 100;
  let scrollCount = 0;
  
  while (scrollCount < maxScrolls) {
    // Execute JavaScript to scroll to the bottom of the page
    await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
    // Wait for new content to load (change this value as needed)
    await page.waitForTimeout(1000); // wait for 1000 milliseconds
    // Check whether the scroll height changed - means more pages are there
    const newHeight = await page.evaluate(() => document.body.scrollHeight);
    if (newHeight === prevHeight) {
      break;
    }
    prevHeight = newHeight;
    scrollCount++;
  }
  
  // Now we can collect all loaded data on the document:
  const results = await page.$$eval('.testimonial .text', elements =>
    elements.map(element => element.innerHTML)
  );
  console.log(`scraped: ${results.length} results!`);
  console.log(results);

  await browser.close();
})();

Above, we're scraping an endless paging example from web-scraping.dev. The loop is controlled by three variables (named prevHeight, maxScrolls and scrollCount in the NodeJS version):

  • _prev_height: page height recorded after the previous scroll, used for comparison
  • _max_scrolls: maximum number of scrolls to perform
  • _scroll_count: current number of scrolls performed

On each iteration, we execute the window.scrollTo JavaScript method, wait for new content to load, and compare the new page height with the previous one. The loop stops when the height no longer changes or when the scroll limit is reached. Once the bottom is reached, we collect the loaded testimonials from the page.
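The same loop can be wrapped in a reusable function. The following is a minimal sketch rather than a definitive implementation: the names scroll_to_bottom, pause_ms and main are hypothetical and not part of Playwright, and the function only assumes that page offers the evaluate and wait_for_timeout methods of a Playwright sync Page.

```python
def scroll_to_bottom(page, max_scrolls=100, pause_ms=1000):
    """Scroll `page` until its height stops changing or `max_scrolls` is hit.

    `page` is assumed to be a Playwright sync Page (only `evaluate` and
    `wait_for_timeout` are used). Returns the number of loop iterations
    completed before the height stopped changing.
    """
    prev_height = -1
    scroll_count = 0
    while scroll_count < max_scrolls:
        # Jump to the bottom of the currently loaded document
        page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
        # Give the page time to load more content (tune per site)
        page.wait_for_timeout(pause_ms)
        new_height = page.evaluate("document.body.scrollHeight")
        if new_height == prev_height:
            break  # nothing new was loaded, so we reached the end
        prev_height = new_height
        scroll_count += 1
    return scroll_count


def main():
    # Imported here so the helper above can be reused without Playwright installed
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("https://web-scraping.dev/testimonials/")
        scrolls = scroll_to_bottom(page)
        count = page.locator(".testimonial").count()
        print(f"{scrolls} scrolls, {count} testimonials loaded")
        browser.close()
```

Calling main() runs the helper against the same example page. Keeping the scroll limit and the pause as parameters makes it easier to adjust them for slower or longer pages.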

Using Keyboard

In the previous snippet, we used JavaScript evaluation to emulate the scroll action. Since Playwright also provides a Keyboard API, we can use it to simulate vertical scrolling instead:

Python

# ....
    while _scroll_count < _max_scrolls:
        # Scroll to the bottom of the page using keyboard
        page.keyboard.press('End')
        # ....

NodeJS

// ....
  while (scrollCount < maxScrolls) {
    // Scroll to the bottom of the page using keyboard
    await page.keyboard.press('End');
    // ....
  }

Above, we use the keyboard API via the keyboard class to press the End key, which jumps to the bottom of the currently loaded page. We use press rather than down here because press sends a key down followed by a key up, so the key is not left held between iterations.

Using Mouse

Another natural way to handle infinite scroll pages is to use the mouse. For this, we can utilize Playwright's mouse API:

Python

# ....
    while _scroll_count < _max_scrolls:
        # Scroll to the bottom of the page using mouse wheel
        page.mouse.wheel(0, 15000)
        # ....

NodeJS

// ....
  while (scrollCount < maxScrolls) {
    // Scroll to the bottom of the page using mouse wheel
    await page.mouse.wheel(0, 15000);
    // ....
  }

Above, we use the mouse class to scroll vertically by dispatching a mouse wheel event with a vertical delta of 15000 pixels on each iteration.
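The three approaches differ only in the single step that triggers the scroll, so they can share one loop. Here is a rough sketch of that idea; SCROLL_STEPS and scroll_until_stable are hypothetical names introduced for illustration, and the keyboard step uses keyboard.press("End"), which sends a key down followed by a key up.

```python
# One "scroll down" step per approach; each takes a Playwright sync Page.
SCROLL_STEPS = {
    "javascript": lambda page: page.evaluate(
        "window.scrollTo(0, document.body.scrollHeight)"
    ),
    "keyboard": lambda page: page.keyboard.press("End"),
    "mouse": lambda page: page.mouse.wheel(0, 15000),
}


def scroll_until_stable(page, method="javascript", max_scrolls=100, pause_ms=1000):
    """Repeat the chosen scroll step until the page height stops changing.

    Returns the number of iterations completed before the height
    stabilized, or `max_scrolls` if the limit was reached first.
    """
    step = SCROLL_STEPS[method]  # raises KeyError for an unknown method
    prev_height = -1
    for scroll_count in range(max_scrolls):
        step(page)
        # Wait for new content to load (assumed fixed pause, tune per site)
        page.wait_for_timeout(pause_ms)
        new_height = page.evaluate("document.body.scrollHeight")
        if new_height == prev_height:
            return scroll_count
        prev_height = new_height
    return max_scrolls
```

With a page opened as in the first example, scroll_until_stable(page, method="mouse") would run the mouse wheel variant. The height check stays the same for all three methods, which is why only the scroll step needs to be swapped.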

For further details on web scraping with Playwright, refer to our dedicated guide.

Web Scraping with Playwright and Python

Playwright is the new, big browser automation toolkit - can it be used for web scraping? In this introduction article, we'll take a look at how we can use Playwright and Python to scrape dynamic websites.