How to scroll to the bottom of the page with Playwright?

When web scraping, it's common to encounter infinite scroll pages. These web pages require scrolling to the end of the page to load more content.

In this guide, we'll explore how to scroll to the bottom of the page with Playwright using three distinct approaches for both Python and NodeJS clients.

Using JavaScript

To scroll to the bottom of the page with Playwright, we can call the window.scrollTo(x, y) JavaScript function. Passing the document height as the y value scrolls vertically until the very bottom of the page is reached.

Here's how to use Playwright to scroll an infinite scroll page. We'll scrape web-scraping.dev/testimonials, which loads more data as we scroll:

Python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    context = browser.new_context()
    page = context.new_page()
    # navigate to the website
    page.goto("https://web-scraping.dev/testimonials/")

    # scroll to the bottom:
    _prev_height = -1
    _max_scrolls = 100
    _scroll_count = 0
    while _scroll_count < _max_scrolls:
        # Execute JavaScript to scroll to the bottom of the page
        page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
        # Wait for new content to load (change this value as needed)
        page.wait_for_timeout(1000)  # wait for 1000 milliseconds
        # Check whether the scroll height changed, which means more content was loaded
        new_height = page.evaluate("document.body.scrollHeight")
        if new_height == _prev_height:
            break
        _prev_height = new_height
        _scroll_count += 1

    # Now we can collect all loaded data on the document:
    results = []
    for element in page.locator(".testimonial").element_handles():
        text = element.query_selector(".text").inner_html()
        results.append(text)
    print(f"scraped: {len(results)} results!")

    # close the browser once the data is collected
    browser.close()

NodeJS
const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch({ headless: false });
  const context = await browser.newContext();
  const page = await context.newPage();
  // navigate to the website
  await page.goto('https://web-scraping.dev/testimonials/');

  // Scroll to the bottom:
  let prevHeight = -1;
  const maxScrolls = 100;
  let scrollCount = 0;
  
  while (scrollCount < maxScrolls) {
    // Execute JavaScript to scroll to the bottom of the page
    await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
    // Wait for new content to load (change this value as needed)
    await page.waitForTimeout(1000); // wait for 1000 milliseconds
    // Check whether the scroll height changed, which means more content was loaded
    const newHeight = await page.evaluate(() => document.body.scrollHeight);
    if (newHeight === prevHeight) {
      break;
    }
    prevHeight = newHeight;
    scrollCount++;
  }
  
  // Now we can collect all loaded data on the document:
  const results = await page.$$eval('.testimonial .text', elements =>
    elements.map(element => element.innerHTML)
  );
  console.log(`scraped: ${results.length} results!`);
  console.log(results);

  await browser.close();
})();

Above, we scrape an endless-scroll example from web-scraping.dev. The snippet defines three variables (prevHeight, maxScrolls and scrollCount in the NodeJS version):

  • _prev_height: the page height recorded after the previous scroll, used for comparison
  • _max_scrolls: the maximum number of scrolls to perform
  • _scroll_count: the number of scrolls performed so far

The while loop keeps calling the window.scrollTo JavaScript method until the page height stops changing or the scroll limit is reached. Once the bottom is reached, we collect the text of every loaded testimonial.
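The scrolling loop can also be wrapped in a reusable helper. The following is a minimal sketch rather than part of the original snippet: the function name scroll_to_bottom and its parameters max_scrolls and pause_ms are hypothetical, and it assumes a Playwright sync page object, of which it only uses page.evaluate and page.wait_for_timeout.

```python
def scroll_to_bottom(page, max_scrolls=100, pause_ms=1000):
    """Scroll until the page height stops changing or max_scrolls is reached.

    `page` is assumed to be a Playwright sync Page.
    Returns the number of scrolls that produced a new page height.
    """
    prev_height = -1
    scroll_count = 0
    while scroll_count < max_scrolls:
        # Jump to the current bottom of the document
        page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
        # Give the page time to load more content (tune as needed)
        page.wait_for_timeout(pause_ms)
        # An unchanged height means nothing new was loaded
        new_height = page.evaluate("document.body.scrollHeight")
        if new_height == prev_height:
            break
        prev_height = new_height
        scroll_count += 1
    return scroll_count
```

With this in place, the Python example could call scroll_to_bottom(page) right after page.goto(...) and then collect the results as before. A fixed pause is the simplest option, but it may stop too early on slow connections, so the value likely needs tuning per site.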

Using Keyboard

In the previous snippet, we used JavaScript evaluation to emulate the scroll action. Since Playwright also provides a Keyboard API, we can use it to simulate vertical scrolling instead:

Python
    # ....
    while _scroll_count < _max_scrolls:
        # Scroll to the bottom of the page using keyboard
        page.keyboard.press('End')
        # ....

NodeJS
  // ....
  while (scrollCount < maxScrolls) {
    // Scroll to the bottom of the page using keyboard
    await page.keyboard.press('End');
    // ....
  }

Above, we use the keyboard API through the page.keyboard object to press the End key, which jumps to the bottom of the page on each loop iteration. We use press rather than down because down holds the key until a matching up call releases it.
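For a fuller picture, here is a hedged sketch of a complete keyboard-driven loop. It differs from the earlier snippet in two ways that are assumptions of this example: it sends the key with keyboard.press("End"), which presses and releases the key, and it stops when the number of matched items stops growing instead of comparing page heights. The function name scroll_with_keyboard and the item_selector parameter are hypothetical; the ".testimonial" selector comes from the scraping example above.

```python
def scroll_with_keyboard(page, item_selector=".testimonial",
                         max_scrolls=100, pause_ms=1000):
    """Press End repeatedly until no new items appear.

    `page` is assumed to be a Playwright sync Page.
    Returns the last observed number of matching items.
    """
    prev_count = -1
    for _ in range(max_scrolls):
        # Press and release the End key to jump to the page bottom
        page.keyboard.press("End")
        # Give the page time to load more content (tune as needed)
        page.wait_for_timeout(pause_ms)
        # Count how many items are currently in the document
        count = page.locator(item_selector).count()
        if count == prev_count:
            break  # nothing new was loaded
        prev_count = count
    return max(prev_count, 0)
```

Counting items instead of measuring height can be more robust on pages whose height changes for unrelated reasons, but it requires knowing a selector for the loaded items in advance.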

Using Mouse

Another way to handle infinite scroll pages is to scroll with the mouse wheel. For this, we can use Playwright's mouse API:

Python
    # ....
    while _scroll_count < _max_scrolls:
        # Scroll to the bottom of the page using mouse wheel
        page.mouse.wheel(0, 15000)
        # ....

NodeJS
  // ....
  while (scrollCount < maxScrolls) {
    // Scroll to the bottom of the page using mouse wheel
    await page.mouse.wheel(0, 15000);
    // ....
  }

Above, we use the page.mouse object to dispatch a mouse wheel event that scrolls vertically by the given number of pixels (15000 here).
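As an illustration, the mouse wheel can also drive the whole loop in smaller steps. The sketch below is an assumption-laden variant, not the original method: the function name scroll_with_mouse and the delta_y parameter are hypothetical, and it detects the bottom by checking whether window.scrollY stopped advancing rather than by comparing page heights.

```python
def scroll_with_mouse(page, delta_y=15000, max_scrolls=100, pause_ms=1000):
    """Scroll with wheel events until the scroll position stops advancing.

    `page` is assumed to be a Playwright sync Page.
    Returns the final vertical scroll offset in pixels.
    """
    prev_y = -1
    for _ in range(max_scrolls):
        # Dispatch a wheel event that scrolls down by delta_y pixels
        page.mouse.wheel(0, delta_y)
        # wheel() does not wait for scrolling or loading to finish
        page.wait_for_timeout(pause_ms)
        # If the offset did not move, we are at the bottom
        # and no new content was loaded during the pause
        current_y = page.evaluate("window.scrollY")
        if current_y == prev_y:
            break
        prev_y = current_y
    return max(prev_y, 0)
```

A smaller delta_y scrolls more gradually, which may look closer to a real user at the cost of more iterations.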

For further details on web scraping with Playwright, refer to our dedicated guide.

Web Scraping with Playwright and Python

Learn how to use Playwright for common tasks like browser navigation, button clicking, text input, and data parsing tools. You will also learn advanced techniques like JavaScript evaluation, resource interception, and blocking.

Provided by Scrapfly

This knowledgebase is provided by Scrapfly data APIs.