How to scroll to the bottom with Playwright?

When web scraping with Playwright, we can encounter pages that require scrolling to the bottom to load more content. This is a common pattern on infinite-scroll pages.

To scroll the browser, we can execute the built-in JavaScript function window.scrollTo(x, y), which scrolls the page to the specified coordinates.
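
For example, with an already open Playwright page object we can run it through page.evaluate() (a minimal sketch; the coordinates below are only illustrative):

# scroll 500 pixels down from the top of the page
page.evaluate("window.scrollTo(0, 500)")
# or jump straight to the current bottom of the document
page.evaluate("window.scrollTo(0, document.body.scrollHeight)")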

So, if we need to scroll to the bottom of the page, we can use a while loop that keeps scrolling until the bottom is reached.
Let's take a look at an example by scraping web-scraping.dev/testimonials:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    context = browser.new_context()
    page = context.new_page()
    page.goto('https://web-scraping.dev/testimonials/') 

    # scroll to the bottom:
    _prev_height = -1
    _max_scrolls = 100
    _scroll_count = 0
    while _scroll_count < _max_scrolls:
        # Execute JavaScript to scroll to the bottom of the page
        page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
        # Wait for new content to load (change this value as needed)
        page.wait_for_timeout(1000)
        # Check whether the scroll height changed - if not, we've reached the bottom
        new_height = page.evaluate("document.body.scrollHeight")
        if new_height == _prev_height:
            break
        _prev_height = new_height
        _scroll_count += 1
    # now we can collect all loaded data:
    results = []
    for element in page.locator('.testimonial').element_handles():
        text = element.query_selector('.text').inner_html()
        results.append(text)
    print(f"scraped: {len(results)} results!")

Above, we're scraping an endless paging example from web-scraping.dev.
We start a while loop and keep scrolling to the bottom until the page's scroll height (document.body.scrollHeight) stops changing.
Then, once the bottom is reached, we can parse the loaded content.
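
As a side note, some pages only trigger loading on real scroll events rather than programmatic jumps. In that case Playwright's mouse.wheel() can be used instead of window.scrollTo() - here's a minimal sketch of the same loop with that approach, assuming a page object is already open (the 10,000 pixel step and 1 second wait are just illustrative values):

# alternative: dispatch real wheel events instead of calling window.scrollTo()
prev_height = -1
for _ in range(100):  # safety cap on scroll attempts
    page.mouse.wheel(0, 10_000)  # scroll down by ~10,000 pixels
    page.wait_for_timeout(1000)  # give new content time to load
    new_height = page.evaluate("document.body.scrollHeight")
    if new_height == prev_height:  # height stopped growing - bottom reached
        break
    prev_height = new_height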

Question tagged: Playwright
