How to scroll to the bottom of the page with Puppeteer?

When web scraping with Puppeteer we might encounter pages that require scrolling to the bottom to load more content. This is a common pattern for infinite scrolling pages.

To scroll our Puppeteer browser custom javascript function window.scrollTo(x, y) can be used. This function scrolls the page to the specified coordinates.

So, if we need to scroll to the very bottom of the page we can use a while loop to continuously scroll until the bottom is reached.
Let's take a look at an example by scraping web-scraping.dev/testimonials:

const puppeteer = require('puppeteer');

async function scrapeTestimonials() {
    const browser = await puppeteer.launch({headless: false});
    const page = await browser.newPage();

    await page.goto('https://web-scraping.dev/testimonials/');

    let prevHeight = -1;
    let maxScrolls = 100;
    let scrollCount = 0;

    while (scrollCount < maxScrolls) {
        // Scroll to the bottom of the page
        await page.evaluate('window.scrollTo(0, document.body.scrollHeight)');
        // Wait for page load
        await page.waitForTimeout(1000);
        // Calculate new scroll height and compare
        let newHeight = await page.evaluate('document.body.scrollHeight');
        if (newHeight == prevHeight) {
            break;
        }
        prevHeight = newHeight;
        scrollCount += 1;
    }

    // Collect all loaded data
    let elements = await page.$$('.testimonial');
    let results = [];
    for(let element of elements) {
        let text = await element.$eval('.text', node => node.innerHTML);
        results.push(text);
    }

    console.log(`Scraped: ${results.length} results!`);

    await browser.close();
}

scrapeTestimonials();

Above, we're scraping an endless paging example from the web-scraping.dev website.
We start a while loop and keep scrolling to the bottom until the browser's vertical size stops changing.
Then, once the bottom is reached we can start parsing the content.

Question tagged: Puppeteer

Related Posts

How to Use Chrome Extensions with Playwright, Puppeteer and Selenium

In this article, we'll explore different useful Chrome extensions for web scraping. We'll also explain how to install Chrome extensions with various headless browser libraries, such as Selenium, Playwright and Puppeteer.

Web Scraping With a Headless Browser: Puppeteer

Introduction to using Puppeteer in Nodejs for web scraping dynamic web pages and web apps. Tips and tricks, best practices and example project.

How to Scrape Dynamic Websites Using Headless Web Browsers

Introduction to using web automation tools such as Puppeteer, Playwright, Selenium and ScrapFly to render dynamic websites for web scraping