How to get page source in Puppeteer?

When web scraping, we often want to retrieve the full page source (the full HTML of the web page) so that we can parse it for data using tools like Cheerio. In Puppeteer, we can get the page source using the page.content() method:

const puppeteer = require('puppeteer');

async function run() {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
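    // By default, goto() waits for the page's "load" event:
    await page.goto("https://httpbin.dev/html");
    // OR the faster option that doesn't wait for images to load:
    // await page.goto("https://httpbin.dev/html", {"waitUntil": "domcontentloaded"});

    // page.content() takes no arguments; it returns the HTML of the current DOM
    const source = await page.content();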

    console.log(source);
    await browser.close();
}

run();

⚠ If the page is dynamic (rendered by JavaScript), this command may retrieve the page source before the page has fully loaded. For more, see How to wait for a page to load in Puppeteer?
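
One way to reduce that risk is to wait for an element that only appears once the page has rendered, and only then read the source. The sketch below is a minimal illustration, not a definitive approach: getRenderedSource is a hypothetical helper name, the "h1" selector is an assumption about the target page, and the 10 second timeout is arbitrary.

```javascript
// Hypothetical helper: wait until `selector` exists in the DOM,
// then return the serialized HTML of the page.
async function getRenderedSource(page, selector, timeoutMs = 10000) {
    // waitForSelector rejects if the element does not appear in time
    await page.waitForSelector(selector, { timeout: timeoutMs });
    return page.content();
}

// Example usage with a real browser (requires `npm install puppeteer`).
async function run() {
    const puppeteer = require('puppeteer');
    const browser = await puppeteer.launch();
    try {
        const page = await browser.newPage();
        await page.goto("https://httpbin.dev/html", { waitUntil: "domcontentloaded" });
        // "h1" is an assumed selector; pick one that marks your page as ready
        const source = await getRenderedSource(page, "h1");
        console.log(source);
    } finally {
        // Close the browser even if navigation or waiting fails
        await browser.close();
    }
}
```

Call run() to try it. Choosing a selector that is specific to the data you want (rather than a generic tag) makes it more likely the content is present when the source is captured.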
