How to take screenshots in NodeJS?

Puppeteer and Playwright are popular headless browser libraries for NodeJS, and screenshot automation is one of their common use cases. In this guide, we'll use both libraries to take screenshots in NodeJS, covering installation, core concepts, and the common options for customizing website screenshots. Let's get started!

Installation

To start, let's go over the installation process. Both Puppeteer and Playwright can be installed with the following npm command:

npm install puppeteer playwright

Next, install Playwright's browser binaries using the command below:

npx playwright install chromium # alternatively install `firefox` or `webkit`

The Basics

To start, let's explore the basics. We can use the screenshot method to take Playwright and Puppeteer screenshots in NodeJS:

Puppeteer:

const puppeteer = require("puppeteer");

async function run() {
  // launch a browser and open a new page
  const browser = await puppeteer.launch({
    headless: false
  });
  const page = await browser.newPage();

  // go to the target web page
  await page.goto("https://web-scraping.dev/products");

  // take page screenshot
  await page.screenshot({
    type: "png", // can also be "jpeg" or "webp" (recommended)
    path: "products.png", // save image data to a PNG file
  });

  // wait for the browser to shut down before exiting
  await browser.close();
}

run();

Playwright:

const { chromium } = require("playwright");

async function run() {
  // launch a new browser tab with empty context
  const browser = await chromium.launch({ headless: false });
  const context = await browser.newContext();
  const page = await context.newPage();

  // go to the target web page
  await page.goto("https://web-scraping.dev/products");

  // take page screenshot
  await page.screenshot({ path: "products.png" });

  await browser.close();
}

run();

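In the above code, we start by launching a browser instance and navigating to the target page URL. Note that headless: false opens a visible browser window, which is handy for debugging; set it to true to run headless. Then, we take the screenshot using the screenshot method, which both Playwright and Puppeteer expose under the same name.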
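
One detail the examples above leave open is what happens when navigation fails: browser.close() is never reached and the browser process stays alive. Below is a minimal sketch of one way to guard against this, assuming an already launched Puppeteer or Playwright browser object is passed in. The captureAndClose name is a hypothetical helper, not part of either library:

```javascript
// Hypothetical helper: takes an already launched Puppeteer or Playwright
// browser and guarantees it is closed, even when navigation or capture fails.
async function captureAndClose(browser, url, path) {
  try {
    // browser.newPage() exists in both libraries
    const page = await browser.newPage();
    await page.goto(url);
    await page.screenshot({ path });
    return path;
  } finally {
    // runs on success and on error, so no browser process is left behind
    await browser.close();
  }
}
```

For example, captureAndClose(await chromium.launch(), "https://web-scraping.dev/products", "products.png") should save the image and close the browser whether or not the page loads.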

Waits and Timeouts

Waiting is crucial to ensure the content has fully loaded before we capture a screenshot. For this, we can use different waiting strategies:

Puppeteer:

async function run() {
  // ...

  // go to the target web page
  await page.goto("https://web-scraping.dev/products", {
    // wait for a specific load state (pick one; a repeated key would override the earlier ones)
    waitUntil: "load", // wait for all resources to load, including CSS and images (default)
    // waitUntil: "domcontentloaded", // wait for the DOM tree to load
    // waitUntil: "networkidle2", // wait until at most 2 network connections remain for 500 ms
  });
  // ...
}

Playwright:

async function run() {
  // ...

  // go to the target web page
  await page.goto('https://web-scraping.dev/products', {
    // wait for a specific load state (pick one; a repeated key would override the earlier ones)
    waitUntil: 'load', // wait for all resources to load, including CSS and images (default)
    // waitUntil: 'domcontentloaded', // wait for the DOM tree to load
    // waitUntil: 'networkidle', // wait until there are no network connections for 500 ms
  });
  // ...
}

Here, we use the waitUntil option of page.goto to wait for a specific load state before proceeding with the rest of the program. The default load state waits for resources such as stylesheets and images, which helps ensure images appear in the Puppeteer and Playwright screenshots.
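
Because the two libraries accept slightly different load-state names, a typo or a value copied from the other library is easy to miss. The following is a small sketch of a validation helper under that assumption. The gotoOptions function and the WAIT_STATES table are hypothetical names introduced for illustration; the listed values are the ones each library documents for waitUntil:

```javascript
// Load-state names accepted by the waitUntil option of page.goto in each library.
const WAIT_STATES = {
  puppeteer: ["load", "domcontentloaded", "networkidle0", "networkidle2"],
  playwright: ["load", "domcontentloaded", "networkidle", "commit"],
};

// Hypothetical helper: builds a page.goto options object and rejects
// load states the chosen library does not know about.
function gotoOptions(library, waitUntil = "load", timeout = 30000) {
  const allowed = WAIT_STATES[library];
  if (!allowed) {
    throw new Error(`Unknown library: ${library}`);
  }
  if (!allowed.includes(waitUntil)) {
    throw new Error(`${library} does not support waitUntil "${waitUntil}"`);
  }
  // timeout is in milliseconds; 30000 matches the default of both libraries
  return { waitUntil, timeout };
}
```

The result can be passed straight to navigation, for example page.goto(url, gotoOptions("playwright", "networkidle")).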

Alternatively, we can wait for a specific CSS or XPath selector to be present:

Puppeteer:

  await page.waitForSelector("div.products", { timeout: 10000 }); // CSS
  await page.waitForSelector("xpath/" + "//div[@class='products']", {
    timeout: 10000,
  }); // XPath

Playwright:

  await page.waitForSelector('div.products', { timeout: 10000 }); // CSS
  await page.waitForSelector("//div[@class='products']", {
    timeout: 10000,
  }); // XPath
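
When the selector never appears, waitForSelector throws once the timeout elapses. Here is a minimal sketch of one way to handle that case, assuming we would rather capture whatever has rendered than fail outright. The screenshotWhenReady name is hypothetical, and the check on the error name is an assumption about how both libraries label timeout errors:

```javascript
// Hypothetical helper: waits for a selector, then takes the screenshot.
// Returns true if the selector appeared in time, false if the wait timed out
// (the screenshot is still taken so the partial page can be inspected).
async function screenshotWhenReady(page, selector, path, timeout = 10000) {
  let ready = true;
  try {
    await page.waitForSelector(selector, { timeout });
  } catch (err) {
    // Assumption: both libraries report a timeout as an error named "TimeoutError".
    // Anything else (closed page, invalid selector) is rethrown.
    if (err.name !== "TimeoutError") throw err;
    ready = false;
  }
  await page.screenshot({ path });
  return ready;
}
```

The boolean return value lets the caller decide whether a screenshot taken after a timeout is good enough or should be retried.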

Finally, we can use fixed wait conditions:

Puppeteer:

  // wait for fixed timeout
  await new Promise((resolve) => setTimeout(resolve, 5000)); // 5 seconds

Playwright:

  // wait for fixed timeout
  await page.waitForTimeout(5000); // 5 seconds

Recent Puppeteer versions don't ship a built-in method for fixed waits, so we emulate one by wrapping setTimeout in a promise. Playwright, on the other hand, provides the built-in waitForTimeout method.

Note that it's not recommended to use fixed waiting methods when capturing screenshots in Node.js: they often add unnecessary latency on fast page loads and can still end too early on slow ones.
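
As an alternative to a fixed wait, both libraries offer page.waitForFunction, which polls a predicate inside the page until it returns a truthy value. Below is a minimal sketch, assuming "ready" means that every image element has finished loading. The allImagesLoaded and waitForImages names are hypothetical:

```javascript
// Runs inside the browser page: true once every <img> has finished loading.
// Note that img.complete is also true for images that failed to load.
function allImagesLoaded() {
  return Array.from(document.images).every((img) => img.complete);
}

// Hypothetical helper: resolves as soon as the predicate passes, or rejects
// when the library's default timeout for waitForFunction is exceeded.
async function waitForImages(page) {
  await page.waitForFunction(allImagesLoaded);
}
```

Calling await waitForImages(page) right before page.screenshot() returns as soon as the images are ready instead of always sleeping for a fixed number of seconds.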

Viewport

One key configuration to consider when taking NodeJS web page screenshots is the browser window viewport. It represents the web browser resolution through width and height dimensions:

Puppeteer:

const browser = await puppeteer.launch({
  headless: false,
  args: ["--window-size=1920,1080"],
});
const page = await browser.newPage();
await page.setViewport({
  width: 1920,
  height: 1080,
});

Playwright:

const browser = await chromium.launch({
  headless: false,
});
const context = await browser.newContext({
  viewport: { width: 1920, height: 1080 },
});
const page = await context.newPage();

Here, we set a 1080p resolution using the width and height values. Manipulating the viewport enables emulating different devices. For instance, Playwright provides a wide range of device presets that emulate specific phones, tablets, and desktop browsers for further customization while taking a NodeJS screenshot:

const { chromium, devices } = require('playwright');

const browser = await chromium.launch({
  headless: false,
});
// pick a device preset by its name in the device registry
const iphone14ProMax = devices['iPhone 14 Pro Max'];
const context = await browser.newContext({
  ...iphone14ProMax,
});
const page = await context.newPage();

Above, we emulate a mobile browser by selecting a device preset. Playwright will then automatically apply the selected device's User-Agent, viewport, and scale factor settings. For the full list of available device profiles, refer to the official device registry.
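
Building on the viewport setting, the same page can be captured at several sizes in one run. The following is a minimal sketch for Playwright, assuming an already launched browser is passed in. The screenshotAllViewports name, the VIEWPORTS table, and its sizes are illustrative assumptions rather than official presets:

```javascript
// Illustrative viewport sizes; adjust them to the layouts you care about.
const VIEWPORTS = {
  desktop: { width: 1920, height: 1080 },
  tablet: { width: 768, height: 1024 },
  mobile: { width: 390, height: 844 },
};

// Hypothetical helper: opens one Playwright context per viewport,
// screenshots the URL, and returns the list of saved file paths.
async function screenshotAllViewports(browser, url) {
  const saved = [];
  for (const [name, viewport] of Object.entries(VIEWPORTS)) {
    const context = await browser.newContext({ viewport });
    try {
      const page = await context.newPage();
      await page.goto(url);
      const path = `screenshot-${name}.png`;
      await page.screenshot({ path });
      saved.push(path);
    } finally {
      // close the context so each viewport starts from a clean state
      await context.close();
    }
  }
  return saved;
}
```

With Puppeteer, a comparable loop could reuse a single page and call page.setViewport before each capture.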

Selection Targeting

When taking web page screenshots, it's often useful to control exactly which area of the page ends up in the image, and this is where selection targeting comes in handy!

Full Page

A common use case is taking full web page screenshots. Here's how to approach it in NodeJS:

Puppeteer:

const puppeteer = require("puppeteer");

async function scroll(page) {
  let prevHeight = -1;
  const maxScrolls = 100;
  let scrollCount = 0;

  while (scrollCount < maxScrolls) {
    // scroll to the bottom of the page
    await page.evaluate("window.scrollTo(0, document.body.scrollHeight)");
    // wait for new content to load after the scroll
    await new Promise((resolve) => setTimeout(resolve, 2000));
    // calculate new scroll height and compare
    const newHeight = await page.evaluate("document.body.scrollHeight");
    if (newHeight === prevHeight) {
      break;
    }
    prevHeight = newHeight;
    scrollCount += 1;
  }
}

async function run() {
  const browser = await puppeteer.launch({
    headless: false,
    args: ["--window-size=1920,1080"],
  });
  const page = await browser.newPage();
  await page.setViewport({
    width: 1920,
    height: 1080,
  });

  // go to the target web page
  await page.goto("https://web-scraping.dev/testimonials", {
    waitUntil: "load",
  });

  // scroll down to the end of the page
  await scroll(page);

  await page.screenshot({
    type: "png",
    path: "full-page-screenshot.png",
    fullPage: true,
    captureBeyondViewport: false, // prevent image flickering
  });

  await browser.close();
}

run();

Playwright:

const { chromium } = require("playwright");

async function scroll(page) {
  let prevHeight = -1;
  const maxScrolls = 100;
  let scrollCount = 0;

  while (scrollCount < maxScrolls) {
    // scroll to the bottom of the page
    await page.evaluate("window.scrollTo(0, document.body.scrollHeight)");
    // wait for new content to load after the scroll
    await page.waitForTimeout(2000);
    // calculate new scroll height and compare
    const newHeight = await page.evaluate("document.body.scrollHeight");
    if (newHeight === prevHeight) {
      break;
    }
    prevHeight = newHeight;
    scrollCount += 1;
  }
}

async function run() {
  const browser = await chromium.launch({
    headless: false,
  });
  const context = await browser.newContext({});
  const page = await context.newPage();

  // go to the target web page
  await page.goto("https://web-scraping.dev/testimonials", {
    waitUntil: "load",
  });

  // scroll down to the end of the page
  await scroll(page);

  await page.screenshot({
    type: "png",
    path: "full-page-screenshot.png",
    fullPage: true,
  });

  await browser.close();
}

run();

Here, we take a Node.js full page screenshot of web-scraping.dev/testimonials, which uses infinite scrolling to fetch more data. The browser starts by navigating to the target web page and scrolling until the page height stops growing. Then, we use the fullPage option to capture the entire scrollable page rather than only the visible browser viewport.
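
The two scroll functions above differ only in how they sleep, so they can be folded into one helper that works with either library. Below is a minimal sketch under that assumption. The scrollToBottom and sleep names and the options object are hypothetical, and the default delay of 2000 ms simply mirrors the examples above:

```javascript
// Promise-based sleep that works the same way for Puppeteer and Playwright.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: scrolls until the page height stops growing or
// maxScrolls is reached, and returns how many scrolls loaded new content.
async function scrollToBottom(page, { maxScrolls = 100, delayMs = 2000 } = {}) {
  let prevHeight = -1;
  let scrolls = 0;
  while (scrolls < maxScrolls) {
    // scroll to the current bottom of the page
    await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
    // give lazily loaded content time to arrive
    await sleep(delayMs);
    // stop once the height is unchanged since the previous pass
    const height = await page.evaluate(() => document.body.scrollHeight);
    if (height === prevHeight) break;
    prevHeight = height;
    scrolls += 1;
  }
  return scrolls;
}
```

The returned count can help with tuning: a value equal to maxScrolls suggests the page was still growing when the loop gave up.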

Selectors

For further NodeJS screenshot customization, we can capture a screenshot of a particular HTML element on the page using its selector:

Puppeteer:

const puppeteer = require("puppeteer");

async function run() {
  // launch a new browser tab
  const browser = await puppeteer.launch({
    headless: false,
    args: ["--window-size=1920,1080"],
  });
  const page = await browser.newPage();
  await page.setViewport({ width: 1920, height: 1080 });

  // request the web page and wait for the target element to load
  await page.goto("https://web-scraping.dev/product/3");
  await page.waitForSelector("div.row.product-data");

  // select the element and capture it
  const element = await page.$("div.row.product-data");
  await element.screenshot({
    type: "png",
    path: "element-screenshot.png",
  });
  await browser.close();
}

run();

Playwright:

const { chromium } = require("playwright");

async function run() {
  const browser = await chromium.launch({
    headless: false,
  });
  const context = await browser.newContext({
    viewport: { width: 1920, height: 1080 },
  });
  const page = await context.newPage();

  // request the web page and wait for the target element to load
  await page.goto("https://web-scraping.dev/product/3");
  await page.waitForSelector("div.row.product-data");

  // select the element and capture it
  const element = await page.$("div.row.product-data");
  await element.screenshot({
    path: "element-screenshot.png",
  });
  await browser.close();
}

run();

Here, we use Playwright and Puppeteer to take a screenshot of a specific element on the web page through the following steps:

  • Wait for the element to appear
  • Select the element using its CSS selector
  • Screenshot the selected element
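
The last two steps above can be wrapped in a small helper that also covers the case where nothing matches, since page.$ resolves to null in both libraries when the selector finds no element. This is a minimal sketch, and the screenshotElement name is hypothetical:

```javascript
// Hypothetical helper: selects an element by CSS selector and screenshots it.
// Throws a descriptive error instead of failing on a null element handle.
async function screenshotElement(page, selector, path) {
  const element = await page.$(selector);
  if (!element) {
    throw new Error(`No element matches selector: ${selector}`);
  }
  await element.screenshot({ path });
  return path;
}
```

For the product page used above, this would be called as screenshotElement(page, "div.row.product-data", "element-screenshot.png") after waiting for the element to appear.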

Provided by Scrapfly

This knowledgebase is provided by Scrapfly, a web scraping API that allows you to scrape any website without getting blocked and implements dozens of other web scraping conveniences.