Getting started with Puppeteer Stealth

Puppeteer Stealth (puppeteer-extra-plugin-stealth) is a popular plugin for the Puppeteer browser automation framework, loaded through puppeteer-extra. It patches the Puppeteer runtime so that automated browsers are less likely to be flagged by anti-scraping detection techniques.

With puppeteer-stealth, scrapers have a better chance of bypassing Cloudflare, DataDome and other popular anti-scraping services.

puppeteer-stealth can be installed using npm or yarn:

$ npm install puppeteer puppeteer-extra puppeteer-extra-plugin-stealth
# or
$ yarn add puppeteer puppeteer-extra puppeteer-extra-plugin-stealth

Then the StealthPlugin needs to be registered with puppeteer.use() to enable the extension:

// Note: import puppeteer-extra rather than puppeteer
const puppeteer = require('puppeteer-extra')

// add stealth plugin and use defaults (all evasion techniques)
const StealthPlugin = require('puppeteer-extra-plugin-stealth')
puppeteer.use(StealthPlugin())

// test run - check scrapfly.io browser fingerprint page
puppeteer.launch({ headless: true }).then(async browser => {
  console.log('Running tests..')
  const page = await browser.newPage()
  await page.goto('https://scrapfly.io/web-scraping-tools/browser-fingerprint')
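  // wait 5 seconds for the page to settle (page.waitForTimeout() was removed in Puppeteer v22)
  await new Promise(resolve => setTimeout(resolve, 5000))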
  await page.screenshot({ path: 'testresult.png', fullPage: true })
  await browser.close()
  console.log(`All done, check the screenshot. ✨`)
})
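
A screenshot has to be inspected by eye, so a programmatic check can be a useful complement. The sketch below reads a few well-known automation signals straight from the page. The helper names findAutomationSignals and checkFingerprint are hypothetical, and the three signals are only an illustrative assumption, not the full set that anti-scraping services examine.

```javascript
// Hypothetical helper: list the signals in a fingerprint that look automated.
// `fp` is a plain object of the shape collected by page.evaluate() below.
function findAutomationSignals(fp) {
  const signals = []
  if (fp.webdriver) signals.push('navigator.webdriver is true')
  if (/HeadlessChrome/.test(fp.userAgent)) signals.push('user agent contains HeadlessChrome')
  if (fp.pluginCount === 0) signals.push('navigator.plugins is empty')
  return signals
}

// Hypothetical helper: open `url` with the stealth plugin enabled and
// return the automation signals that are still visible to the page.
async function checkFingerprint(url) {
  // Required lazily so the pure helper above can be used on its own.
  const puppeteer = require('puppeteer-extra')
  const StealthPlugin = require('puppeteer-extra-plugin-stealth')
  puppeteer.use(StealthPlugin())

  const browser = await puppeteer.launch({ headless: true })
  try {
    const page = await browser.newPage()
    await page.goto(url)
    // Runs inside the browser; only serializable values are returned.
    const fp = await page.evaluate(() => ({
      webdriver: navigator.webdriver,
      userAgent: navigator.userAgent,
      pluginCount: navigator.plugins.length,
    }))
    return findAutomationSignals(fp)
  } finally {
    await browser.close()
  }
}
```

Calling checkFingerprint('https://scrapfly.io/web-scraping-tools/browser-fingerprint').then(console.log) prints whichever signals remain. An empty list only means these three checks passed, not that the browser is undetectable.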

Note that puppeteer-stealth bundles many patches (called evasions), each targeting a different detection technique, and these can be customized and extended.
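
As a minimal sketch of such customization, the plugin instance exposes availableEvasions and enabledEvasions sets, so individual patches can be switched off before the plugin is registered. The helper names selectEvasions and launchWithEvasions are hypothetical, and choosing 'user-agent-override' as the evasion to disable is only an example, not a recommendation.

```javascript
// Hypothetical helper: keep every available evasion except those in `disabled`.
function selectEvasions(available, disabled) {
  const skip = new Set(disabled)
  return new Set([...available].filter(name => !skip.has(name)))
}

// Hypothetical helper: launch a browser with some stealth evasions turned off.
async function launchWithEvasions(disabled) {
  // Required lazily so selectEvasions() can be used without Puppeteer installed.
  const puppeteer = require('puppeteer-extra')
  const StealthPlugin = require('puppeteer-extra-plugin-stealth')

  const stealth = StealthPlugin()
  // availableEvasions lists every bundled patch; enabledEvasions is the active subset.
  stealth.enabledEvasions = selectEvasions(stealth.availableEvasions, disabled)
  puppeteer.use(stealth)

  return puppeteer.launch({ headless: true })
}
```

For example, launchWithEvasions(['user-agent-override']) starts a browser with every evasion except the user agent patch. Logging stealth.availableEvasions is a simple way to see which names the installed plugin version accepts.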

Alternatively, the Scrapfly API bypasses anti-scraping protections automatically through its anti-scraping protection bypass feature.

Provided by Scrapfly

This knowledgebase is provided by Scrapfly, a web scraping API that allows you to scrape any website without getting blocked and implements dozens of other web scraping conveniences.