Puppeteer is one of the most popular browser automation tools for web scraping in JavaScript, but it comes with a significant challenge: websites can detect it. Out of the box, Puppeteer leaves behind telltale automation markers that anti-bot systems identify within milliseconds, resulting in blocked requests, CAPTCHAs, and failed scraping jobs.
In this guide, we will cover how websites detect Puppeteer and how to set up the puppeteer-extra-plugin-stealth library. We will also discuss the limitations of stealth approaches and when cloud browser solutions become necessary.
Quick Start: If you want to get started immediately, here is a minimal working stealth setup:
// Install: npm install puppeteer-extra puppeteer-extra-plugin-stealth
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());
(async () => {
const browser = await puppeteer.launch({ headless: true });
const page = await browser.newPage();
await page.goto('https://web-scraping.dev/products');
console.log(await page.title());
await browser.close();
})();
The rest of this article explains why this works and what else you need for robust anti-detection.
How Websites Detect Puppeteer
Anti-bot systems often decide very early, sometimes within milliseconds, by checking whether the browser environment looks like a normal, user-driven Chrome session.
Here are the most common signals that give Puppeteer away:
Headless browser fingerprints (User-Agent + runtime traits)
- Headless runs often expose differences in User-Agent and other runtime behavior.
- Some setups reveal strings like HeadlessChrome or other headless-related inconsistencies.
Chrome DevTools Protocol (CDP) side effects
- Puppeteer controls Chrome through CDP, and that can introduce detectable artifacts.
- Examples include timing anomalies, injected script patterns, and subtle serialization/stack behaviors that don’t usually appear in human browsing.
Behavior and interaction patterns
Even if the browser looks right, behavior can be a giveaway:
- No scrolling, no mouse movement
- Consistent delays between actions
- Clicking immediately after load every time
- Missing focus/blur/visibility patterns typical of real users
Context signals combined with browser signals
Many commercial systems score a session using multiple dimensions at once:
- Datacenter/VPN IP reputation
- Unusual viewport sizes
- Language/timezone mismatches
- Cookie/storage history (brand-new profile every run)
You might pass one check and still fail because the overall profile is statistically unlikely.
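To make these signals concrete, here is a simplified sketch of the kind of client-side check an anti-bot script might run. The check names and scoring are illustrative assumptions, not any vendor's actual code:

```javascript
// Illustrative sketch only: real anti-bot scripts are obfuscated and far more
// thorough, but they probe browser properties much like these.
function scoreBrowser(nav, userAgent) {
  const flags = [];
  if (nav.webdriver === true) flags.push('webdriver');              // automation marker
  if (!nav.languages || nav.languages.length === 0) flags.push('no-languages');
  if (!nav.plugins || nav.plugins.length === 0) flags.push('no-plugins');
  if (/HeadlessChrome/.test(userAgent)) flags.push('headless-ua');  // UA leak
  return flags;
}

// An unpatched headless session trips several checks at once:
console.log(scoreBrowser(
  { webdriver: true, languages: [], plugins: [] },
  'Mozilla/5.0 (X11; Linux x86_64) HeadlessChrome/120.0.0.0'
));
// → [ 'webdriver', 'no-languages', 'no-plugins', 'headless-ua' ]
```

Note how a single session can fail on several independent signals: this is why patching one property (like navigator.webdriver) alone is rarely enough.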
What Is the Puppeteer Stealth Plugin?
puppeteer-extra-plugin-stealth is a plugin built on puppeteer-extra that applies evasion patches to make Puppeteer less detectable. It works by using page.evaluateOnNewDocument() to inject patches before any site scripts run, modifying browser properties, spoofing fingerprint values, and removing automation markers.
Each evasion module is a standalone puppeteer-extra plugin. The stealth plugin acts as a convenience wrapper that bundles all modules with sensible defaults. When you call puppeteer.use(StealthPlugin()), all evasions activate automatically.
Here is what each module does:
| Module | What It Does | Detection Signal It Addresses |
|---|---|---|
| chrome.app | Emulates the chrome.app API | Headless Chrome lacks this API; sites check for its presence |
| chrome.csi | Emulates chrome.csi() function | Chrome-specific timing API missing in headless mode |
| chrome.loadTimes | Emulates chrome.loadTimes() | Another Chrome-specific API absent in headless |
| chrome.runtime | Emulates parts of chrome.runtime | Extensions API fingerprinting; headless has different behavior |
| defaultArgs | Removes --enable-automation flag | Chrome flag that explicitly signals automation |
| iframe.contentWindow | Fixes cross-origin iframe detection | contentWindow behaves differently in automated browsers |
| media.codecs | Spoofs supported media codec list | Headless Chrome reports different codec support than real Chrome |
| navigator.hardwareConcurrency | Masks CPU core count | Can reveal server/cloud environments (e.g., 1-2 cores on VPS) |
| navigator.languages | Sets realistic language preferences | Default headless has empty or minimal language list |
| navigator.permissions | Patches permission query behavior | Permissions.query() returns different results in automated browsers |
| navigator.plugins | Simulates standard browser plugins | Headless Chrome has 0 plugins; real Chrome has PDF viewer, etc. |
| navigator.vendor | Sets vendor string to "Google Inc." | Consistency check between vendor and userAgent |
| navigator.webdriver | Removes navigator.webdriver = true | Primary detection signal set to true by default in Puppeteer |
| user-agent-override | Replaces the UA string and sets matching Client Hints metadata | HeadlessChrome token in the UA; UA/Client Hints mismatches |
The navigator.webdriver module is the most critical as it addresses the primary detection signal. The user-agent-override module ensures UA consistency across all surfaces including HTTP headers, navigator.userAgent, and Client Hints, preventing mismatch detection.
All modules are enabled by default. You can customize which modules to enable:
// Default: all modules enabled
puppeteer.use(StealthPlugin());
// Custom: disable specific modules
puppeteer.use(StealthPlugin({
enabledEvasions: new Set([
'chrome.app',
'chrome.csi',
'chrome.loadTimes',
'chrome.runtime',
'navigator.webdriver',
// omit modules you want disabled
])
}));
// Query available evasions
const stealth = StealthPlugin();
console.log(stealth.availableEvasions); // Set of all module names
// Use individual module standalone
puppeteer.use(require('puppeteer-extra-plugin-stealth/evasions/navigator.webdriver')());
For most scraping use cases, the default configuration with all modules enabled is the best choice. Disabling individual modules is only useful for debugging detection issues or resolving conflicts with specific sites.
Using the Puppeteer Stealth Plugin
If you're trying to reduce bot detection while using Puppeteer, the Stealth plugin is one of the simplest upgrades you can make. Below is a walkthrough of how to set it up and use it correctly.
Installation
Install the required packages:
npm install puppeteer-extra puppeteer-extra-plugin-stealth
# or
yarn add puppeteer-extra puppeteer-extra-plugin-stealth
On npm 7+ you do not need to install puppeteer separately; puppeteer-extra declares it as a peer dependency, which modern npm installs automatically.
puppeteer-extra acts as a wrapper around Puppeteer and extends it with plugin support.
Basic Setup
Instead of importing the regular puppeteer package, you import puppeteer-extra, then attach the Stealth plugin before launching the browser.
ESM:
import puppeteer from 'puppeteer-extra';
import StealthPlugin from 'puppeteer-extra-plugin-stealth';
puppeteer.use(StealthPlugin());
CommonJS:
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());
That’s it. Once applied, all browser instances launched through this Puppeteer instance will use stealth evasions automatically.
Complete Working Example
Below is a simple example that launches Chrome with stealth enabled and scrapes product data.
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());
async function scrapeWithStealth() {
const browser = await puppeteer.launch({
headless: true,
args: [
'--no-sandbox',
'--disable-setuid-sandbox',
]
});
const page = await browser.newPage();
// Use realistic screen dimensions
await page.setViewport({ width: 1920, height: 1080 });
// Navigate to target page
await page.goto('https://web-scraping.dev/products', {
waitUntil: 'domcontentloaded',
timeout: 30000
});
// Wait for products to load
await page.waitForSelector('.product', { timeout: 10000 });
// Extract product information
const products = await page.evaluate(() => {
const items = document.querySelectorAll('.product');
return Array.from(items).map(item => ({
title: item.querySelector('a')?.textContent?.trim(),
price: item.querySelector('.price')?.textContent?.trim()
}));
});
console.log(products);
await browser.close();
return products;
}
scrapeWithStealth();
All evasion modules are applied automatically before any page scripts execute. From here, you can verify that stealth patches are working correctly and layer in additional techniques like proxy rotation and behavioral simulation as needed.
Verifying Stealth Mode
Stealth patches may not work against all detection systems, so testing your configuration against bot detection sites confirms the patches are applying correctly.
Test Site 1: Scrapfly Automation Detector
The Scrapfly Browser Fingerprint Tool checks for navigator.webdriver, CDP artifacts, headless indicators, runtime manipulation, and framework-specific markers. It displays detected/safe/suspicious signal counts with explanations.
Test Site 2: bot.sannysoft.com
This page shows red and green indicators for various fingerprint leaks. Green indicates the check passed, red indicates detection. Key checks include webdriver, Chrome detection, and plugin enumeration.
Test Site 3: arh.antoinevastel.com/bots/areyouheadless
A headless detection test that specifically checks for headless browser indicators.
Run these tests before and after enabling stealth to confirm the patches are working:
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());
async function delay(ms) {
return new Promise(resolve => setTimeout(resolve, ms));
}
async function verifyStealthMode() {
const browser = await puppeteer.launch({ headless: true });
const page = await browser.newPage();
// Test 1: scrapfly browser fingerprint tool
await page.goto('https://scrapfly.io/web-scraping-tools/browser-fingerprint');
await delay(3000);
await page.screenshot({ path: 'scrapfly-browser-fingerprint-result.png', fullPage: true });
// Test 2: bot.sannysoft.com
await page.goto('https://bot.sannysoft.com/');
await delay(3000);
await page.screenshot({ path: 'sannysoft-result.png', fullPage: true });
// Test 3: areyouheadless
await page.goto('https://arh.antoinevastel.com/bots/areyouheadless');
await delay(3000);
await page.screenshot({ path: 'areyouheadless-result.png', fullPage: true });
console.log('Screenshots saved. Check results visually.');
await browser.close();
}
verifyStealthMode();
Passing these tests does not guarantee bypass of advanced systems like Cloudflare or DataDome. These use additional detection layers covered in the Limitations section.
If your verification tests pass but production targets still block you, the Limitations section explains why and what to do about it.
Beyond the Stealth Plugin: Additional Techniques
The stealth plugin addresses JavaScript fingerprinting, but robust anti-detection requires additional layers. Keep in mind that changing individual signals without coordinating them creates detectable mismatches.
Proxy Rotation
IP tracking is a primary detection vector. Stealth fixes fingerprinting but does not rotate IPs. Adding a proxy prevents single-IP rate limiting and distributes traffic across addresses.
Add a proxy to Puppeteer via the --proxy-server launch argument:
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());
async function scrapeWithProxy() {
const browser = await puppeteer.launch({
headless: true,
args: [
'--proxy-server=http://proxy-host:proxy-port',
'--no-sandbox',
]
});
const page = await browser.newPage();
// For authenticated proxies, use page.authenticate
await page.authenticate({
username: 'proxy-username',
password: 'proxy-password'
});
await page.goto('https://web-scraping.dev/products');
console.log(await page.title());
await browser.close();
}
scrapeWithProxy();
For scraping at scale, rotate proxies by launching new browser instances with different proxy configurations. Residential proxies have better trust scores than datacenter proxies for avoiding detection.
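A minimal round-robin rotation can be sketched as follows. The proxy URLs are placeholders; in practice a real pool would come from your proxy provider:

```javascript
const proxyPool = [
  'http://proxy-1.example.com:8080',  // placeholder addresses
  'http://proxy-2.example.com:8080',
  'http://proxy-3.example.com:8080',
];

// Pure round-robin selection: the same counter always maps to the same proxy
function nextProxy(pool, counter) {
  return pool[counter % pool.length];
}

// Each launch gets the next proxy in the pool (assumes the stealth plugin is
// already applied to the puppeteer-extra instance as shown earlier).
async function launchWithProxy(counter) {
  const puppeteer = require('puppeteer-extra');
  return puppeteer.launch({
    headless: true,
    args: [`--proxy-server=${nextProxy(proxyPool, counter)}`, '--no-sandbox'],
  });
}
```

Because --proxy-server is a launch argument, switching proxies means launching a new browser instance; pooling and reusing instances per proxy avoids paying the launch cost on every request.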
Custom Headers and User-Agent Randomization
The stealth plugin handles User-Agent consistency by default, but you may want to customize headers for specific scenarios:
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());
const userAgents = [
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/145.0.0.0 Safari/537.36',
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/145.0.0.0 Safari/537.36',
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/144.0.0.0 Safari/537.36',
];
async function scrapeWithCustomUA() {
const browser = await puppeteer.launch({ headless: true });
const page = await browser.newPage();
const randomUA = userAgents[Math.floor(Math.random() * userAgents.length)];
const chromeVersion = (randomUA.match(/Chrome\/(\d+)/) || [null, '145'])[1];
// setUserAgent with userAgentMetadata for Client Hints consistency
// CDP requires architecture, bitness, model (and others) in userAgentMetadata
await page.setUserAgent(randomUA, {
brands: [
{ brand: 'Chromium', version: chromeVersion },
{ brand: 'Google Chrome', version: chromeVersion },
{ brand: 'Not=A?Brand', version: '8' }
],
platform: 'Windows',
mobile: false,
platformVersion: '10.0.0',
architecture: 'x86',
bitness: '64',
model: ''
});
// Set additional headers
await page.setExtraHTTPHeaders({
'Accept-Language': 'en-US,en;q=0.9',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'
});
await page.goto('https://httpbin.dev/headers');
// Log the request headers
console.log(await page.content());
await browser.close();
}
scrapeWithCustomUA();
When you manually override User-Agent, you must also handle Client Hints headers (Sec-CH-UA, Sec-CH-UA-Platform) or you create a mismatch. Puppeteer's page.setUserAgent() accepts an optional second argument, a userAgentMetadata object, that sets Client Hints.
If you call setUserAgent() without this second argument, the Sec-CH-UA headers will not match your custom UA string, creating a detectable mismatch. The stealth plugin's user-agent-override module handles this coordination automatically, so manual override is only needed for specific customization.
Human-Like Behavior Simulation
Add random delays between actions to simulate human browsing patterns:
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());
function randomDelay(min, max) {
return Math.floor(Math.random() * (max - min + 1) + min);
}
function delay(ms) {
return new Promise(resolve => setTimeout(resolve, ms));
}
async function scrapeWithHumanBehavior() {
const browser = await puppeteer.launch({ headless: true });
const page = await browser.newPage();
await page.goto('https://web-scraping.dev/login');
// Random delay before interaction
await delay(randomDelay(1000, 3000));
// Simulate typing with realistic delays
await page.type('input[name="username"]', 'user123', { delay: randomDelay(50, 150) });
await page.type('input[name="password"]', 'password', { delay: randomDelay(50, 150) });
// Random delay before clicking
await delay(randomDelay(500, 1500));
await page.click('button[type="submit"]');
// Wait for results
await delay(randomDelay(2000, 4000));
const secretMessage = await page.$eval('#secret-message', el => el.textContent.trim());
console.log('Secret message:', secretMessage);
await browser.close();
}
scrapeWithHumanBehavior();
Random delays help evade detection by avoiding instant clicks and machine-speed typing. Use ~50–150ms per keystroke and ~500–3000ms between actions; randomDelay() keeps sessions from repeating the same timing patterns.
For more advanced mouse movement simulation, the ghost-cursor library provides realistic cursor trajectories.
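The core idea behind such libraries can be shown with a much simpler sketch: instead of teleporting the cursor to its target, move it through intermediate points with small random jitter. This linear-interpolation version is only an illustration; ghost-cursor does this properly with Bezier curves:

```javascript
// Generate intermediate points between two coordinates with small jitter,
// so mouse movement is neither instant nor perfectly straight.
function humanPath(from, to, steps = 20) {
  const points = [];
  for (let i = 1; i <= steps; i++) {
    const t = i / steps;
    points.push({
      x: from.x + (to.x - from.x) * t + (Math.random() - 0.5) * 5,
      y: from.y + (to.y - from.y) * t + (Math.random() - 0.5) * 5,
    });
  }
  return points;
}

// Walk the Puppeteer mouse along the generated path
async function moveMouseLikeHuman(page, from, to) {
  for (const p of humanPath(from, to)) {
    await page.mouse.move(p.x, p.y);
  }
}
```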
Blocking Fingerprinting Resources
Use request interception to block analytics, tracking, and fingerprinting scripts that might detect your automation:
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());
const blockedDomains = [
'google-analytics.com',
'googletagmanager.com',
'facebook.net',
'doubleclick.net',
'hotjar.com',
'newrelic.com',
'datadome.co',
];
async function scrapeWithResourceBlocking() {
const browser = await puppeteer.launch({ headless: true });
const page = await browser.newPage();
await page.setRequestInterception(true);
page.on('request', (request) => {
const url = request.url();
const shouldBlock = blockedDomains.some(domain => url.includes(domain));
if (shouldBlock) {
request.abort();
} else {
request.continue();
}
});
await page.goto('https://web-scraping.dev/products');
console.log(await page.title());
await browser.close();
}
scrapeWithResourceBlocking();
Blocking fingerprinting and analytics resources can reduce detection surface and speed up page loads, but it may also break site functionality or change what content renders.
Limitations of Puppeteer Stealth
puppeteer-extra-plugin-stealth helps with common JavaScript fingerprint checks, but it can’t solve everything. Modern anti-bot systems score a session across multiple layers (network, protocol, browser runtime, and behavior). Stealth mainly improves the runtime layer. Below are the practical limits to keep in mind.
Layer 1: Network-Level Signals (IP, TLS/HTTP2, Proxy Artifacts)
Stealth modifies browser APIs, but it does not change how your traffic looks on the network. Many blocks happen before page JavaScript even runs.
Common network signals include:
- IP reputation: datacenter IPs, abused subnets, or shared proxies get flagged quickly
- Proxy artifacts: some proxies introduce unusual TLS behavior or header quirks
- HTTP/2 fingerprints: SETTINGS frames, priority behavior, and other low-level traits can differ across stacks
Layer 2: CDP (Chrome DevTools Protocol) Detection
Puppeteer controls Chrome via CDP, and some vendors try to detect CDP usage directly (not just navigator.webdriver).
Historically, there have been real-world techniques based on DevTools/runtime side effects: for example, how certain objects get serialized when specific CDP domains are enabled. Browser engines evolve, so individual tricks may break over time, but the bigger point remains:
- Stealth can patch JavaScript surfaces
- It can’t remove the fact that automation is driving the browser via CDP
Some teams avoid CDP-based automation entirely for this reason, using tools that rely more on OS-level input.
Layer 3: Behavioral + “Coherence” Checks (Everything Must Match)
Even with good fingerprints and a decent proxy, sites can still block based on how the session behaves and whether signals match each other.
Typical coherence mismatches:
- Datacenter IP + “residential-looking” locale/timezone/language
- Unusual viewport/device metrics
- Brand-new profiles every run (no cookies/storage history)
- Too-fast actions, no scrolling, perfectly consistent timing patterns
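Some of these mismatches can be reduced programmatically if you know each proxy's exit country. A sketch (the country-to-profile mapping here is a hypothetical example; derive it from your proxy provider's geo metadata in practice):

```javascript
// Hypothetical country-to-profile mapping so locale/timezone match the exit IP
function profileForCountry(country) {
  const profiles = {
    US: { timezone: 'America/New_York', locale: 'en-US' },
    DE: { timezone: 'Europe/Berlin', locale: 'de-DE' },
    JP: { timezone: 'Asia/Tokyo', locale: 'ja-JP' },
  };
  return profiles[country] || profiles.US;  // fall back to a default profile
}

// Apply the matching profile to a Puppeteer page before navigation
async function applyCoherentProfile(page, country) {
  const { timezone, locale } = profileForCountry(country);
  await page.emulateTimezone(timezone);  // built-in Puppeteer API
  await page.setExtraHTTPHeaders({ 'Accept-Language': `${locale},en;q=0.8` });
}
```

This addresses locale/timezone coherence only; IP reputation, profile history, and behavioral realism still need their own strategies.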
Troubleshooting Common Issues
If stealth “does nothing,” check these first:
- Import: use puppeteer-extra, not puppeteer
- Order: call puppeteer.use(StealthPlugin()) before launch()
- Headless mode: prefer modern headless (headless: true); avoid legacy/shell modes unless you know why
- Launch flags: remove suspicious/conflicting args (some flags change surfaces stealth expects)
- Profile hygiene: don’t always start from a totally empty profile if the target expects returning users
- Verify on a test page: run bot.sannysoft.com (or similar) to confirm patches are applied
- Docker/Linux deps: missing fonts/libs can create weird rendering/fingerprint differences
If all of the above looks correct and you’re still getting blocked, it’s usually network reputation, CDP-level detection, or behavioral scoring: areas stealth can’t fully address.
Avoiding Detection at Scale with Scrapfly
Scrapfly Cloud Browser reduces or outsources each layer from the limitations section: network-level signals are managed at the infrastructure level, and the cloud browser architecture handles CDP considerations.
ScrapFly provides web scraping, screenshot, and extraction APIs for data collection at scale. Each product is equipped with an automatic bypass for any anti-bot system and we achieve this by:
- Maintaining a fleet of real, reinforced web browsers with real fingerprint profiles.
- Millions of self-healing proxies of the highest possible trust score.
- Constantly evolving and adapting to new anti-bot systems.
- We've been doing this publicly since 2020 with the best bypass on the market!
The key advantage for Puppeteer developers: you keep your existing code. Connect via WebSocket CDP:
const puppeteer = require('puppeteer-core');
async function scrapeWithScrapfly() {
const browser = await puppeteer.connect({
browserWSEndpoint: 'wss://browser.scrapfly.io?api_key=YOUR_API_KEY&proxy_pool=residential&os=windows'
});
const page = await browser.newPage();
await page.goto('https://web-scraping.dev/products', {
waitUntil: 'domcontentloaded'
});
await page.waitForSelector('.product', { timeout: 10000 });
const products = await page.evaluate(() => {
const items = document.querySelectorAll('.product');
return Array.from(items).map(item => ({
title: item.querySelector('a')?.textContent?.trim(),
price: item.querySelector('.price')?.textContent?.trim()
}));
});
console.log(JSON.stringify(products, null, 2));
await browser.close();
return products;
}
scrapeWithScrapfly();
The migration is straightforward: swap puppeteer.launch() for puppeteer.connect(). The rest of your Puppeteer code stays the same. No stealth plugin needed. No local Chrome instance. The cloud browser handles fingerprinting.
Not every scraping task needs a browser. For simple pages or basic JS rendering, use Scrapfly Scrape API with asp=true and render_js=true via HTTP. Use Cloud Browser only when you need full Puppeteer-style automation.
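As a sketch of what that HTTP route looks like (YOUR_API_KEY is a placeholder; check the Scrapfly documentation for the full parameter list):

```javascript
// Build a Scrape API request URL: asp enables anti-bot bypass, render_js
// enables JavaScript rendering without managing a browser yourself.
function buildScrapeUrl(apiKey, targetUrl) {
  const params = new URLSearchParams({
    key: apiKey,
    url: targetUrl,
    asp: 'true',
    render_js: 'true',
  });
  return `https://api.scrapfly.io/scrape?${params}`;
}

// Usage (placeholder key):
// fetch(buildScrapeUrl('YOUR_API_KEY', 'https://web-scraping.dev/products'))
//   .then(res => res.json())
//   .then(data => console.log(data));
```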
FAQ
To wrap up our guide on Puppeteer Stealth, here are practical answers to common questions (and what actually matters in real-world scraping).
What is Puppeteer stealth?
Puppeteer stealth usually refers to using puppeteer-extra-plugin-stealth, a bundle of small evasion patches that reduce how often pages can identify automated Chrome. It mainly works by injecting scripts early (evaluateOnNewDocument) to fix common JavaScript fingerprint leaks.
How to make Puppeteer undetectable?
You generally can’t make Puppeteer truly undetectable against modern anti-bot stacks; the realistic goal is reduce detection rate and avoid obvious mismatches.
A robust setup typically includes:
- Stealth patches (runtime/fingerprint layer)
- Clean IP strategy (quality proxies + sane rotation + low error rates)
- Coherence (locale/timezone/language/UA/viewport match each other and match the IP geography)
- Behavior (don’t click instantly; add scrolling, pauses, realistic typing; avoid perfectly repeated timings)
- Stability (reuse profiles when appropriate; don’t look like a brand-new user every single run)
If a site is heavily protected (e.g., by Cloudflare), you’ll usually need better network/browser infrastructure, often a cloud browser.
What is the difference between Puppeteer and Playwright stealth?
The core idea is the same: patch common runtime signals and avoid automation-only defaults. In Puppeteer, the most common approach is puppeteer-extra-plugin-stealth. In Playwright ecosystems, people often use playwright-extra plus stealth-style plugins.
In practice, the limiting factors are usually not the library, but:
- IP/network reputation and TLS/HTTP2 fingerprinting
- Behavioral scoring and interaction realism
- Session history (cookies/storage/profile continuity)
- Automation artifacts (including CDP-related side effects)
What does the Puppeteer stealth plugin actually change?
It applies a set of small patches that target known automation checks, such as:
- Removing/altering navigator.webdriver
- Filling in missing navigator.languages, navigator.plugins, and chrome.* APIs
- Fixing common headless-only differences (codecs, permissions behavior, iframe quirks)
- Adjusting some launch arguments that are easy “automation flags”
You can inspect what it provides and selectively enable/disable evasions:
const stealth = StealthPlugin();
console.log(stealth.availableEvasions);
Selective disabling is mostly useful for debugging a broken site, not as a default strategy.
Is puppeteer-extra-plugin-stealth still working?
It still helps against basic-to-mid fingerprint checks and many “off-the-shelf” detectors. However, it’s not a guaranteed bypass for advanced systems that combine multiple layers (network reputation, behavioral scoring, session risk models, and deeper automation artifacts).
A good way to think about it: stealth improves your baseline, but it doesn’t replace proper IP/session/behavior strategy.
Conclusion
This guide covered how websites detect Puppeteer through automation markers and fingerprinting, how to configure the stealth plugin with many evasion modules, complementary techniques including proxy rotation and behavioral simulation, and verification methods to confirm your setup works.
Try the techniques in this guide for your scraping projects. When you encounter sites that require more robust anti-detection, or when scaling makes local browser management impractical, explore Scrapfly Cloud Browser to offload anti-detection to infrastructure while keeping your existing Puppeteer code.