# Scrapfly Documentation

## Table of Contents

### Dashboard

- [Intro](https://scrapfly.io/docs)
- [Project](https://scrapfly.io/docs/project)
- [Account](https://scrapfly.io/docs/account)
- [Workspace & Team](https://scrapfly.io/docs/workspace-and-team)
- [Billing](https://scrapfly.io/docs/billing)

### Products

#### MCP Server

- [Getting Started](https://scrapfly.io/docs/mcp/getting-started)
- [Tools & API Spec](https://scrapfly.io/docs/mcp/tools)
- [Authentication](https://scrapfly.io/docs/mcp/authentication)
- [Examples & Use Cases](https://scrapfly.io/docs/mcp/examples)
- [FAQ](https://scrapfly.io/docs/mcp/faq)
##### Integrations

- [Overview](https://scrapfly.io/docs/mcp/integrations)
- [Claude Desktop](https://scrapfly.io/docs/mcp/integrations/claude-desktop)
- [Claude Code](https://scrapfly.io/docs/mcp/integrations/claude-code)
- [ChatGPT](https://scrapfly.io/docs/mcp/integrations/chatgpt)
- [Cursor](https://scrapfly.io/docs/mcp/integrations/cursor)
- [Cline](https://scrapfly.io/docs/mcp/integrations/cline)
- [Windsurf](https://scrapfly.io/docs/mcp/integrations/windsurf)
- [Zed](https://scrapfly.io/docs/mcp/integrations/zed)
- [Roo Code](https://scrapfly.io/docs/mcp/integrations/roo-code)
- [VS Code](https://scrapfly.io/docs/mcp/integrations/vscode)
- [LangChain](https://scrapfly.io/docs/mcp/integrations/langchain)
- [LlamaIndex](https://scrapfly.io/docs/mcp/integrations/llamaindex)
- [CrewAI](https://scrapfly.io/docs/mcp/integrations/crewai)
- [OpenAI](https://scrapfly.io/docs/mcp/integrations/openai)
- [n8n](https://scrapfly.io/docs/mcp/integrations/n8n)
- [Make](https://scrapfly.io/docs/mcp/integrations/make)
- [Zapier](https://scrapfly.io/docs/mcp/integrations/zapier)
- [Vapi AI](https://scrapfly.io/docs/mcp/integrations/vapi)
- [Agent Builder](https://scrapfly.io/docs/mcp/integrations/agent-builder)
- [Custom Client](https://scrapfly.io/docs/mcp/integrations/custom-client)


#### Web Scraping API

- [Getting Started](https://scrapfly.io/docs/scrape-api/getting-started)
- [API Specification]()
- [Monitoring](https://scrapfly.io/docs/monitoring)
- [Customize Request](https://scrapfly.io/docs/scrape-api/custom)
- [Debug](https://scrapfly.io/docs/scrape-api/debug)
- [Anti Scraping Protection](https://scrapfly.io/docs/scrape-api/anti-scraping-protection)
- [Proxy](https://scrapfly.io/docs/scrape-api/proxy)
- [Proxy Mode](https://scrapfly.io/docs/scrape-api/proxy-mode)
- [Proxy Mode - Screaming Frog](https://scrapfly.io/docs/scrape-api/proxy-mode/screaming-frog)
- [Proxy Mode - Apify](https://scrapfly.io/docs/scrape-api/proxy-mode/apify)
- [(Auto) Data Extraction](https://scrapfly.io/docs/scrape-api/extraction)
- [Javascript Rendering](https://scrapfly.io/docs/scrape-api/javascript-rendering)
- [Javascript Scenario](https://scrapfly.io/docs/scrape-api/javascript-scenario)
- [SSL](https://scrapfly.io/docs/scrape-api/ssl)
- [DNS](https://scrapfly.io/docs/scrape-api/dns)
- [Cache](https://scrapfly.io/docs/scrape-api/cache)
- [Session](https://scrapfly.io/docs/scrape-api/session)
- [Webhook](https://scrapfly.io/docs/scrape-api/webhook)
- [Screenshot](https://scrapfly.io/docs/scrape-api/screenshot)
- [Errors](https://scrapfly.io/docs/scrape-api/errors)
- [Timeout](https://scrapfly.io/docs/scrape-api/understand-timeout)
- [Throttling](https://scrapfly.io/docs/throttling)
- [Troubleshoot](https://scrapfly.io/docs/scrape-api/troubleshoot)
- [Billing](https://scrapfly.io/docs/scrape-api/billing)
- [FAQ](https://scrapfly.io/docs/scrape-api/faq)

#### Crawler API

- [Getting Started](https://scrapfly.io/docs/crawler-api/getting-started)
- [API Specification]()
- [Retrieving Results](https://scrapfly.io/docs/crawler-api/results)
- [WARC Format](https://scrapfly.io/docs/crawler-api/warc-format)
- [Data Extraction](https://scrapfly.io/docs/crawler-api/extraction-rules)
- [Webhook](https://scrapfly.io/docs/crawler-api/webhook)
- [Billing](https://scrapfly.io/docs/crawler-api/billing)
- [Errors](https://scrapfly.io/docs/crawler-api/errors)
- [Troubleshoot](https://scrapfly.io/docs/crawler-api/troubleshoot)
- [FAQ](https://scrapfly.io/docs/crawler-api/faq)

#### Screenshot API

- [Getting Started](https://scrapfly.io/docs/screenshot-api/getting-started)
- [API Specification]()
- [Accessibility Testing](https://scrapfly.io/docs/screenshot-api/accessibility)
- [Webhook](https://scrapfly.io/docs/screenshot-api/webhook)
- [Billing](https://scrapfly.io/docs/screenshot-api/billing)
- [Errors](https://scrapfly.io/docs/screenshot-api/errors)

#### Extraction API

- [Getting Started](https://scrapfly.io/docs/extraction-api/getting-started)
- [API Specification]()
- [Rules Template](https://scrapfly.io/docs/extraction-api/rules-and-template)
- [LLM Extraction](https://scrapfly.io/docs/extraction-api/llm-prompt)
- [AI Auto Extraction](https://scrapfly.io/docs/extraction-api/automatic-ai)
- [Webhook](https://scrapfly.io/docs/extraction-api/webhook)
- [Billing](https://scrapfly.io/docs/extraction-api/billing)
- [Errors](https://scrapfly.io/docs/extraction-api/errors)
- [FAQ](https://scrapfly.io/docs/extraction-api/faq)

#### Proxy Saver

- [Getting Started](https://scrapfly.io/docs/proxy-saver/getting-started)
- [Fingerprints](https://scrapfly.io/docs/proxy-saver/fingerprints)
- [Optimizations](https://scrapfly.io/docs/proxy-saver/optimizations)
- [SSL Certificates](https://scrapfly.io/docs/proxy-saver/certificates)
- [Protocols](https://scrapfly.io/docs/proxy-saver/protocols)
- [Pacfile](https://scrapfly.io/docs/proxy-saver/pacfile)
- [Secure Credentials](https://scrapfly.io/docs/proxy-saver/security)
- [Billing](https://scrapfly.io/docs/proxy-saver/billing)

#### Cloud Browser API

- [Getting Started](https://scrapfly.io/docs/cloud-browser-api/getting-started)
- [Proxy & Geo-Targeting](https://scrapfly.io/docs/cloud-browser-api/proxy)
- [Unblock API](https://scrapfly.io/docs/cloud-browser-api/unblock)
- [File Downloads](https://scrapfly.io/docs/cloud-browser-api/file-downloads)
- [Session Resume](https://scrapfly.io/docs/cloud-browser-api/session-resume)
- [Human-in-the-Loop](https://scrapfly.io/docs/cloud-browser-api/human-in-the-loop)
- [Debug Mode](https://scrapfly.io/docs/cloud-browser-api/debug-mode)
- [Bring Your Own Proxy](https://scrapfly.io/docs/cloud-browser-api/bring-your-own-proxy)
- [Browser Extensions](https://scrapfly.io/docs/cloud-browser-api/extensions)
##### Integrations

- [Puppeteer](https://scrapfly.io/docs/cloud-browser-api/puppeteer)
- [Playwright](https://scrapfly.io/docs/cloud-browser-api/playwright)
- [Selenium](https://scrapfly.io/docs/cloud-browser-api/selenium)
- [Vercel Agent Browser](https://scrapfly.io/docs/cloud-browser-api/agent-browser)
- [Browser Use](https://scrapfly.io/docs/cloud-browser-api/browser-use)
- [Stagehand](https://scrapfly.io/docs/cloud-browser-api/stagehand)

- [Billing](https://scrapfly.io/docs/cloud-browser-api/billing)
- [Errors](https://scrapfly.io/docs/cloud-browser-api/errors)


### Tools

- [Antibot Detector](https://scrapfly.io/docs/tools/antibot-detector)

### SDK

- [Golang](https://scrapfly.io/docs/sdk/golang)
- [Python](https://scrapfly.io/docs/sdk/python)
- [Rust](https://scrapfly.io/docs/sdk/rust)
- [TypeScript](https://scrapfly.io/docs/sdk/typescript)
- [Scrapy](https://scrapfly.io/docs/sdk/scrapy)

### Integrations

- [Getting Started](https://scrapfly.io/docs/integration/getting-started)
- [LangChain](https://scrapfly.io/docs/integration/langchain)
- [LlamaIndex](https://scrapfly.io/docs/integration/llamaindex)
- [CrewAI](https://scrapfly.io/docs/integration/crewai)
- [Zapier](https://scrapfly.io/docs/integration/zapier)
- [Make](https://scrapfly.io/docs/integration/make)
- [n8n](https://scrapfly.io/docs/integration/n8n)

### Academy

- [Overview](https://scrapfly.io/academy)
- [Web Scraping Overview](https://scrapfly.io/academy/scraping-overview)
- [Tools](https://scrapfly.io/academy/tools-overview)
- [Reverse Engineering](https://scrapfly.io/academy/reverse-engineering)
- [Static Scraping](https://scrapfly.io/academy/static-scraping)
- [HTML Parsing](https://scrapfly.io/academy/html-parsing)
- [Dynamic Scraping](https://scrapfly.io/academy/dynamic-scraping)
- [Hidden API Scraping](https://scrapfly.io/academy/hidden-api-scraping)
- [Headless Browsers](https://scrapfly.io/academy/headless-browsers)
- [Hidden Web Data](https://scrapfly.io/academy/hidden-web-data)
- [JSON Parsing](https://scrapfly.io/academy/json-parsing)
- [Data Processing](https://scrapfly.io/academy/data-processing)
- [Scaling](https://scrapfly.io/academy/scaling)
- [Walkthrough Summary](https://scrapfly.io/academy/walkthrough-summary)
- [Scraper Blocking](https://scrapfly.io/academy/scraper-blocking)
- [Proxies](https://scrapfly.io/academy/proxies)

---

# Puppeteer Integration

 [  View as markdown ](https://scrapfly.io/?view=markdown)   Copy for LLM    Copy for LLM  [     Open in ChatGPT ](https://chatgpt.com/?hints=search&prompt=Read%20from%20https%3A%2F%2Fscrapfly.io%2Fdocs%2Fcloud-browser-api%2Fpuppeteer%20so%20I%20can%20ask%20questions%20about%20it.) [     Open in Claude ](https://claude.ai/new?q=Read%20from%20https%3A%2F%2Fscrapfly.io%2Fdocs%2Fcloud-browser-api%2Fpuppeteer%20so%20I%20can%20ask%20questions%20about%20it.) [     Open in Perplexity ](https://www.perplexity.ai/search/new?q=Read%20from%20https%3A%2F%2Fscrapfly.io%2Fdocs%2Fcloud-browser-api%2Fpuppeteer%20so%20I%20can%20ask%20questions%20about%20it.) 

 

 

 1. [Cloud Browser](https://scrapfly.io/docs/cloud-browser-api/getting-started)
2. Puppeteer
 
  [Puppeteer](https://pptr.dev/) is Google's official Node.js library for controlling Chrome. Connect it to Scrapfly Cloud Browser for scalable automation with built-in proxies and fingerprinting.

  **Beta Feature:** Cloud Browser is currently in beta. 

## Installation

Install `puppeteer-core` (the browser-agnostic version):

 ```
npm install puppeteer-core
```

 

   

 

> Use `puppeteer-core` instead of `puppeteer` since Cloud Browser provides the browser instance.

## Quick Start

 ```
const puppeteer = require('puppeteer-core');

const API_KEY = '';
const BROWSER_WS = `wss://browser.scrapfly.io?api_key=${API_KEY}&proxy_pool=public_datacenter_pool&os=linux`;

async function run() {
    // Connect to Cloud Browser
    const browser = await puppeteer.connect({
        browserWSEndpoint: BROWSER_WS,
    });

    const page = await browser.newPage();

    // Navigate and interact
    await page.goto('https://web-scraping.dev');
    const title = await page.title();
    console.log('Page title:', title);

    // Take a screenshot
    await page.screenshot({ path: 'screenshot.png' });

    await browser.close();
}

run();
```

 

   

 

## Connection Parameters

Configure your Cloud Browser connection with these WebSocket URL parameters:

 | Parameter | Required | Default | Description |
|---|---|---|---|
| `api_key` | Yes | - | Your Scrapfly API key for authentication |
| `proxy_pool` | No | `datacenter` | Proxy network type: `datacenter` or `residential` |
| `os` | No | random | Operating system fingerprint: `linux`, `windows`, or `macos` |
| `browser_brand` | No | `chrome` | Chromium-based browser brand used for fingerprint generation. Valid values: `chrome`, `edge`, `brave`, `opera`. Invalid values are silently dropped and the default applies. |
| `session` | No | - | Optional session identifier for maintaining browser state across connections. See [Session Resume](https://scrapfly.io/docs/cloud-browser-api/session-resume). |
| `country` | No | - | Proxy country code (ISO 3166-1 alpha-2), e.g., `us`, `uk`, `de` |
| `auto_close` | No | `true` | Automatically stop the browser session when the CDP connection disconnects. Set to `false` to keep the browser alive for reconnection. |
| `timeout` | No | `900` | Maximum session duration in seconds (15 minutes default, 30 minutes max). |
| `debug` | No | `false` | Enable session recording for debugging. See [Debug Mode](https://scrapfly.io/docs/cloud-browser-api/debug-mode). |
| `block_images` | No | `false` | Stub image requests with a transparent 1x1 pixel. Reduces bandwidth while remaining invisible to anti-bot systems. |
| `block_styles` | No | `false` | Stub stylesheet requests with an empty CSS response. |
| `block_fonts` | No | `false` | Stub font requests with an empty response. |
| `block_media` | No | `false` | Stub video and audio media requests. |
| `blacklist` | No | `false` | Stub known analytics, tracking, and telemetry URLs. |
| `cache` | No | `false` | Enable HTTP cache for static resources. Cached bandwidth billed at 1 credit/MB. |

  **Stubbing vs Blocking:** Resources are **stubbed**, not blocked — the browser receives a valid but empty response (e.g. a transparent 1x1 pixel for images). This saves bandwidth while remaining invisible to anti-bot systems that detect blocked requests. 

## Data Extraction

Extract data from a dynamic page:

 ```
const puppeteer = require('puppeteer-core');

const API_KEY = '';
const BROWSER_WS = `wss://browser.scrapfly.io?api_key=${API_KEY}&proxy_pool=public_datacenter_pool`;

async function scrapeProducts() {
    let browser = null;

    try {
        browser = await puppeteer.connect({
            browserWSEndpoint: BROWSER_WS,
        });

        const page = await browser.newPage();

        // Navigate to the page
        await page.goto('https://web-scraping.dev/products', {
            waitUntil: 'networkidle2',
        });

        // Extract product data
        const products = await page.evaluate(() => {
            return Array.from(document.querySelectorAll('.product')).map(el => ({
                title: el.querySelector('.product-title')?.textContent?.trim(),
                price: el.querySelector('.product-price')?.textContent?.trim(),
                url: el.querySelector('a')?.href,
            }));
        });

        console.log('Products:', products);
        return products;

    } finally {
        if (browser) {
            await browser.close();
        }
    }
}

scrapeProducts();
```

 

   

 

## Form Interaction

Fill forms and handle login flows:

 ```
const puppeteer = require('puppeteer-core');

const API_KEY = '';
const BROWSER_WS = `wss://browser.scrapfly.io?api_key=${API_KEY}&proxy_pool=public_datacenter_pool`;

async function login() {
    let browser = null;

    try {
        browser = await puppeteer.connect({
            browserWSEndpoint: BROWSER_WS,
        });

        const page = await browser.newPage();
        await page.goto('https://web-scraping.dev/login');

        // Fill the login form
        await page.locator('#username').fill('myuser');
        await page.locator('#password').fill('mypassword');

        // Click submit and wait for navigation
        await page.locator('#submit-button').click();
        await page.waitForNavigation();

        // Check if login was successful
        const isLoggedIn = await page.$('.user-profile') !== null;
        console.log('Login successful:', isLoggedIn);

    } finally {
        if (browser) {
            await browser.close();
        }
    }
}

login();
```

 

   

 

## Session Persistence

Maintain browser state across connections using the `session` parameter:

 ```
const puppeteer = require('puppeteer-core');

const API_KEY = '';
const SESSION_ID = 'my-persistent-session';

// First connection: Login and set cookies
async function firstConnection() {
    let browser = null;

    try {
        browser = await puppeteer.connect({
            browserWSEndpoint: `wss://browser.scrapfly.io?api_key=${API_KEY}&session=${SESSION_ID}`,
        });

        const page = await browser.newPage();
        await page.goto('https://web-scraping.dev/login');
        // ... perform login ...

    } finally {
        if (browser) {
            await browser.close();  // Session is preserved
        }
    }
}

// Second connection: Reuse the logged-in session
async function secondConnection() {
    let browser = null;

    try {
        browser = await puppeteer.connect({
            browserWSEndpoint: `wss://browser.scrapfly.io?api_key=${API_KEY}&session=${SESSION_ID}`,
        });

        const page = await browser.newPage();
        await page.goto('https://web-scraping.dev/dashboard');
        // Already logged in from previous session!

    } finally {
        if (browser) {
            await browser.close();
        }
    }
}
```

 

   

 

## Proxy Options

 | Proxy Pool | Use Case | Cost |
|---|---|---|
| `datacenter` | General scraping, high speed, lower cost | 1 credits/30s + 7 credits/MB |
| `residential` | Protected sites, geo-targeting, anti-bot bypass | 1 credits/30s + 52 credits/MB |

## Best Practices

- **Use `puppeteer-core`** - Don't download bundled Chrome
- **Handle disconnects** - Wrap connections in try/catch/finally blocks
- **Close browsers** - Always call `browser.close()` to stop billing (use finally block)
- **Use sessions wisely** - Reuse sessions for multi-step flows
- **Block unnecessary resources** - Use request interception to reduce bandwidth
- **Set timeouts** - Add reasonable timeouts to prevent hanging connections (e.g., `timeout: 30000` in `page.goto()`)
- **Error handling** - Catch specific errors like timeout or WebSocket connection failures for better debugging
 
## Related

- [Cloud Browser Getting Started](https://scrapfly.io/docs/cloud-browser-api/getting-started)
- [Billing &amp; Pricing](https://scrapfly.io/docs/cloud-browser-api/billing)
- [Playwright Integration](https://scrapfly.io/docs/cloud-browser-api/playwright)
- [Puppeteer Documentation](https://pptr.dev/)