# Webhook

 Scrapfly's [webhook](https://scrapfly.io/docs/crawler-api/getting-started#webhook_name) feature is ideal for managing crawler jobs asynchronously. When a webhook is specified through the `webhook_name` parameter, Scrapfly notifies your HTTP endpoint about crawl events in real time, eliminating the need for polling.

 To start using webhooks, first create one in the [webhook web interface](https://scrapfly.io/dashboard/webhook).

 The webhook will be called for each event you subscribe to during the crawl lifecycle. For reconciliation, you will receive the `crawler_uuid` and `webhook_uuid` in the [response headers](#headers).

> **Webhook Queue Size** The webhook queue size indicates the maximum number of queued webhooks that can be scheduled. After the crawler event is processed and your application is notified, the queue size is reduced. This allows you to schedule additional crawler jobs beyond the concurrency limit of your subscription. The scheduler will handle this and ensure that your concurrency limit is met.
>
> | FREE ($0.00/mo) | DISCOVERY ($30.00/mo) | PRO ($100.00/mo) | STARTUP ($250.00/mo) | ENTERPRISE ($500.00/mo) |
> |---|---|---|---|---|
> | 0 | 500 | 2,000 | 5,000 | 10,000 |

 [See in Your Dashboard](https://scrapfly.io/dashboard/webhook)

## Scope

 Webhooks are scoped per Scrapfly [project](https://scrapfly.io/docs/project) and environment. Make sure to create a webhook for each of your projects and environments (test/live).

## Usage

> Webhooks can be used for multiple purposes. In the context of the Crawler API, to ensure you received a crawler event, you must check the header `X-Scrapfly-Webhook-Resource-Type` and verify the value is `crawler`.

 To enable webhook callbacks, specify the `webhook_name` parameter in your crawler requests and optionally provide a list of `webhook_events` you want to be notified about. Scrapfly will then call your webhook endpoint as crawl events occur.

 Your webhook endpoint must respond with a `2xx` status code for the delivery to be considered successful. `3xx` redirects are followed, while `4xx` and `5xx` responses are treated as failures and retried per the retry policy.

> The below examples assume you have a webhook named **my-crawler-webhook** registered. You can create webhooks via the [web dashboard](https://scrapfly.io/dashboard/webhook).
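
As a minimal sketch of the parameters described above, the helper below assembles a crawler request with webhook callbacks enabled. Only `webhook_name` and `webhook_events` come from this page; the helper name `build_crawl_params`, the `url` key for the seed URL, and passing the events as a list are assumptions made for illustration, so check the Crawler API specification for the exact request shape.

```python
# Event names listed in the "Webhook Events & Payloads" section of this page.
KNOWN_EVENTS = {
    "crawler_started", "crawler_url_visited", "crawler_url_failed",
    "crawler_url_skipped", "crawler_url_discovered",
    "crawler_finished", "crawler_stopped", "crawler_cancelled",
}


def build_crawl_params(seed_url, webhook_name, webhook_events=None):
    """Assemble crawler request parameters with webhook callbacks enabled.

    Hypothetical helper: the `url` key and the list form of `webhook_events`
    are assumptions, not confirmed by this page.
    """
    params = {"url": seed_url, "webhook_name": webhook_name}
    if webhook_events is not None:
        # Catch typos early instead of silently missing notifications.
        unknown = sorted(set(webhook_events) - KNOWN_EVENTS)
        if unknown:
            raise ValueError(f"unknown webhook events: {unknown}")
        params["webhook_events"] = list(webhook_events)
    return params


params = build_crawl_params(
    "https://web-scraping.dev/products",
    "my-crawler-webhook",
    ["crawler_finished", "crawler_stopped", "crawler_cancelled"],
)
print(params)
```

Leaving `webhook_events` out keeps the default subscription; listing events explicitly is what lets you opt in to (or out of) the high-frequency ones.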

## Webhook Events & Payloads

 The Crawler API supports multiple webhook events that notify you about different stages of the crawl lifecycle. Each event sends a JSON payload with the crawler state and event-specific data.

> **Default Subscription** If you don't specify `webhook_events`, you'll receive: `crawler_started`, `crawler_stopped`, `crawler_cancelled`, and `crawler_finished`.

### HTTP Headers

 Every webhook request includes these HTTP headers for easy routing and verification:

 | Header | Purpose | Example Value |
|---|---|---|
| `X-Scrapfly-Crawl-Event-Name` | **Fast routing** - Use this to route events without parsing JSON | `crawler_started` |
| `X-Scrapfly-Webhook-Resource-Type` | Resource type (always `crawler` for crawler webhooks) | `crawler` |
| `X-Scrapfly-Webhook-Job-Id` | Crawler UUID for tracking and reconciliation | `550e8400-e29b...` |
| `X-Scrapfly-Webhook-Signature` | HMAC-SHA256 signature for verification, uppercase | `A3F2B1C...` |
| `X-Scrapfly-Webhook-Signature-Lowercase` | Same signature, lowercase. Provided because some platforms and managed webhook services (Hookdeck, AWS Lambda function URLs, certain edge runtimes) normalise header values to lowercase, which would otherwise break a strict string equality check against the uppercase variant. Either header is acceptable, pick the one that matches your runtime. | `a3f2b1c...` |

> **Performance Tip** Route webhook events using the `X-Scrapfly-Crawl-Event-Name` header instead of parsing the JSON body. This is significantly faster for high-frequency events like `crawler_url_visited`.
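
Header-based routing can be sketched as follows. The function name `route_event` and the `handlers` mapping are hypothetical; only the two header names and the `crawler` resource type come from the table above. The lookup is done case-insensitively because HTTP header names are not case-sensitive.

```python
def route_event(headers, handlers):
    """Pick a handler from the webhook headers alone, without reading the body."""
    # Normalise header names so "X-Scrapfly-..." and "x-scrapfly-..." both match.
    normalised = {name.lower(): value for name, value in headers.items()}
    if normalised.get("x-scrapfly-webhook-resource-type") != "crawler":
        return None  # delivery belongs to another product, ignore it here
    event_name = normalised.get("x-scrapfly-crawl-event-name")
    return handlers.get(event_name)  # None when no handler is registered


# Hypothetical handlers, one per event the application cares about.
handlers = {
    "crawler_started": lambda body: "started",
    "crawler_finished": lambda body: "finished",
}

handler = route_event(
    {
        "X-Scrapfly-Webhook-Resource-Type": "crawler",
        "X-Scrapfly-Crawl-Event-Name": "crawler_finished",
    },
    handlers,
)
print(handler(b"{}"))
```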

### Event Types & Examples

 Each event is described below with a full JSON payload example. `crawler_url_visited` and `crawler_url_discovered` are high-frequency events.

 #####  crawler\_started

**When:** Crawler execution begins

**Use case:** Track when crawls start, log crawler UUID, initialize tracking systems

**Frequency:** Once per crawl

> **Key Fields:** `crawler_uuid`, `seed_url`, `links.status`

```json
{
    "event": "crawler_started",
    "payload": {
        "crawler_uuid": "60cf1121-9de4-43fc-a0c6-7dda1721a65b",
        "project": "default",
        "env": "LIVE",
        "seed_url": "https://web-scraping.dev/products",
        "action": "started",
        "state": {
            "duration": 1,
            "urls_visited": 0,
            "urls_extracted": 0,
            "urls_failed": 0,
            "urls_skipped": 0,
            "urls_to_crawl": 0,
            "api_credit_used": 0,
            "stop_reason": null,
            "start_time": 1762939798,
            "stop_time": 1762939799
        },
        "links": {
            "status": "https://api.scrapfly.io/crawl/60cf1121-9de4-43fc-a0c6-7dda1721a65b/status"
        }
    }
}

```

#####  crawler\_url\_visited

**When:** Each URL is successfully crawled

**Use case:** Real-time progress tracking, streaming results, monitoring performance

**Frequency:** High - Fires for every successfully crawled URL (can be thousands per crawl)

> **Performance Warning:** Your endpoint must handle high throughput. Use `X-Scrapfly-Crawl-Event-Name` header for fast routing without parsing JSON body.

```json
{
    "event": "crawler_url_visited",
    "payload": {
        "crawler_uuid": "60cf1121-9de4-43fc-a0c6-7dda1721a65b",
        "project": "default",
        "env": "LIVE",
        "url": "https://web-scraping.dev/products",
        "action": "visited",
        "state": {
            "duration": 1,
            "urls_visited": 0,
            "urls_extracted": 0,
            "urls_failed": 0,
            "urls_skipped": 0,
            "urls_to_crawl": 0,
            "api_credit_used": 1,
            "stop_reason": null,
            "start_time": 1762939798,
            "stop_time": 1762939799
        },
        "scrape": {
            "status_code": 200,
            "country": "de",
            "log_uuid": "01K9VPD22494F0ZEX7DGEZQ4ES",
            "log_url": "https://scrapfly.io/dashboard/monitoring/log/01K9VPD22494F0ZEX7DGEZQ4ES",
            "content": {
                "html": "[...]",
                "text": "[...]"
                ...
            }
        }
    }
}

```

#####  crawler\_url\_failed

**When:** A URL fails to crawl (network error, timeout, block, etc.)

**Use case:** Error monitoring, retry logic, debugging failed scrapes

**Frequency:** Per failed URL

> **Debugging Features:**
>
> - `error` - Error code for classification
> - `links.log` - Direct link to scrape log for debugging
> - `scrape_config` - Complete configuration to replay the scrape
> - `links.scrape` - Ready-to-use retry URL with same configuration

```json
{
    "event": "crawler_url_failed",
    "payload": {
        "state": {
            "duration": 3,
            "urls_visited": 0,
            "urls_extracted": 0,
            "urls_failed": 0,
            "urls_skipped": 0,
            "urls_to_crawl": 0,
            "api_credit_used": 0,
            "stop_reason": null,
            "start_time": 1762944028,
            "stop_time": 1762944031
        },
        "action": "failed",
        "crawler_uuid": "5caa5439-03a4-4c74-9a4c-0597e190dd72",
        "project": "default",
        "env": "LIVE",
        "url": "https://web-scraping.dev/products",
        "error": "ERR::SCRAPE::NETWORK_ERROR",
        "scrape_config": {
            "method": "GET",
            "url": "https://web-scraping.dev/products",
            "body": null,
            "project": "default",
            "env": "LIVE",
            "render_js": false,
            "rendering_timeout": 0,
            "asp": false,
            "proxy_pool": null,
            "country": "de",
            "headers": {},
            "format": "raw",
            "retry": true,
            "correlation_id": "5caa5439-03a4-4c74-9a4c-0597e190dd72",
            "tags": [
                "crawler"
            ],
            "wait_for_selector": null,
            "cache": false,
            "cache_ttl": 86400,
            "cache_clear": false,
            "geolocation": null,
            "screenshot_api_cost": 60,
            "screenshot_flags": null,
            "format_options": [],
            "auto_scroll": false,
            "js_scenario": null,
            "screenshots": {},
            "lang": null,
            "os": null,
            "js": null,
            "rendering_stage": "complete",
            "extraction_prompt": null,
            "extraction_model": null,
            "extraction_model_custom_schema": null,
            "extraction_template": null
        },
        "links": {
            "log": "https://api.scrapfly.io/crawl/5caa5439-03a4-4c74-9a4c-0597e190dd72/logs?url=https://web-scraping.dev/products",
            "scrape": "https://api.scrapfly.io/scrape?url=https%3A%2F%2Fweb-scraping.dev%2Fproducts&key=YOUR_KEY"
        }
    }
}

```
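
A minimal sketch of pulling the debugging fields out of a `crawler_url_failed` delivery. The function name `summarise_failure` and the output keys are hypothetical, and the sample body is trimmed to the fields used, with placeholder links.

```python
import json


def summarise_failure(raw_body):
    """Extract the fields useful for alerting and retrying from a failed-URL event."""
    event = json.loads(raw_body)
    if event.get("event") != "crawler_url_failed":
        raise ValueError("not a crawler_url_failed event")
    payload = event["payload"]
    links = payload.get("links", {})
    return {
        "url": payload["url"],
        "error": payload["error"],          # error code for classification
        "log_link": links.get("log"),       # scrape log for debugging
        "retry_link": links.get("scrape"),  # retry URL with the same configuration
    }


# Trimmed sample; the link values are placeholders, not real endpoints.
sample = json.dumps({
    "event": "crawler_url_failed",
    "payload": {
        "url": "https://web-scraping.dev/products",
        "error": "ERR::SCRAPE::NETWORK_ERROR",
        "links": {"log": "https://example.com/log", "scrape": "https://example.com/retry"},
    },
})
summary = summarise_failure(sample)
print(summary["error"])
```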

#####  crawler\_url\_skipped

**When:** URLs are skipped (already visited, filtered, depth limit, etc.)

**Use case:** Monitor filtering effectiveness, track duplicate discovery

**Frequency:** Per batch of skipped URLs

> **Key Fields:** `urls` contains a map of each skipped URL to its skip reason

```json
{
    "event": "crawler_url_skipped",
    "payload": {
        "state": {
            "duration": 2,
            "urls_visited": 1,
            "urls_extracted": 22,
            "urls_failed": 0,
            "urls_skipped": 21,
            "urls_to_crawl": 1,
            "api_credit_used": 3,
            "stop_reason": "page_limit",
            "start_time": 1762940028,
            "stop_time": 1762940030
        },
        "action": "skipped",
        "crawler_uuid": "b4867c50-318c-47cd-bfc9-bed67f24771a",
        "project": "default",
        "env": "LIVE",
        "urls": {
            "https://web-scraping.dev/product/2?variant=one": "page_limit",
            "https://web-scraping.dev/product/25": "page_limit",
            "https://web-scraping.dev/product/15": "page_limit",
            "https://web-scraping.dev/product/9": "page_limit",
            "https://web-scraping.dev/product/2?variant=six-pack": "page_limit"
        }
    }
}

```
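
Since `urls` maps each skipped URL to its reason, filtering effectiveness can be monitored by tallying the reasons. This is a small sketch; the helper name `count_skip_reasons` is hypothetical and the sample reuses values from the payload above.

```python
from collections import Counter


def count_skip_reasons(payload):
    """Tally how many URLs were skipped for each reason in one event payload."""
    return Counter(payload.get("urls", {}).values())


# Trimmed `payload` object from a crawler_url_skipped event.
payload = {
    "urls": {
        "https://web-scraping.dev/product/2?variant=one": "page_limit",
        "https://web-scraping.dev/product/25": "page_limit",
        "https://web-scraping.dev/product/15": "page_limit",
    }
}
reasons = count_skip_reasons(payload)
print(reasons.most_common())
```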

#####  crawler\_url\_discovered

**When:** New URLs are discovered from crawled pages

**Use case:** Track crawl expansion, monitor discovery patterns, sitemap building

**Frequency:** High - Fires for each batch of discovered URLs

> **Key Fields:** `origin` (where the links were found; the example below shows `navigation`), `discovered_urls` (list of new URLs)

```json
{
    "event": "crawler_url_discovered",
    "payload": {
        "state": {
            "duration": 3,
            "urls_visited": 0,
            "urls_extracted": 0,
            "urls_failed": 0,
            "urls_skipped": 0,
            "urls_to_crawl": 0,
            "api_credit_used": 1,
            "stop_reason": null,
            "start_time": 1762940138,
            "stop_time": 1762940141
        },
        "action": "url_discovery",
        "crawler_uuid": "92e97a67-a962-4dcd-9b3e-261e4d4cb6f5",
        "project": "default",
        "env": "LIVE",
        "origin": "navigation",
        "discovered_urls": [
            "https://web-scraping.dev/product/5",
            "https://web-scraping.dev/product/1",
            "https://web-scraping.dev/product/3",
            "https://web-scraping.dev/product/4",
            "https://web-scraping.dev/product/2"
        ]
    }
}

```

#####  crawler\_finished

**When:** Crawler completes successfully (at least one URL visited)

**Use case:** Trigger post-processing, download results, send completion notifications

**Frequency:** Once per successful crawl

> **Success Indicators:** `state.urls_visited` > 0 confirms at least one URL was crawled. Check `state.stop_reason` to understand why the crawler completed (e.g., `no_more_urls`, `page_limit`).

```json
{
    "event": "crawler_finished",
    "payload": {
        "crawler_uuid": "b4867c50-318c-47cd-bfc9-bed67f24771a",
        "project": "default",
        "env": "LIVE",
        "seed_url": "https://web-scraping.dev/products",
        "action": "finished",
        "state": {
            "duration": 6.11,
            "urls_visited": 5,
            "urls_extracted": 49,
            "urls_failed": 0,
            "urls_skipped": 44,
            "urls_to_crawl": 5,
            "api_credit_used": 5,
            "stop_reason": "page_limit",
            "start_time": 1762940028,
            "stop_time": 1762940034.1143808
        },
        "links": {
            "status": "https://api.scrapfly.io/crawl/b4867c50-318c-47cd-bfc9-bed67f24771a/status"
        }
    }
}

```

#####  crawler\_stopped

**When:** Crawler stops due to failure (seed URL failed, errors, no URLs visited)

**Use case:** Error alerting, failure logging, retry automation

**Frequency:** Once per failed crawl

> **Failure Reasons:** Check `state.stop_reason` for the exact cause:
>
> - `seed_url_failed` - Initial URL couldn't be crawled
> - `crawler_error` - Internal crawler error occurred
> - `no_api_credit_left` - Account ran out of API credits mid-crawl
> - `max_api_credit` - Configured credit limit reached

```json
{
    "event": "crawler_stopped",
    "payload": {
        "crawler_uuid": "d1f6f97a-c48d-440f-86ca-b21b254ba12f",
        "project": "default",
        "env": "LIVE",
        "seed_url": "https://web-scraping.dev/products",
        "action": "stopped",
        "state": {
            "duration": 8.53,
            "urls_visited": 0,
            "urls_extracted": 1,
            "urls_failed": 1,
            "urls_skipped": 0,
            "urls_to_crawl": 1,
            "api_credit_used": 0,
            "stop_reason": "seed_url_failed",
            "start_time": 1762951426,
            "stop_time": 1762951434.5287035
        },
        "links": {
            "status": "https://api.scrapfly.io/crawl/d1f6f97a-c48d-440f-86ca-b21b254ba12f/status"
        }
    }
}

```

#####  crawler\_cancelled

**When:** User manually cancels the crawl via API or dashboard

**Use case:** Update tracking systems, release resources, log cancellations

**Frequency:** Once per user cancellation

> **Cancellation State:** `state.stop_reason` will be `user_cancelled`. Partial crawl results are available via the status endpoint and can be retrieved normally.

```json
{
    "event": "crawler_cancelled",
    "payload": {
        "crawler_uuid": "60cf1121-9de4-43fc-a0c6-7dda1721a65b",
        "project": "default",
        "env": "LIVE",
        "seed_url": "https://web-scraping.dev/products",
        "action": "cancelled",
        "state": {
            "duration": 45,
            "urls_visited": 23,
            "urls_extracted": 87,
            "urls_failed": 2,
            "urls_skipped": 5,
            "urls_to_crawl": 57,
            "api_credit_used": 230,
            "stop_reason": "user_cancelled",
            "start_time": 1762939798,
            "stop_time": 1762939843
        },
        "links": {
            "status": "https://api.scrapfly.io/crawl/60cf1121-9de4-43fc-a0c6-7dda1721a65b/status"
        }
    }
}

```
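
The three end-of-crawl events (`crawler_finished`, `crawler_stopped`, `crawler_cancelled`) share the same payload shape, so one function can summarise all of them. A minimal sketch: the name `describe_outcome` and the `success` / `failure` / `cancelled` labels are illustrative choices, not part of the API.

```python
def describe_outcome(event):
    """Map an end-of-crawl event to a (status, stop_reason, urls_visited) tuple."""
    statuses = {
        "crawler_finished": "success",
        "crawler_stopped": "failure",
        "crawler_cancelled": "cancelled",
    }
    name = event["event"]
    if name not in statuses:
        raise ValueError(f"{name} is not an end-of-crawl event")
    state = event["payload"]["state"]
    return statuses[name], state["stop_reason"], state["urls_visited"]


# Trimmed crawler_stopped payload using values from the example above.
stopped = {
    "event": "crawler_stopped",
    "payload": {"state": {"urls_visited": 0, "stop_reason": "seed_url_failed"}},
}
print(describe_outcome(stopped))
```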

## Development

 Useful tools for local webhook development:

- <https://webhook.site> - Collect and display webhook notifications
- <https://ngrok.com> - Expose your local application through a secured tunnel to the internet
- <https://console.hookdeck.com> - Inspect, replay, and forward webhooks to your local application (combines the previous two)

## Security

 Webhooks are signed using HMAC (Hash-based Message Authentication Code) with the SHA-256 algorithm to ensure the integrity of the webhook content and verify its authenticity. This mechanism helps prevent tampering and ensures that webhook payloads are from trusted sources.

#### HMAC Overview

 HMAC is a cryptographic technique that combines a secret key with a hash function (in this case, SHA-256) to produce a fixed-size hash value known as the HMAC digest. This digest is unique to both the original message and the secret key, providing a secure way to verify the integrity and authenticity of the message.

#### Signature in HTTP Header

 When Scrapfly sends a webhook notification, it includes an HMAC signature in the `X-Scrapfly-Webhook-Signature` HTTP header. This signature is generated by applying the HMAC-SHA256 algorithm to the entire request body using your webhook's secret key (configured in the webhook settings).

#### Verification Example

 To verify the authenticity of a webhook notification, compute the HMAC-SHA256 signature of the request body using your secret key and compare it with the signature provided in the `X-Scrapfly-Webhook-Signature` header:

 ```python
import hmac
import hashlib

secret_key = b"YOUR-WEBHOOK-SIGNING-SECRET"  # copy as-is from the dashboard
webhook_payload = b'{"event": "crawler_finished", ...}'
received_signature = "..."  # X-Scrapfly-Webhook-Signature header (uppercase hex)

computed = hmac.new(secret_key, webhook_payload, hashlib.sha256).hexdigest().upper()
if hmac.compare_digest(computed, received_signature):
    print("Signature OK")
else:
    print("Signature mismatch, reject the request")
```

```typescript
// The TypeScript SDK ships a `verifySignature` helper:
import { verifySignature } from 'scrapfly-sdk';

const secret = 'YOUR-WEBHOOK-SIGNING-SECRET'; // copy as-is from the dashboard
const body: Buffer = req.rawBody; // the raw request body bytes
const sig = req.header('X-Scrapfly-Webhook-Signature') ?? '';

if (verifySignature(body, sig, secret)) {
    console.log('Signature OK');
} else {
    console.log('Signature mismatch, reject the request');
}
```

 ```go
import (
    "crypto/hmac"
    "crypto/sha256"
    "encoding/hex"
    "strings"
)

// signingSecret is the raw string copied from the webhook dashboard.
func verify(body []byte, signatureHex, signingSecret string) bool {
    mac := hmac.New(sha256.New, []byte(signingSecret))
    mac.Write(body)
    expected := strings.ToUpper(hex.EncodeToString(mac.Sum(nil)))
    return hmac.Equal([]byte(expected), []byte(strings.ToUpper(signatureHex)))
}
```

> **Security Best Practices**
>
> - Always verify the HMAC signature before processing webhook payloads
> - Keep your webhook secret key confidential and rotate it periodically
> - Use HTTPS endpoints for webhook URLs to encrypt data in transit
> - Implement rate limiting on your webhook endpoint to handle high-frequency events

## Best Practices

#### Subscribe only to the events you use

 `crawler_url_visited` fires once per URL and `crawler_url_discovered` once per batch of discovered URLs, which can add up to thousands of deliveries on a large crawl. Pass an explicit `webhook_events` list and drop the events you don't consume.

#### Always verify the HMAC signature

 Compute the digest over the raw request body bytes (don't parse and re-serialize the JSON, since that changes the byte sequence) and compare it with [`X-Scrapfly-Webhook-Signature`](#security) using a constant-time comparison. The same digest is also exposed in lowercase as `X-Scrapfly-Webhook-Signature-Lowercase` for runtimes that lowercase header values.

#### Filter by resource type

 A single webhook URL can receive deliveries from multiple Scrapfly products. Check `X-Scrapfly-Webhook-Resource-Type` and only process when the value is `crawler`.

#### Route on the event-name header

 Use `X-Scrapfly-Crawl-Event-Name` to dispatch high-frequency events. Don't parse the body just to read the event name.

#### Acknowledge fast, process asynchronously

 Return `2xx` after verifying the signature and persisting the raw payload (queue, log, inbox table). Run database writes, downstream calls, and file downloads on a worker. Slow handlers cause retries.
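
The acknowledge-then-process pattern can be sketched with the standard library alone. Everything here is illustrative: `handle_delivery` stands in for your HTTP handler, the in-memory `queue.Queue` stands in for a durable queue or inbox table, and answering `401` on a bad signature is one possible choice (per the retry policy, a `4xx` response counts as a failed delivery).

```python
import queue
import threading

inbox = queue.Queue()  # in-memory stand-in for a durable queue or inbox table
processed = []


def handle_delivery(raw_body, signature_ok):
    """Return the HTTP status to send back; heavy work is left to the worker."""
    if not signature_ok:
        return 401
    inbox.put(raw_body)  # persist the raw payload, then acknowledge immediately
    return 200


def worker():
    """Drain the inbox; a `None` item is the shutdown signal."""
    while True:
        raw_body = inbox.get()
        try:
            if raw_body is None:
                return
            processed.append(raw_body)  # placeholder for slow database or download work
        finally:
            inbox.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

body = b'{"event": "crawler_finished"}'
status = handle_delivery(body, signature_ok=True)
inbox.join()     # wait until the worker has handled the payload
inbox.put(None)  # stop the worker
thread.join()
print(status, len(processed))
```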

> **Managed gateway option** A managed webhook gateway such as [Hookdeck](https://hookdeck.com) can sit between Scrapfly and your endpoint to handle buffering, retries, and replay. Useful when your consumer scales to zero or can't sustain peak event rates.

#### Make your handler idempotent

 Retries can deliver the same event twice. Use `X-Scrapfly-Webhook-Job-Id` plus the event name as the dedup key.
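
A minimal deduplication sketch. The job ID and event name are the key parts named above; because per-URL events such as `crawler_url_visited` repeat many times within one crawl, this sketch also adds a hash of the raw body to the key. That extra component is an assumption: it relies on a retry resending the same bytes, which this page does not state. The in-memory `seen` set is a stand-in for a persistent store.

```python
import hashlib

seen = set()  # stand-in for a persistent store with a unique constraint


def is_duplicate(job_id, event_name, raw_body):
    """Return True when this delivery was already recorded, else record it."""
    # Body hash is an added assumption: it separates distinct per-URL events
    # that share the same job ID and event name.
    key = (job_id, event_name, hashlib.sha256(raw_body).hexdigest())
    if key in seen:
        return True
    seen.add(key)
    return False


job = "60cf1121-9de4-43fc-a0c6-7dda1721a65b"
first = is_duplicate(job, "crawler_url_visited", b'{"url": "https://web-scraping.dev/product/1"}')
retry = is_duplicate(job, "crawler_url_visited", b'{"url": "https://web-scraping.dev/product/1"}')
other = is_duplicate(job, "crawler_url_visited", b'{"url": "https://web-scraping.dev/product/2"}')
print(first, retry, other)
```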

#### Use HTTPS

 Point the webhook at an HTTPS URL directly. Scrapfly follows `3xx` redirects, but a redirect from HTTPS to HTTP exposes the payload and signature in transit.

#### Watch the auto-disable threshold

 Failures (`4xx`, `5xx`, timeouts) are retried per the retry policy. After 100 consecutive failures the webhook is disabled and has to be re-enabled from the dashboard. Alert on success rate before you hit that.

## Troubleshooting

 Delivery history, payloads, and response codes are in the [webhook dashboard](https://scrapfly.io/dashboard/webhook). Check there first.

#### Not receiving any webhooks

- The webhook may be disabled. After 100 consecutive failures Scrapfly disables it automatically; re-enable it from the dashboard.
- The webhook must exist in the same [project and environment](#scope) (test/live) as the request that triggered it.
- If you set `webhook_events` explicitly, the event you expect must be in that list. Default is `crawler_started`, `crawler_stopped`, `crawler_cancelled`, `crawler_finished`.
- `localhost` and private IPs are not reachable. Use a forwarding tool from the [Development](#development) section.

#### Signature verification fails

- Compute over the raw body bytes. Decoding and re-encoding JSON produces a different byte sequence.
- The digest is uppercase hex. Normalise both sides, or use `X-Scrapfly-Webhook-Signature-Lowercase`.
- Use the signing secret exactly as shown in the dashboard. Don't trim or base64-decode it.
- Reverse proxies, CDNs, or WAFs in front of your endpoint can rewrite the body or strip headers. Check there.
- Use a constant-time comparison: `hmac.compare_digest` in Python, `crypto.timingSafeEqual` in Node, `hmac.Equal` in Go. Plain string equality leaks timing and is a frequent source of flaky mismatches.
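
The checklist above can be folded into one verification helper. This is a sketch under stated assumptions: the function name `verify_webhook` is hypothetical, `headers` is assumed to be a plain mapping of header names to values, and the signing secret is used as the raw UTF-8 string copied from the dashboard. It accepts either signature header and normalises case before a constant-time comparison.

```python
import hashlib
import hmac


def verify_webhook(raw_body, headers, signing_secret):
    """Check the HMAC-SHA256 signature of a webhook delivery.

    `raw_body` must be the untouched request bytes, not re-serialized JSON.
    """
    normalised = {name.lower(): value for name, value in headers.items()}
    received = (
        normalised.get("x-scrapfly-webhook-signature")
        or normalised.get("x-scrapfly-webhook-signature-lowercase")
    )
    if not received:
        return False
    expected = hmac.new(signing_secret.encode(), raw_body, hashlib.sha256).hexdigest()
    # Compare lowercase hex on both sides, as bytes, in constant time.
    return hmac.compare_digest(expected.encode(), received.strip().lower().encode())


secret = "YOUR-WEBHOOK-SIGNING-SECRET"
body = b'{"event": "crawler_finished"}'
# Simulate the uppercase hex signature described in the headers table.
signature = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest().upper()
print(verify_webhook(body, {"X-Scrapfly-Webhook-Signature": signature}, secret))
```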

#### Timeouts and retries

- Acknowledge with `2xx` after persisting the payload, then run heavy work on a worker.
- Cold starts (serverless, scale-to-zero) can blow the latency budget on the first delivery after idle. Keep a warm instance, or expect the first call to be retried.
- Return `2xx` only for actual success. `4xx` and `5xx` trigger retries per the retry policy.

> **Persistent retry storms** If you keep exhausting the retry budget or approaching the 100-failure auto-disable threshold, a managed gateway like [Hookdeck](https://hookdeck.com) between Scrapfly and your endpoint gives you durable queueing, a configurable retry policy, and dashboard replay. Scrapfly only sees the gateway's `2xx`.

#### Duplicate deliveries

 Retries can deliver the same event twice. Dedup on `X-Scrapfly-Webhook-Job-Id` plus the event name from `X-Scrapfly-Crawl-Event-Name`, and keep writes idempotent. Expected behaviour, not a bug.

#### Wrong event reaching the handler

 If a non-crawler payload (for example a Scrape API event) hits your crawler handler, your URL is serving multiple products. Branch on `X-Scrapfly-Webhook-Resource-Type` and only process `crawler`.

#### Events delayed during a long crawl

 Crawler webhooks queue up to your plan's webhook queue size. If your endpoint is slow or down long enough to fill the queue, deliveries for that crawl can be delayed. Either size your endpoint for the event rate or trim `webhook_events`.

## Next Steps

- Create your first webhook in the [webhook dashboard](https://scrapfly.io/dashboard/webhook)
- Learn about [crawler configuration options](https://scrapfly.io/docs/crawler-api/getting-started)
- Review [error handling](https://scrapfly.io/docs/crawler-api/errors) for webhook failures
- Read the [Best Practices](#best-practices) and [Troubleshooting](#troubleshooting) sections before going to production
