Crawler API Errors

The Crawler API returns standard HTTP status codes and detailed error information to help you troubleshoot issues. This page lists error codes specific to crawler operations and inherited errors from the Web Scraping API.

Crawler-Specific Errors

The Crawler API has specific error codes that are unique to crawler operations:

The given crawler UUID is already scheduled

Crawler configuration error

Intelligent Error Handling

The Crawler automatically monitors and responds to errors during execution, protecting your crawl budget and preventing wasted API credits. Different error types trigger different automated responses.

Fatal Errors - Immediate Stop

These errors immediately stop the crawler to prevent unnecessary API credit consumption. When encountered, the crawler terminates gracefully and returns results for URLs already crawled.

Fatal error codes:

What happens when a fatal error occurs:

  1. Crawler stops immediately (no new URLs are crawled)
  2. URLs already crawled are saved with their results
  3. Crawler status transitions to completed or failed
  4. Error details are included in the crawler response (see the sketch below)
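
For example, after a crawler stops you can fetch its summary and inspect the recorded error. The sketch below assumes a Python client using the requests library, the api.scrapfly.io base URL, a key query parameter for authentication, and illustrative response field names (status, error); adjust to the actual response schema.

    import requests

    API_BASE = "https://api.scrapfly.io"   # assumed base URL
    API_KEY = "your-api-key"               # your API key
    CRAWLER_UUID = "your-crawler-uuid"     # UUID of the stopped crawler

    # Fetch the crawler summary once the run has terminated.
    resp = requests.get(f"{API_BASE}/crawl/{CRAWLER_UUID}", params={"key": API_KEY})
    resp.raise_for_status()
    summary = resp.json()

    # Field names "status" and "error" are illustrative, not guaranteed.
    if summary.get("status") == "failed":
        print("Crawler stopped on a fatal error:", summary.get("error"))
    else:
        print("Crawler finished with status:", summary.get("status"))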

Throttle Errors - Automatic Pause

These errors trigger an automatic 5-second pause before the crawler continues. This prevents overwhelming your account limits or proxy resources while allowing the crawl to complete successfully.

Throttle error codes:

What happens during throttling:

  1. Crawler pauses for 5 seconds
  2. Failed URL is added back to the queue for retry
  3. Crawler continues with next URLs after pause
  4. Process repeats if the throttle error occurs again (see the illustrative sketch below)
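
The pause-and-retry behavior described above is handled by the crawler itself; the following is a purely illustrative sketch of the logic (not the actual implementation), using a hypothetical scrape callable and an illustrative throttle error code.

    import time
    from collections import deque

    THROTTLE_ERRORS = {"ERR::THROTTLE::MAX_REQUEST_RATE_EXCEEDED"}  # illustrative subset

    def crawl(queue: deque, scrape) -> list:
        """Illustrative pause-and-retry loop mirroring the steps above."""
        results = []
        while queue:
            url = queue.popleft()
            outcome = scrape(url)          # hypothetical scrape callable returning a dict
            if outcome.get("error") in THROTTLE_ERRORS:
                time.sleep(5)              # automatic 5-second pause
                queue.append(url)          # failed URL goes back into the queue
                continue
            results.append(outcome)        # success: keep the result and move on
        return results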

High Failure Rate Protection

For certain error types (anti-scraping protection and internal errors), the crawler monitors the failure rate and automatically stops if it becomes too high. This prevents wasting credits on a crawl that's unlikely to succeed.

Monitored error codes:

Failure rate threshold:

  • Monitoring window: Last 10 scrape requests
  • Threshold: 70% failure rate (7 or more failures out of 10)
  • Action: Crawler stops immediately to prevent credit waste
  • Reason: Indicates a systematic issue (website blocking, ASP changes, API issues); see the sketch below
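
A minimal sketch of this check, assuming the thresholds above (last 10 requests, stop at a 70% failure rate); this illustrates the rule, it is not the crawler's actual code.

    from collections import deque

    WINDOW = 10        # monitoring window: last 10 scrape requests
    THRESHOLD = 0.7    # 70% failure rate (7 or more failures out of 10)

    class FailureRateMonitor:
        """Sliding-window failure-rate check mirroring the thresholds above."""

        def __init__(self) -> None:
            self.window = deque(maxlen=WINDOW)

        def record(self, failed: bool) -> bool:
            """Record a scrape outcome; return True if the crawler should stop."""
            self.window.append(failed)
            if len(self.window) < WINDOW:
                return False               # not enough data yet
            return sum(self.window) / WINDOW >= THRESHOLD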

How to handle high failure rate stops:

  1. Review error logs: Check which specific errors are occurring most frequently
  2. ASP errors: The target site may have updated their protection - contact support for assistance
  3. Adjust configuration: Try different asp settings, proxy pools, or rendering options
  4. Wait and retry: Some sites have temporary blocks that clear after a period
  5. Contact support: If issues persist, our team can help analyze and resolve ASP challenges

Error Statistics & Monitoring

When a crawler completes (successfully or due to errors), comprehensive error statistics are logged and available for analysis. This helps you understand what went wrong and how to improve future crawls.

Statistics tracked:

  • Total errors encountered
  • Breakdown by error code (e.g., 3x ERR::THROTTLE::MAX_REQUEST_RATE_EXCEEDED)
  • Fatal errors that stopped the crawler
  • Throttle events and pause counts
  • High failure rate trigger details

Accessing error details (see the example sketch after this list):

  1. Crawler summary: Use GET /crawl/{uuid} to view overall error statistics
  2. Failed URLs: Use GET /crawl/{uuid}/urls?status=failed to retrieve specific failed URLs with error codes
  3. Logs: Check your crawler logs for detailed error tracking information
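
As a sketch of steps 1 and 2, the example below fetches the crawler summary and the failed URLs, then tallies failures by error code. It assumes a Python client using requests, the api.scrapfly.io base URL, a key query parameter for authentication, and illustrative response field names (urls, error_code).

    from collections import Counter
    import requests

    API_BASE = "https://api.scrapfly.io"   # assumed base URL
    API_KEY = "your-api-key"
    CRAWLER_UUID = "your-crawler-uuid"

    # 1. Crawler summary with overall error statistics.
    summary = requests.get(
        f"{API_BASE}/crawl/{CRAWLER_UUID}",
        params={"key": API_KEY},           # auth parameter name is an assumption
    ).json()

    # 2. Failed URLs with their error codes.
    failed = requests.get(
        f"{API_BASE}/crawl/{CRAWLER_UUID}/urls",
        params={"key": API_KEY, "status": "failed"},
    ).json()

    # Tally failures by error code ("urls" / "error_code" field names are assumptions).
    breakdown = Counter(item.get("error_code") for item in failed.get("urls", []))
    for code, count in breakdown.most_common():
        print(f"{count}x {code}")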

Inherited Web Scraping API Errors

Since the Crawler API makes individual scraping requests for each page crawled, it can return any error from the Web Scraping API. Each page crawled follows the same error handling as a single scrape request.

Common inherited errors by category:

Scraping Errors

The protocol is not supported; only http:// or https:// are supported

The website you target responded with an unexpected status code (>400)

Scrape Configuration Error

The cost budget has been reached; you must increase the budget to scrape this target

Country not available

The DNS of the targeted website is not resolving or not responding

The targeted domain is not allowed or is restricted

The DOM Selector is invalid

The requested DOM selector is invisible (mostly issued when an element is targeted for a screenshot)

The requested DOM selector was not found in the rendered content within 15s

The driver used to perform the scrape can crash for many reasons

The driver does not have enough resources to render the page correctly

Driver timeout - No response received

Response format conversion failed, unsupported input content type

The JavaScript to execute failed; please read the associated message to figure out the problem

A network error happened between the Scrapfly server and the remote server

The upstream website's server closed the connection unexpectedly

No browser available in the pool

This is a generic error for when a timeout occurs. It happens when an internal operation takes too much time

Platform not available

The limit set for the current project has been reached

You reached your plan's scrape quota for the month. You can upgrade your plan if you want to increase the quota

Submitted scenario would require more than 30s to complete

Javascript Scenario Failed

Javascript Scenario Timeout

The upstream website has an SSL error

You reached the concurrent scrape request limit of your current plan, or of your project if you set a concurrency limit at the project level

Unable to take screenshot

The website you target took too long to respond

The website you tried to scrape has a configuration issue or returned a malformed response

Proxy Errors

The desired proxy pool is not available for the given domain - mostly well-known protected domains which require at least residential networks

The provided proxy pool name does not exist

Country not available for given proxy pool

Proxies are saturated for the desired country; you can try other countries. They will come back as soon as possible

The proxy connection or the website was too slow and timed out

The proxy is unavailable - the domain (mainly gov websites) is restricted, or you are using the session feature and the proxy is unreachable at the moment

Throttle Errors

Your scrape request has been throttled. API Credit Budget reached. If it's not expected, please check your throttle configuration for the given project and env.

Your scrape request has been throttled. Too many concurrent access to the upstream. If it's not expected, please check your throttle configuration for the given project and env.

Your scrape request has been throttled. Too many requests during the 1m window. If it's not expected, please check your throttle configuration for the given project and env

Anti Scraping Protection (ASP) Errors

Something went wrong with the captcha. We will work to fix the problem as soon as possible

The budgeted time to solve the captcha has been reached

The ASP encountered an unexpected problem. We will fix it as soon as possible. Our team has been alerted

The ASP shield previously set has expired; you must retry.

  • Retryable: Yes
  • HTTP status code: 422

The feature requested is not eligible while using the ASP for the given protection/target

The ASP shield failed to solve the challenge against the anti-scraping protection

The ASP took too long to solve or respond

Despite our efforts, we were unable to solve the captcha. It can happen sporadically; please retry

The response given by the upstream after challenge resolution was not expected. Our team has been alerted

Webhook Errors

The given webhook is disabled; please check your webhook configuration for the current project / env

We were not able to contact your endpoint

You reached the maximum concurrency limit

Maximum retries exceeded on your webhook

Unable to find the given webhook for the current project / env

You reached the limit of scheduled webhooks - you must wait until pending webhooks are processed

Session Errors

Concurrent access to the session was attempted. If your spider runs on a distributed architecture, the same session name is currently being used by another scrape

For complete details on each inherited error, see the Web Scraping API Error Reference.

HTTP Status Codes

Status Code - Description
200 OK - Request successful
201 Created - Crawler job created successfully
400 Bad Request - Invalid parameters or configuration
401 Unauthorized - Invalid or missing API key
403 Forbidden - API key doesn't have permission for this operation
404 Not Found - Crawler job UUID not found
422 Request Failed - Request was valid but execution failed
429 Too Many Requests - Rate limit or concurrency limit exceeded
500 Server Error - Internal server error
504 Timeout - Request timed out

Error Response Format

All error responses include detailed information in a consistent format:

Error response headers:

  • X-Scrapfly-Error-Code - Machine-readable error code
  • X-Scrapfly-Error-Message - Human-readable error description
  • X-Scrapfly-Error-Retryable - Whether the operation can be retried (see the sketch below)
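
A minimal sketch of reading these headers from a failed response, assuming a Python client using requests and the api.scrapfly.io base URL; the exact value format of the retryable header ("true"/"false") is an assumption.

    import requests

    API_BASE = "https://api.scrapfly.io"   # assumed base URL
    API_KEY = "your-api-key"
    CRAWLER_UUID = "your-crawler-uuid"

    resp = requests.get(f"{API_BASE}/crawl/{CRAWLER_UUID}", params={"key": API_KEY})

    if not resp.ok:
        code = resp.headers.get("X-Scrapfly-Error-Code")
        message = resp.headers.get("X-Scrapfly-Error-Message")
        retryable = resp.headers.get("X-Scrapfly-Error-Retryable", "").lower() == "true"
        print(f"{resp.status_code} {code}: {message}")
        if retryable:
            print("The operation can be retried.")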

Summary