Understanding Scrapfly Timeouts

Scrapfly's timeout configuration allows you to set a deadline for each scrape request. If a scrape doesn't complete within the defined timeout, it will be stopped and a Scrapfly error response will be returned.

Critical: Configure Your HTTP Client Timeout
For the best experience, configure your HTTP client with a timeout of at least 155 seconds.
If you set a custom Scrapfly timeout, set your client read timeout to that value plus 5 seconds of overhead.

Quick Reference

Common timeout configurations for different scenarios:

Scenario	Scrapfly Parameters	Your HTTP Client Timeout
Default (managed by Scrapfly): best for most use cases	`retry=true` (default)	155s
Simple HTML scraping: no JavaScript, no ASP	`retry=false` `timeout=15000`	20s (15s + 5s overhead)
JavaScript rendering: browser-based scraping	`retry=false` `timeout=30000`	35s (30s + 5s overhead)
Anti-Scraping Protection (ASP): bypassing bot protection	`retry=false` `timeout=60000`	65s (60s + 5s overhead)
Complex JavaScript scenarios: multi-step browser automation	`retry=false` `timeout=90000`	95s (90s + 5s overhead)
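
The rule behind the table can be sketched as a small helper. This is an illustrative sketch, not part of any Scrapfly SDK: the function name `client_timeout_seconds` and its arguments are hypothetical, and falling back to 155 seconds when no custom timeout is given is an assumption chosen to stay on the safe side.

```python
DEFAULT_CLIENT_TIMEOUT_S = 155  # recommended client timeout when Scrapfly manages timeouts
OVERHEAD_S = 5                  # extra client-side margin on top of a custom Scrapfly timeout


def client_timeout_seconds(retry=True, timeout_ms=None):
    """Return a client timeout (seconds) for the given Scrapfly settings.

    Hypothetical helper: with retry=true, or when no custom timeout is set,
    it falls back to the 155s recommendation (an assumption for the latter case).
    """
    if retry or timeout_ms is None:
        return DEFAULT_CLIENT_TIMEOUT_S
    # Scrapfly's timeout parameter is in milliseconds; client timeouts are usually in seconds.
    return timeout_ms / 1000 + OVERHEAD_S
```

For example, `client_timeout_seconds(retry=False, timeout_ms=60000)` gives 65 seconds, matching the ASP row above.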

How Timeouts Work

Scrapfly scrape speeds depend on many factors including:

JavaScript rendering: Browser-based scraping takes longer than simple HTTP requests
JavaScript scenarios: Complex browser automation adds execution time
Anti-bot bypass: Solving CAPTCHAs and bypassing protection mechanisms requires additional time
Website performance: Slow or unresponsive websites naturally take longer to scrape

Typical scrape durations:

Simple scrapes: Less than 5 seconds
JavaScript rendering: 10-30 seconds
Complex scenarios or anti-bot bypass: 30-90 seconds

When Should I Configure Timeout?

Generally, it's best to trust Scrapfly's default timeout management (retry=true). However, custom timeouts are useful for:

Real-Time Scraping

When you need the fastest possible response and can accept failures, use lower timeouts to avoid waiting unnecessarily.

Slow Websites

For websites with heavy JavaScript or slow response times, increase the timeout to allow more time for completion.

Complex Automation

JavaScript scenarios with multiple steps (clicking, scrolling, form filling) require longer timeouts to complete all actions.

Anti-Bot Bypass

When using ASP with retry=false, increase the timeout to at least 60 seconds to allow time for the protection bypass.

Important: Custom timeout configuration requires retry=false. With retry=true (default), Scrapfly automatically manages timeouts for optimal results.

Timeout Requirements & Limits

Timeout requirements vary based on enabled features:

Configuration	Default Timeout	Minimum Allowed	Maximum Allowed
`asp=false, render_js=false`	15s	15s	30s
`asp=false, render_js=true` (no scenario)	30s	30s	60s
`asp=false, render_js=true` (with scenario)	30s	30s	90s
`asp=true`	30s	30s	150s

ASP + retry=false Recommendation: When using asp=true with retry=false, the default 30s timeout may not be sufficient. We recommend a minimum of 60 seconds to allow adequate time for anti-bot protection bypass.
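
As a rough illustration, the limits above can be checked on the client before a request is sent. The sketch below only encodes the values from the table; the function names and the `js` and `scenario` arguments (JavaScript rendering on or off, and whether a JavaScript scenario is used) are hypothetical, and the API itself remains the authority on what it accepts.

```python
def timeout_limits(asp=False, js=False, scenario=False):
    """Return (default, minimum, maximum) in seconds for a feature combination."""
    if asp:
        return (30, 30, 150)
    if not js:
        return (15, 15, 30)
    # JavaScript rendering: a scenario raises the maximum from 60s to 90s.
    return (30, 30, 90) if scenario else (30, 30, 60)


def validate_timeout(timeout_ms=None, asp=False, js=False, scenario=False):
    """Return the timeout in milliseconds, or raise ValueError if out of range."""
    default_s, min_s, max_s = timeout_limits(asp, js, scenario)
    if timeout_ms is None:
        return default_s * 1000
    if not min_s * 1000 <= timeout_ms <= max_s * 1000:
        raise ValueError(
            f"timeout {timeout_ms}ms is outside the allowed range "
            f"{min_s * 1000}-{max_s * 1000}ms for this configuration"
        )
    return timeout_ms
```

Checking locally like this avoids spending a round trip on a request that the table says would be out of range.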

Timeout Flow Visualization

This diagram shows how Scrapfly determines timeout values based on your configuration. Blue dashed boxes indicate configurable timeouts, while red boxes indicate timeouts managed automatically by Scrapfly.

Understanding the Diagram

retry=true (Right Path): Scrapfly automatically manages timeouts and retries. Your client timeout should be 155 seconds.
retry=false (Left Path): You control the timeout explicitly. Set your client timeout to the Scrapfly timeout plus 5 seconds of overhead.
Blue Dashed Boxes: Timeouts you can customize with the timeout parameter.
Red Boxes: Fixed timeouts managed by Scrapfly (with retry=true).

Usage Examples

To configure a custom scrape timeout, use retry=false and timeout=<milliseconds> query parameters.

Example: 20 Second Timeout

curl -G \
--request "GET" \
--url "https://api.scrapfly.io/scrape" \
--data-urlencode "retry=false" \
--data-urlencode "timeout=20000" \
--data-urlencode "key=__API_KEY__" \
--data-urlencode "url=https://httpbin.dev/delay/5"

https://api.scrapfly.io/scrape?retry=false&timeout=20000&key=__API_KEY__&url=https%3A%2F%2Fhttpbin.dev%2Fdelay%2F5

Remember: Your HTTP client timeout should be 25 seconds (20s + 5s overhead) for this example.
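
The same request URL can be assembled in Python with the standard library. This is a minimal sketch: `build_scrape_url` is a hypothetical helper name, and `__API_KEY__` is the same placeholder used in the curl example.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://api.scrapfly.io/scrape"


def build_scrape_url(api_key, target_url, timeout_ms):
    """Build a scrape URL with a custom timeout (requires retry=false)."""
    params = {
        "retry": "false",            # custom timeouts require retry=false
        "timeout": str(timeout_ms),  # milliseconds
        "key": api_key,
        "url": target_url,           # percent-encoded by urlencode
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


url = build_scrape_url("__API_KEY__", "https://httpbin.dev/delay/5", 20000)
print(url)
```

The printed URL has the target address percent-encoded, just as `--data-urlencode` does in the curl command.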

Client Configuration Examples

Python client with 95s timeout
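
A minimal sketch of such a client, using only the Python standard library rather than an official SDK. The `scrape` function and its `extra_params` argument are hypothetical. Note that a 90 second Scrapfly timeout is only within the allowed range when the request enables features that permit it (see Timeout Requirements & Limits), so those parameters would be passed through `extra_params`.

```python
import urllib.request
from urllib.parse import urlencode

SCRAPFLY_TIMEOUT_MS = 90_000                        # Scrapfly-side deadline, in milliseconds
CLIENT_TIMEOUT_S = SCRAPFLY_TIMEOUT_MS / 1000 + 5   # 95 seconds: 90s + 5s overhead


def scrape(api_key, target_url, extra_params=None):
    """Send one scrape request and return the raw response body as bytes."""
    params = {
        "retry": "false",  # required for a custom timeout
        "timeout": str(SCRAPFLY_TIMEOUT_MS),
        "key": api_key,
        "url": target_url,
    }
    params.update(extra_params or {})  # feature flags that permit a 90s timeout
    request_url = f"https://api.scrapfly.io/scrape?{urlencode(params)}"
    # urlopen's timeout applies to blocking socket operations (connect, read),
    # so the client keeps waiting slightly longer than Scrapfly's own deadline.
    with urllib.request.urlopen(request_url, timeout=CLIENT_TIMEOUT_S) as response:
        return response.read()
```

`urlopen` raises `urllib.error.HTTPError` for non-success status codes, so callers would typically wrap `scrape(...)` in a `try`/`except` to inspect error responses.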

Node.js client with 95s timeout

PHP client with 95s timeout

Frequently Asked Questions

Configuration:

Scrapfly: retry=false&timeout=90000
Your HTTP client: 95 seconds (90s + 5s overhead)

This ensures your JavaScript scenario has the full 90 seconds to complete, and your client won't disconnect prematurely.

Configuration:

Scrapfly: retry=false&timeout=15000
Your HTTP client: 20 seconds (15s + 5s overhead)

Note: This only works when asp=false and render_js=false. 15 seconds is the minimum allowed timeout for simple HTTP scraping.

Yes. Custom timeout configuration requires retry=false. When retry=true (default), Scrapfly automatically manages timeouts and retries for optimal reliability.

Use retry=true when:

You want maximum reliability and don't mind longer wait times
You're scraping difficult targets with anti-bot protection
You want Scrapfly to handle retries automatically

Use retry=false when:

You need precise control over timeout durations
You're implementing your own retry logic
You need the fastest possible response (fail fast)
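
If you take the retry=false route and implement your own retry logic, a client-side loop might look like the sketch below. Everything here is illustrative: `scrape_with_retries`, its arguments, and the exponential backoff policy are assumptions, not Scrapfly behavior, and `scrape_once` stands for whatever function performs a single request in your code.

```python
import time


def scrape_with_retries(scrape_once, attempts=3, backoff_s=1.0):
    """Call scrape_once() up to `attempts` times, backing off between failures."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return scrape_once()
        except Exception as exc:  # broad on purpose in this sketch; narrow it in real code
            last_error = exc
            if attempt < attempts - 1:
                # Exponential backoff: backoff_s, 2 * backoff_s, 4 * backoff_s, ...
                time.sleep(backoff_s * 2 ** attempt)
    raise last_error
```

Keep in mind that each attempt can take up to the configured timeout plus overhead, so the worst-case total wait grows with the number of attempts.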

The +5s overhead accounts for:

Network latency: Time for request/response transmission
Processing overhead: Time for Scrapfly to process and package the response
Connection establishment: Initial connection setup time

Without this overhead, your client might disconnect before receiving Scrapfly's response, even if the scrape completed successfully within the timeout.

When a scrape exceeds the configured timeout:

The scrape operation is immediately stopped
A Scrapfly error response is returned
You'll receive one of the timeout-related error codes (see below)
No partial data is returned

Check the Related Errors section for specific timeout error codes and their meanings.

Related Errors

When a timeout occurs, you may encounter one of the following error codes. Click on each error for detailed information and troubleshooting steps.