Understand How Timeout Works

Set your HTTP client's read timeout to 155s to avoid issues. Scrapfly manages the timeout according to the strategy used. If you specify a custom timeout value, add 5s to it for your read timeout, because screenshots, debug, and cache add some overhead.

Timeout configuration lets you set a deadline when you start a scrape. This guarantees the scrape will not run longer than the defined timeout: once the deadline is reached, Scrapfly stops and returns an error.

Time management is crucial in web scraping. Every step is budgeted and tracked so that issues are detected and recovered from as fast as possible, which provides the best reliability. Some scrapes are fast (<5s), others need more time when rendering JavaScript (~25s), and complex user scenarios can take even longer (~90s).

To customize the timeout, retry must be disabled (retry=false).
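The rules above (155s by default, scrape timeout plus 5s when customized, retry disabled for custom timeouts) can be sketched as two small Ruby helpers. The helper and constant names are hypothetical and not part of any Scrapfly SDK; only the `retry` and `timeout` query parameters come from this page.

```ruby
require "uri"

# Margin (seconds) this page recommends adding on top of a custom scrape timeout.
READ_TIMEOUT_MARGIN_S = 5
# Read timeout (seconds) this page recommends when no custom timeout is set.
DEFAULT_READ_TIMEOUT_S = 155

# Hypothetical helper: client read timeout in seconds.
# timeout_ms is the value sent as the `timeout` parameter (milliseconds), or nil.
def client_read_timeout(timeout_ms = nil)
  return DEFAULT_READ_TIMEOUT_S if timeout_ms.nil?

  (timeout_ms / 1000.0).ceil + READ_TIMEOUT_MARGIN_S
end

# Hypothetical helper: query parameters for a custom timeout.
# A custom timeout requires retry to be disabled, so retry=false is always set.
def timeout_params(timeout_ms)
  { "retry" => "false", "timeout" => timeout_ms.to_s }
end

puts client_read_timeout                          # 155
puts client_read_timeout(90_000)                  # 95
puts URI.encode_www_form(timeout_params(15_000))  # retry=false&timeout=15000
```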

When Should I Configure the Timeout?

Configure it if you are in one of the following cases:

  • I want a fast reply when something goes wrong, so I can retry or handle it myself
  • I scrape a slow target that sometimes times out
  • I run a JS scenario that requires a longer timeout
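For the first case, handling retries yourself, a minimal sketch might look like the following. It only covers client-side timeouts raised by Ruby's Net::HTTP (`Net::OpenTimeout`, `Net::ReadTimeout`); errors that Scrapfly returns in an HTTP response would need to be inspected separately. The helper name and the attempt limit are assumptions made for illustration.

```ruby
require "net/http"

# Hypothetical helper: runs the block up to max_attempts times and
# retries only when the HTTP client itself times out.
# The block receives the current attempt number (starting at 1).
def with_manual_retry(max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue Net::OpenTimeout, Net::ReadTimeout
    retry if attempts < max_attempts
    raise # give up and let the caller handle the timeout
  end
end

# Usage sketch: the block would normally perform the scrape request.
result = with_manual_retry(max_attempts: 3) do |attempt|
  raise Net::ReadTimeout if attempt < 2 # simulate one slow attempt
  "scrape finished on attempt #{attempt}"
end
puts result
```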
Always add 5s to your client read timeout when you customize the scrape timeout.
If you disable retry while using ASP, the default timeout is 30s. However, some targets are quite slow to pass, so we recommend increasing the timeout to at least 60s. Below 60s, there is a high chance that on a slow website or challenge our system cannot recover, rotate, and bypass again, which results in a blocked scrape on your end.
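One way to apply the 60s recommendation is to raise any shorter timeout to that floor whenever ASP is enabled. The sketch below is illustrative only: the helper name and the `asp:` keyword are hypothetical, and the floor is a client-side choice based on the recommendation above, not something the API is stated to enforce.

```ruby
# Recommended minimum (milliseconds) for ASP scrapes with retry disabled.
ASP_RECOMMENDED_MIN_MS = 60_000

# Hypothetical helper: returns the timeout to send, raised to the
# recommended floor when ASP is enabled, and left unchanged otherwise.
def effective_timeout_ms(requested_ms, asp:)
  asp ? [requested_ms, ASP_RECOMMENDED_MIN_MS].max : requested_ms
end

puts effective_timeout_ms(30_000, asp: true)   # 60000
puts effective_timeout_ms(90_000, asp: true)   # 90000
puts effective_timeout_ms(15_000, asp: false)  # 15000
```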

Usage Example

  • Question: I want to run a JavaScript scenario that requires 90s in the worst case
  • Answer: Specify retry=false&timeout=90000; your HTTP read timeout should be 95s
  • Question: I scrape a website without JavaScript and I want the lowest possible timeout
  • Answer: Set the minimum allowed (no ASP, no JS rendering), 15s: retry=false&timeout=15000; your HTTP read timeout should be 20s
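The first answer (a 90s scenario with a 95s read timeout) could be wired into Ruby's Net::HTTP roughly as follows. The `build_scrape_client` helper is hypothetical, the API key is a placeholder, and the parameters needed to actually run a JavaScript scenario are omitted; only the timeout handling is shown.

```ruby
require "uri"
require "net/http"

# Hypothetical helper: builds the request URI and a Net::HTTP client whose
# read timeout is the scrape timeout plus the 5s margin.
def build_scrape_client(api_key:, target_url:, timeout_ms:)
  query = URI.encode_www_form(
    "key" => api_key,
    "url" => target_url,
    "retry" => "false",            # required for a custom timeout
    "timeout" => timeout_ms.to_s   # milliseconds
  )
  uri = URI("https://api.scrapfly.io/scrape?#{query}")

  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = true
  http.read_timeout = (timeout_ms / 1000.0).ceil + 5 # seconds
  [uri, http]
end

uri, http = build_scrape_client(
  api_key: "__API_KEY__",
  target_url: "https://httpbin.org/anything",
  timeout_ms: 90_000
)
puts http.read_timeout # 95

# Sending the request (not executed here):
# response = http.request(Net::HTTP::Get.new(uri))
# puts response.read_body
```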

API Example

require "uri"
require "net/http"

url = URI("https://api.scrapfly.io/scrape?key=__API_KEY__&url=https%3A%2F%2Fhttpbin.org%2Fanything&retry=false&tags=player%2Cproject%3Adefault&timeout=15000")

https = Net::HTTP.new(url.host, url.port)
https.use_ssl = true
# The scrape timeout is 15s (timeout=15000), so the client read timeout is 15s + 5s.
https.read_timeout = 20

request = Net::HTTP::Get.new(url)

response = https.request(request)
puts response.read_body
The request above breaks down as:

host     = "api.scrapfly.io"
path     = "/scrape"

key      = "__API_KEY__"
url      = "https://httpbin.org/anything"
retry    = "false"
tags     = "player,project:default"
timeout  = "15000"

Related Errors