Scrapfly lets you limit the pressure on an upstream website and keep your spend within budget. You can throttle the concurrency (number of simultaneous requests), the rate (maximum number of requests per time window), and the budget spent on the target (per hour, day, or month).
Some websites monitor traffic and apply a rate limit. Most of the time, you will need to throttle when you scrape content as an identified user (OAuth, JWT, token, and the like). In that case, changing IP will not affect the rate limit, since the limit is tied to your identity rather than your IP address.
The throttler exists to manage throttling at a distributed level. Throttling is a well-known problem, but building a production-grade system for it costs time and effort. It is integrated into the Scrapfly API, ready and easy to use at no additional cost.
Throttles are scoped by environment and project.
When multiple throttles match the host pattern, only the throttle with the highest priority is selected, so exactly one throttle is ever chosen.
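The selection rule can be illustrated with a minimal sketch. It is not Scrapfly's implementation: the `Throttle` class and `select_throttle` function are hypothetical names, host patterns are assumed to be glob-style, and a larger number is assumed to mean a higher priority.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import List, Optional


@dataclass(frozen=True)
class Throttle:
    """Hypothetical throttle definition used only for this illustration."""
    name: str
    host_pattern: str  # assumption: glob-style pattern such as "*.example.com"
    priority: int      # assumption: larger number means higher priority


def select_throttle(host: str, throttles: List[Throttle]) -> Optional[Throttle]:
    """Return the single highest-priority throttle whose pattern matches host."""
    matching = [t for t in throttles if fnmatch(host, t.host_pattern)]
    if not matching:
        return None  # no throttle applies to this host
    return max(matching, key=lambda t: t.priority)


throttles = [
    Throttle("whole-site", "*.example.com", priority=1),
    Throttle("api-only", "api.example.com", priority=10),
]
chosen = select_throttle("api.example.com", throttles)
print(chosen.name if chosen else "no throttle")
```

Both throttles match `api.example.com` here, but only `api-only` is applied because it has the higher priority.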
The rate limiter uses the sliding window algorithm. As soon as a request falls outside the window, its slot is released. This means you recover your quota smoothly over time instead of waiting for the whole period to elapse before recovering the full quota.
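To make the sliding window behaviour concrete, here is a minimal single-process sketch of the general technique (a sliding window log). It only illustrates the idea and makes no claim about how Scrapfly implements it in a distributed setting; the `SlidingWindowLimiter` class and its parameters are hypothetical.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests in any trailing `window_seconds` span."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock           # injectable so the behaviour can be tested
        self._timestamps = deque()   # start times of requests still in the window

    def try_acquire(self):
        now = self.clock()
        # Release every request that has slid out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window:
            self._timestamps.popleft()
        if len(self._timestamps) < self.limit:
            self._timestamps.append(now)
            return True
        return False  # quota exhausted until the oldest request expires


limiter = SlidingWindowLimiter(limit=2, window_seconds=10.0)
print(limiter.try_acquire(), limiter.try_acquire(), limiter.try_acquire())
```

With a limit of 2 per 10 seconds, a request made at t=0 frees its slot at t=10 and a request made at t=4 frees its slot at t=14, so quota comes back one slot at a time rather than all at once when a fixed period ends.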
Control the amount of API Credit spent on the target. You can define a budget for different periods: per hour, per day, or per month.
All related errors are listed below. You can see the full description and an example error response in the Errors section.