Webhook
Scrapfly's webhook feature is ideal for managing long-running scrape tasks asynchronously.
When a webhook is specified through the webhook_name parameter, Scrapfly calls your HTTP endpoint with the scrape response as soon as the scrape is done.
To start using webhooks, first create one in the webhook web interface.
Then pass the created webhook's name in the webhook_name scrape parameter (e.g. webhook_name=example) to enable webhook callbacks on a per-request basis.
The body sent to your endpoint is the same as a regular API scrape response, plus webhook information in the context part.
The webhook execution information can be found in the webhook tab of each scrape log page in the monitoring dashboard.
Webhook Queue Size
The webhook queue size is the maximum number of webhook scrapes that can be queued at once. Once a scrape completes and your application has been notified, it is removed from the queue and frees a slot. This allows you to schedule more scrapes than the concurrency limit of your subscription allows: the scheduler works through the queue and ensures that your concurrency limit is respected.
Queue size per plan:
- FREE: 500
- DISCOVERY: 500
- PRO: 2000
- STARTUP: 5000
- ENTERPRISE: 10000
Scope
Webhooks are scoped per Scrapfly project and environment. Make sure to create a webhook for each of your projects and environments (test/live).
Retry Policy
Webhook callbacks are retried if Scrapfly can't notify the endpoint specified in your webhook settings. Retries follow this schedule:
- 30 seconds
- 1 minute
- 5 minutes
- 30 minutes
- 1 hour
- 1 day
If Scrapfly fails to reach your application more than 100 times in a row, the system automatically disables the webhook and notifies you. You can re-enable it from the UI at any time.
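As a rough illustration of the schedule above, the sketch below computes how long after the first failed delivery each retry would take place. It assumes the listed delays are applied one after another in the order shown, which the policy does not state explicitly; `RETRY_DELAYS` and `cumulative_offsets` are illustrative names, not part of any Scrapfly API.

```python
from datetime import timedelta
from itertools import accumulate

# Delays copied from the retry policy list; sequential application is an assumption.
RETRY_DELAYS = [
    timedelta(seconds=30),
    timedelta(minutes=1),
    timedelta(minutes=5),
    timedelta(minutes=30),
    timedelta(hours=1),
    timedelta(days=1),
]


def cumulative_offsets(delays):
    """Return the elapsed time since the first failure at each retry."""
    return list(accumulate(delays))


for attempt, offset in enumerate(cumulative_offsets(RETRY_DELAYS), start=1):
    print(f"retry {attempt}: {offset} after the first failure")
```

Under that assumption, an endpoint that stays down for less than about an hour and a half would still be reached by one of the first five retries.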
Development
Useful tools for local development:
- https://webhook.site Collect and display webhook calls
- https://ngrok.com Expose your local application to the internet through a secure tunnel
Security
A secret can be set for security when creating webhooks using the web dashboard. This secret will be included by Scrapfly with each webhook callback through the X-Scrapfly-Webhook-Secret header.
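On the receiving side, the secret can be checked before the callback body is trusted. The following is a minimal sketch, assuming the header carries the secret verbatim as described above; `is_authentic` is a hypothetical helper name, and `headers` is assumed to be a plain mapping of header names to values as most web frameworks provide.

```python
import hmac


def is_authentic(headers, expected_secret):
    """Return True if the callback carries the expected webhook secret."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    received = lowered.get("x-scrapfly-webhook-secret", "")
    # Constant-time comparison avoids leaking the secret through timing.
    return hmac.compare_digest(received.encode(), expected_secret.encode())
```

A handler would typically reject the request (for example with a 401 response) when this returns False. Keep in mind that non-2xx responses are treated as failures and retried.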
Headers
The following headers are added to each webhook callback:
- X-Scrapfly-Webhook-Env: Environment in which the webhook was triggered
- X-Scrapfly-Webhook-Project: Name of the related project
- X-Scrapfly-Webhook-Secret: Secret used to authenticate the origin of the call
- X-Scrapfly-Webhook-Name: Name of the webhook
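These headers can be collected into a small structure for routing or logging. The sketch below is one possible way to do it; `WebhookMeta` and `parse_webhook_headers` are illustrative names, and the header values in the usage lines are made up.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class WebhookMeta:
    env: Optional[str]
    project: Optional[str]
    name: Optional[str]


def parse_webhook_headers(headers: Mapping[str, str]) -> WebhookMeta:
    """Extract the Scrapfly webhook metadata headers from a callback."""
    # Header names are case-insensitive, so look them up in lowercase.
    lowered = {key.lower(): value for key, value in headers.items()}
    return WebhookMeta(
        env=lowered.get("x-scrapfly-webhook-env"),
        project=lowered.get("x-scrapfly-webhook-project"),
        name=lowered.get("x-scrapfly-webhook-name"),
    )


# Usage with made-up header values:
meta = parse_webhook_headers({
    "X-Scrapfly-Webhook-Env": "live",
    "X-Scrapfly-Webhook-Project": "default",
    "X-Scrapfly-Webhook-Name": "example",
})
print(meta)
```

The secret header is left out of the structure on purpose so that it does not end up in logs.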
Usage
The examples below assume you have a webhook named example registered via the web dashboard.
To enable webhook callbacks, all you need to do is specify the webhook_name parameter in your scrape requests.
Scrapfly will then immediately return a promise response and call your webhook endpoint as soon as the scrape is done.
Note that your webhook endpoint has to respond with a 2xx status code for the callback to be considered a success.
3xx redirect responses are followed, while 4xx and 5xx response codes are considered failures and are retried as per the retry policy.
import requests

# Schedule the scrape; Scrapfly calls the "example" webhook once it is done.
params = {"webhook_name": "example", "key": "__API_KEY__", "url": "https://httpbin.dev/html"}
response = requests.get("https://api.scrapfly.io/scrape", params=params)
data = response.json()  # response returned immediately by the API
print(data)
print(data['result'])
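For the receiving side, here is a minimal sketch of an endpoint that acknowledges callbacks with a 2xx status, using only the Python standard library. It assumes the callback arrives as a POST request with a JSON body, which this page does not state explicitly; `WebhookHandler`, `make_server`, and the in-memory `received` list are illustrative names.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # parsed callback bodies, kept in memory for this demo only


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Assumption: the callback is a POST request carrying a JSON body.
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length)
        try:
            received.append(json.loads(raw))
        except ValueError:
            # A non-2xx answer counts as a failure and triggers a retry.
            self.send_response(400)
            self.end_headers()
            return
        # Any 2xx status marks the callback as delivered.
        self.send_response(204)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def make_server(port=8000):
    """Build a local HTTP server bound to the given port."""
    return HTTPServer(("127.0.0.1", port), WebhookHandler)
```

Calling `make_server().serve_forever()` starts the receiver on port 8000. During development it can be exposed through a tunnel such as ngrok so that Scrapfly can reach it. Acknowledging quickly and doing any heavy processing afterwards keeps slow handlers from being counted as failures.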
Example Of Response
Related Errors
All related errors are listed below. You can find the full description and an example of each error response in the Errors section of the documentation.
- ERR::WEBHOOK::DISABLED - The given webhook is disabled; check your webhook configuration for the current project / env
- ERR::WEBHOOK::ENDPOINT_UNREACHABLE - We were not able to contact your endpoint
- ERR::WEBHOOK::MAX_CONCURRENCY_REACHED - You reached the maximum concurrency limit
- ERR::WEBHOOK::MAX_RETRY - Maximum number of retries exceeded for your webhook
- ERR::WEBHOOK::NOT_FOUND - Unable to find the given webhook for the current project / env
- ERR::WEBHOOK::QUEUE_FULL - You reached the limit of scheduled webhooks; you must wait until pending webhooks are processed
Pricing
No additional fee is applied for webhook usage.