Frequently Asked Questions
How to send a POST request
We have dedicated documentation that explains it.
How to send cookies or specific headers
We have dedicated documentation that explains it.
How to know the API credits billed and the detailed breakdown
If you only need the total API credits billed, check the X-Scrapfly-Api-Cost response header. If you want the details, they are in our JSON response under response.context.cost, where you can find both the breakdown and the total.
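As a minimal sketch of reading both values, assuming the response headers are available as a mapping and the JSON body has already been parsed into a dict. The helper name billed_cost, the sample values, and the total and details keys inside the cost object are assumptions for illustration; check an actual response for the exact field names.

```python
def billed_cost(headers, body):
    """Return (total_from_header, cost_object_from_body)."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    total = int(lowered["x-scrapfly-api-cost"])
    # Assumption: `body` is the parsed JSON response, so that
    # response.context.cost maps to body["context"]["cost"].
    cost = body["context"]["cost"]
    return total, cost


# Hypothetical sample data, not taken from a real API call.
sample_headers = {"X-Scrapfly-Api-Cost": "6"}
sample_body = {"context": {"cost": {"total": 6, "details": []}}}

total, cost = billed_cost(sample_headers, sample_body)
print(total, cost)
```

The header is the quicker check when you only log totals; the JSON object is the one to inspect when a call costs more than expected.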
How to check the concurrent usage
If a scrape request has finished but the concurrent usage has not decreased, make sure your HTTP client is configured to time out after 155s rather than a 30s default.
Concurrency is the dimension that measures the number of requests made to our service at the same time. Each plan has its own concurrency quota. As soon as a scrape starts, your concurrency usage increases by 1, and when you get the response it decreases by 1.
Each API response has these headers:
- X-Scrapfly-Account-Concurrent-Usage: indicates the current number of requests in flight (your scrape is awaiting the response), global to the account
- X-Scrapfly-Account-Remaining-Concurrent-Usage: indicates the remaining concurrency, global to the account
- X-Scrapfly-Project-Concurrent-Usage: indicates the current number of requests in flight (your scrape is awaiting the response), scoped to the project that the API key belongs to
- X-Scrapfly-Project-Remaining-Concurrent-Usage: indicates the remaining concurrency, scoped to the project that the API key belongs to
Related Error: ERR::SCRAPE::TOO_MANY_CONCURRENT_REQUEST
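The four headers above can be read into a small snapshot before deciding whether to send another scrape. This is a sketch under two assumptions: the header values are plain integers, and the helper names (read_concurrency, can_send_more) are hypothetical.

```python
CONCURRENCY_HEADERS = {
    "account_usage": "X-Scrapfly-Account-Concurrent-Usage",
    "account_remaining": "X-Scrapfly-Account-Remaining-Concurrent-Usage",
    "project_usage": "X-Scrapfly-Project-Concurrent-Usage",
    "project_remaining": "X-Scrapfly-Project-Remaining-Concurrent-Usage",
}


def read_concurrency(headers):
    """Parse the four concurrency headers into a dict of ints."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    # Assumption: every value is a plain integer string.
    return {
        key: int(lowered[name.lower()])
        for key, name in CONCURRENCY_HEADERS.items()
    }


def can_send_more(headers):
    """True when both the account and the project still have headroom."""
    usage = read_concurrency(headers)
    return usage["account_remaining"] > 0 and usage["project_remaining"] > 0


# Hypothetical header values for illustration.
sample = {
    "X-Scrapfly-Account-Concurrent-Usage": "4",
    "X-Scrapfly-Account-Remaining-Concurrent-Usage": "1",
    "X-Scrapfly-Project-Concurrent-Usage": "4",
    "X-Scrapfly-Project-Remaining-Concurrent-Usage": "0",
}
print(read_concurrency(sample), can_send_more(sample))
```

Checking both scopes matters because a project can exhaust its own limit while the account as a whole still has capacity.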
How to prevent extra usage
In case of an upgrade or downgrade, any extra API Credits already billed are not retroactively recomputed against the new plan and are not refundable.
Each API response contains the X-Scrapfly-Remaining-Api-Credit header, which indicates the number of API Credits remaining on your account. If the value is 0, you are in extra usage. You can also get account information (quota, concurrency and so on) via our Account API.
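A minimal sketch of the header check described above, assuming the header value is a plain integer. The function name in_extra_usage is hypothetical.

```python
def in_extra_usage(headers):
    """Return True when the remaining API credit reported by the API is 0."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    # Assumption: the value is a plain integer string.
    remaining = int(lowered["x-scrapfly-remaining-api-credit"])
    return remaining == 0


# Hypothetical header values for illustration.
print(in_extra_usage({"X-Scrapfly-Remaining-Api-Credit": "1200"}))
print(in_extra_usage({"X-Scrapfly-Remaining-Api-Credit": "0"}))
```

A scraper could call this after each response and pause its queue once it returns True.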
Another way to prevent extra usage is to configure your project(s). You can define limits on multiple dimensions:
- Scrape API credit limit
- Concurrency Limit
- Allow/Disallow Extra Usage
- Budget Limit of Extra Usage
How to bypass a protected website / unblock my scrape
You can check out the ASP feature and the API parameter to use it.
How to get the scraped result directly
Set proxified_response=true as a URL parameter. The body and headers returned are then those from the upstream website. See the parameter documentation for details.
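As a sketch of how the parameter could be added to a request URL. The endpoint constant and the key and url parameter names are assumptions not stated in this FAQ, and build_scrape_url is a hypothetical helper; only proxified_response=true comes from the text above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: the endpoint and the `key` / `url` parameter names are
# illustrative; confirm them against the API documentation.
API_ENDPOINT = "https://api.scrapfly.io/scrape"


def build_scrape_url(api_key, target_url, proxified=True):
    """Build a scrape URL, optionally asking for the upstream response as-is."""
    params = {"key": api_key, "url": target_url}
    if proxified:
        params["proxified_response"] = "true"
    # urlencode percent-encodes the target URL so its own query string
    # does not leak into the API query string.
    return f"{API_ENDPOINT}?{urlencode(params)}"


request_url = build_scrape_url("YOUR_API_KEY", "https://example.com/page?id=1")
query = parse_qs(urlsplit(request_url).query)
print(request_url)
print(query["proxified_response"])
```

Encoding the target URL matters here: without it, the target's own ?id=1 would be read as an API parameter.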
How to get the page correctly rendered
If your page does not render what you expect, follow these guidelines to troubleshoot:
- In your browser, press CTRL+U to view the page source without rendered JavaScript. If the expected content is not there, you need JavaScript rendering
- If the element takes time to render, try setting rendering_wait or wait_for_selector
- Make sure the page you scrape is not blocked; check out the ASP feature to unblock it
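The first troubleshooting step can also be automated. This rough sketch checks whether a piece of expected text is present in the raw, unrendered HTML, assuming you already fetched the page source without a browser. The function name and the sample markup are hypothetical.

```python
def appears_in_raw_html(raw_html, expected_text):
    """Return True when the expected text is present before any JavaScript runs."""
    return expected_text in raw_html


# Hypothetical page source: an empty container filled in later by a script.
raw_html = (
    "<html><body><div id='app'></div>"
    "<script src='app.js'></script></body></html>"
)

# If the text is missing from the raw source, the content is probably
# injected client-side and JavaScript rendering is likely needed.
needs_js_rendering = not appears_in_raw_html(raw_html, "Price: 19.99")
print(needs_js_rendering)
```

A plain substring test is only a heuristic: text may also be missing because the request was blocked, which is the case the last guideline covers.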
How to download an image or PDF
You can check the format of the content via response['result']['format'], which is either text or binary. If the content is in a binary format, it is base64 encoded.
Do not use browser rendering when you want to download media or images: most of the time, depending on the content type, the browser generates an HTML document and loads the file through a media HTML tag (img, video, audio).
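A minimal sketch of decoding a downloaded file, assuming the parsed JSON response is a dict. The result['content'] key holding the payload is an assumption for illustration; only result['format'] is named above.

```python
import base64


def extract_content(response):
    """Return bytes for binary results and str for text results."""
    result = response["result"]
    # Assumption: the payload is stored under result["content"].
    content = result["content"]
    if result["format"] == "binary":
        # Binary payloads are base64 encoded, so decode them to raw bytes.
        return base64.b64decode(content)
    return content


# Hypothetical responses for illustration.
binary_response = {
    "result": {
        "format": "binary",
        "content": base64.b64encode(b"%PDF-1.7").decode("ascii"),
    }
}
text_response = {"result": {"format": "text", "content": "<html></html>"}}

pdf_bytes = extract_content(binary_response)
html_text = extract_content(text_response)
print(pdf_bytes, html_text)
```

The decoded bytes can then be written to disk with open(path, "wb").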
Requests with ASP are too expensive
We know it's frustrating to scrape protected websites, but bypassing protection also has a cost: most of the time it requires residential proxies and a browser to pass the challenge. Protections evolve and the bypass needs to be kept up to date; we have a reverse engineering team dedicated to that.
I'm Getting Read Timeout Error
The API read timeout is 155s by default. You must configure your HTTP client with a read timeout of 155s. If you don't want this value and want to avoid the Read timeout error, you must set retry=false.
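As a sketch of the client-side setting, using only the Python standard library. The fetch helper is hypothetical and other HTTP clients expose the timeout differently; the point is simply to pass 155 seconds instead of relying on a shorter default.

```python
import urllib.request

# Matches the API read timeout stated above.
READ_TIMEOUT_SECONDS = 155


def fetch(api_url, timeout=READ_TIMEOUT_SECONDS):
    """Call the API with a socket timeout long enough for a full scrape."""
    # urlopen applies `timeout` to blocking socket operations such as
    # connecting and reading; it is not a cap on total request duration.
    with urllib.request.urlopen(api_url, timeout=timeout) as response:
        return response.status, response.read()
```

Nothing is sent until fetch is called with a complete API URL.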
How many API Credits does it cost
- By default (Datacenter without browser rendering): 1 API Credit
- With Browser Rendering: +5 API Credit
- With Residential Proxy Network: +25 API Credit
- Datacenter Proxy Network + Browser Rendering: 6 API Credit
- Residential Proxy Network + Browser Rendering: 30 API Credit
- Some domains can cost extra API Credits. You can contact support or try your target in the API Player
You can check out the dedicated pricing page.
The easiest way to project your cost and evaluate your budget is to test your calls in our UI API Player (no code required) and look at the cost. Each API call has its own logs with all the details of the API Credits billed, step by step.
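The credit figures listed above can be sketched as a small estimator. One reading reproduces both combined totals in the list (6 and 30): treat the proxy network as setting the base cost (1 for datacenter, 25 for residential) and browser rendering as adding 5 on top. That reading, the pool keys, and the function name are assumptions, and per-domain extras are not modelled.

```python
# Assumed base cost per proxy network, chosen so the totals match the list.
BASE_COST = {"datacenter": 1, "residential": 25}
BROWSER_RENDERING_COST = 5


def estimate_cost(proxy_pool="datacenter", browser=False):
    """Estimate the API Credits for one scrape, excluding per-domain extras."""
    cost = BASE_COST[proxy_pool]
    if browser:
        cost += BROWSER_RENDERING_COST
    return cost


for pool in ("datacenter", "residential"):
    for browser in (False, True):
        print(pool, browser, estimate_cost(pool, browser))
```

Treat the result as a lower bound and confirm it against the cost shown in the API Player logs.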
How to cancel my subscription
You can cancel your subscription from your dashboard: click on account settings on the right side of the top bar, then billing, and in the "plan" card on the right side there is a cancel button. You can also go straight to the link. After cancellation, your current subscription stays active until the renewal date and then downgrades to free.
How to control spending
- Project: Allow or disallow extra quota, limit the spending, limit the concurrency
- Throttler: Define an API Credit budget per target over different periods (hour, day, month)
- API: Use the cost_budget parameter to define the maximum budget. If a scrape has already been performed when the budget interrupts the configuration mutation, that scrape is billed regardless of the status code. Make sure to define a sufficient minimum budget for your target; if the budget is too low, you will never be able to pass and will pay for blocked results.