Anti Scraping Protection (ASP)

All features described below require ASP to be activated, either by passing asp=true to the API or by ticking ASP in the API Player.
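
As a minimal sketch, a request with ASP turned on could be assembled as follows. The endpoint URL and the `key` and `url` parameter names are assumptions made for illustration; only `asp=true` comes from the text above, so check your API reference before relying on them.

```python
from urllib.parse import urlencode

# Assumed endpoint, used here for illustration only.
API_ENDPOINT = "https://api.scrapfly.io/scrape"


def build_scrape_url(api_key: str, target_url: str, asp: bool = True) -> str:
    """Return a scrape request URL with the ASP flag set."""
    params = {
        "key": api_key,     # assumed parameter name
        "url": target_url,  # assumed parameter name
        "asp": "true" if asp else "false",
    }
    # urlencode percent-encodes the target URL so it survives as a query value.
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_scrape_url("YOUR_API_KEY", "https://example.com/"))
```

The function only builds the URL and does not send anything, so it can be checked without an account.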

Introduction

ASP is a technology we developed to bypass anti-scraping protection.

When ASP is triggered, it takes control of the request and decides how to resolve the protection: enabling or disabling JS rendering, enabling a session, solving a captcha, and so on.

Once the protection's challenge is resolved, ASP is triggered each time you revisit the site and injects what is needed to avoid a new challenge. This means the first request, which solves the challenge, can take a while (from 30s to 120s, depending on the challenge type). Once that first scrape is done, the following ones are as fast as regular requests.

You have nothing to do: ASP is fully managed by our services and is invoked automatically when a captcha or an anti-bot solution is detected on the website.

We do not want to play a cat-and-mouse game with anti-bot solutions, so we will not explicitly list the services we are able to handle. We pass many of them, from simple captchas to the most advanced anti-bot solutions on the market. We also develop specific solutions for popular websites. If you want to know more, you can ask us via the chat on the bottom left of the screen.

Each time ASP detects and resolves a challenge (captcha or anti-bot solution), a session is created, even if you have not enabled one. This ensures all cookies are applied correctly without you having to take care of it. It is invisible from your point of view.

You can find all ASP-related errors the API can return in the Error section.

Captcha

Scrapfly ASP automatically solves many captcha systems.

The following captcha systems are currently supported:

  • Google Recaptcha
  • Hcaptcha
  • Geetest
More captcha types will be supported in future versions.

Example of result

If ASP is enabled and a captcha is detected, you will find the captcha answer in the response as follows:

Captcha example result (JSON):
{
    ...
    "context": {
        "asp": {
            "identity": "baea954b95731c68ae6e45bd1e252eb4560cdc45",
            "session_identity": "3375a5d49895472125d73bd5c89032afd0a24909",
            "shield_name": "captcha",
            "success": true,
            "error": null,
            "challenge": {
                "done": true,
                "success": true,
                "state": "solved",
                "context": {
                    "site_key": "6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-",
                    "token": "03AGdBq25At22qXblbpHjyOPzhoaRbyAMbpfDk17DJcCegDDf4it8zP8X2_6AsHDebS3yAAXN9AtwmDfikBDbPZlFdHA1d1O08X6sLp3yN7a6-nnjQ1XxHerksQb-xJ41p8dfTnO1CE8xr6GsHL9Y0uTUmv_9xFcgnpi1zlkRYCYQlUDg7JJAcSJxFHCPnm0J_aKk3LAQOyP8Lgw_zeYRrY6bzjaYGh9_5Yi8F73Z7-qyQWijIWExJansQsArdHCR8e5HaMVe3Lfe07evFXRDw6_7NaIMtj8hyctRMD-GKvCZlwCC6vs-rJtQtuRnxJKahZmhHkXZvpvqo8tzcKIOSCDLrHUwdm8j5m11G0UbEIARqyeZukX19B9l8MnAhP05Qu3STrgD3R8Mkqn0RCLdPdzpKGTqyYm5GLuzuV9LLSsWVuS_aYcssGKSUbrZHOikfP_dC4erg6N6FsJ1Jt7d4UbiGIXgbAKzRQsFDvmYaEgwl1lVd3WhiMIEh6NCUabS3qnDOdPBy8ewaceb3Opw9brEngV0fan3A3Bn3K3vbperKk2wxmWqKfbk90ua-ErIt2ygopD8f5z8mdib6aJvp6SDStJrcmu2AnhdA3eL0NubUi4nsTGpjlmXotm1MXVHMzWKNHoh5W0XLrSc3nYIKbmmNS4XrHT_wan2Kudz9icCew2v7EZHTmjFlJmVg_RdeXDWkiUEf5KuQtrsrpv195OyyUv_ucmi36Bg04dF45e5-cgp8Svu-sU5q9LHxkcu4wzRM7bqOXqmPLaDx2feERrjPwx6zyYi-O0xAP5xzCwul-VMCR3es-pWfr6ovrd0YiXpZ2L-9KpXLlJD0Hq3y9kezRLSq_xhoRsU8IvptN5jI612G8LMuoorkZLnKsiZNEzmuUoNXPbvhaSlloDRAHPchuBnfzSUtEjTO3WBZ9Qto5xvkR-kYQS_V6KKeYdPUNV6Dvemg-XZX81M2gzt8O4pt2MlG2DyKzv_DEZlf65bzT9G2_sBGwLtUvIGC7cDWolBiQJxoEfOJXI68oXpLyZLx-HZX5BGSbvc29ShuccRLrzAx55N4x6-vjDSOybkz4ZsJURHPqdge6jDGSV-TIlDVcCnxDLdFBs3F52vnWl9WdpHhVGAwJQh590LmBJj4C2kny-1XRFOKkLflNLVZLUmiBH0rqXQUnT1ybNxBDHWXj2wlDp1UD4HVHDbnpXIvAG9JquXqhtelQmAysORyItGrJGy5HYOY_VM6ALWjn0behuy6yRD-KzWEVV_WE_mP3vJs_kaNyW9EswUKY0hirdbB2aZ8sQMGy2bSFekb5aKpcqOuvYWN6v-7BKN_6St_MsO0A-CMZW4-hpvK-AVvaydF3ljKLqMsd_hAyL2yhytpsRgVJVx6HfgZPvwkUQwz2FljeUhWCYLxzIJ9_Jvd2MEOWnj2neg69HG-gDkCJyKztOo16mo5Tew",
                    "url": "https://www.google.com/recaptcha/api2/demo",
                    "user_agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36"
                },
                "type": "recaptcha",
                "result": {
                    "solution": {
                        "token": "03AGdBq241V6gvHLko7LO8UdhqTPMADuN0-QqrD8fycF4F6h8ioJ9PMF4D2LTcNSeyukdLJ-qrBIrotjnwl3h9dDpp0GyzXpom_b2VQQYrBs-sjv9uUttqQ_wIG0P0yOygrmOc7iZRhnfdF4Nr1Rd_UXXuXFBBHzLxlcid4PsZu5rWF9R1LDsD1Fhyhy6F6gQKk49gpDa-5DOtZ_yNsJaLTjvR8sNPwCJCf_71OHKe6iir3QHAb256_EOE5QjWK36gpzjpMeLNNQz6eZ6sZ4y2wkyff3dGQZ7MuWP65OIoLQJmsdGSKNerGXNZe8z7YqWi_CT4h2x6nOpPbylCH11lyKjniCo6PSR3Ytwhqc2lLtQ5MdQcbdRMXk0SejTvihIloQr96YP7QVWyIRXvierh8Faxctw7j_OW9AblrBY3KxKsvpeK_n4zfIA0zp0HbDwYoKByUxjIt-qfNT0xmMXC4i4NWdziXm6fm36tuBgi-N35CrwwhbmKgXpB5mT4XyEtjG2wMkSS16Wg4nmCUFPo_F6obB3DfRoIjB7jQ-yswj-McwDIYuHpZgIBU1hYaIWyUTAjBn8PN9b8ZIx27ZVEI0L2e_EGvH174PNjAE5Lyn4OI6qaJLZJyXW0nyLdOxklkSkDgjJcT9stuRqGKlmhNohnTwEYoZ-gk3ijklyG1Jf6QK4CjyiXAirxLOtooDIQ0VpBHbMhbasNsej637UYXYYL9mpYeSNho9GNGPuanhqAfk3wAud5pmArfc0t8_qxkMcSyLc5ZICtEZWtTJGnEoKsniALvkNvIl9N-K2UOZg6JU7sPFyaypCxRLO2ybWPNoibXxJTuYdPH0KU8eGbBdyQUUk105XK2dfVBg4KEfvYJTm6NID-c-flixXGL697kyZOV_9Rn3RbGcGO1_AyhlOsmYJOg5r5FIvxn1hofmYJG99-rUYzFvj3p-3h09fNyEnTI2PGjwpXH9FezvZMCXMbvnEZ45zbxvJ7X8cMudooYADG7vViDQAF72WkGEIWR8lFdmAfgYzmRgzXwoBsRVeeiiVYczd9ImLTh4a7BCWg0TeLfO7ptCyXRokaSA8PhS1WhH_OR-ofgHng3zK3sh5MMvWIznm466mcYuyQxqbfzQHYxeyWlhjw3w7jFK_cpWxnZmXIon5waCCQwJWXDDFcqZDNspY5aJI1iANA7bgI2T1WUNfu2MaoICVgPO1Krn8j7cANz-f4na3S0pEYN02wo70wdY9Yof7P4fEz04OdvJsSq2ZtCs9LSCcm9gRNq8mVIIkdijHQyuL4AUAa4gq_ig_9tHu2BJ0J4EmAcJUy1jTWnwrZycSqDi9w3YEz8yLzIod3FuWCpcpU8aQ9eGGCQfWJXDMHwBjXv_CF-sWJ-R1WfzfcOo9j5XbHHH9sEYv80LFEBTfEtkklLlaymlmB2kBCvSWW3LQU1ZlUy0D-13mfvkWPdoc"
                    },
                    "error": null,
                    "duration": 102.1
                }
            }
        }
    },
    ...
}
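
To show how the fields in the result above fit together, here is a small sketch that pulls the useful parts out of a response already parsed into a Python dict. It assumes the layout shown in the example (a top-level `context` key containing `asp`); the helper name `captcha_summary` is hypothetical and not part of any SDK.

```python
from typing import Any, Dict, Optional


def captcha_summary(api_result: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Summarize the ASP challenge of a parsed API response.

    Returns None when the response carries no ASP context.
    """
    asp = api_result.get("context", {}).get("asp")
    if not asp:
        return None
    # Each nested level may be missing or null, so fall back to empty dicts.
    challenge = asp.get("challenge") or {}
    result = challenge.get("result") or {}
    solution = result.get("solution") or {}
    return {
        "shield": asp.get("shield_name"),
        "type": challenge.get("type"),
        # Treat the challenge as solved only if both flags agree.
        "solved": bool(asp.get("success")) and challenge.get("state") == "solved",
        "token": solution.get("token"),
        "duration": result.get("duration"),
    }
```

With the example above, the summary would report the `captcha` shield, the `recaptcha` type, the solution token, and the `duration` value of 102.1.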

        

Anti Bot solution

Scrapfly detects and resolves challenges from well-known anti-scraping solutions on the market. Scrapfly also supports custom solutions on popular websites.

Keep in mind that anti-bot solutions evolve and we may need to adapt our bypass techniques. This is why you should handle ASP errors correctly when relying on ASP.
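
One possible way to handle ASP errors is to retry a failed scrape with a growing delay. The sketch below assumes your client code raises an exception when the API reports an ASP failure; `AspError` is a hypothetical class defined here for illustration, and whether a given error is worth retrying depends on the actual error code returned.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class AspError(Exception):
    """Hypothetical exception your client raises on an ASP failure."""


def scrape_with_retry(scrape: Callable[[], T], attempts: int = 3,
                      base_delay: float = 1.0) -> T:
    """Call scrape() and retry on AspError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return scrape()
        except AspError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the caller
            # Wait base_delay, then 2x, 4x, ... before the next try.
            time.sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Passing the scrape as a callable keeps the retry logic independent of whichever HTTP client or SDK you use.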

ASP in Distributed application

In real-world applications, you might use ASP in a distributed setup (workers, multithreading, multiprocessing). ASP relies on sessions and works the same way sessions do in a distributed application, based on correlation_id. See the documentation on how distributed sessions work.
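
As a rough sketch of the idea, the workers of one job can all be handed the same `correlation_id`. Only the `correlation_id` and `asp` names come from this page; the `url` parameter name, the helper functions, and the exact way the API uses the identifier are assumptions, so refer to the distributed session documentation for the actual behavior.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def make_params(target_url: str, correlation_id: str) -> Dict[str, str]:
    """Build the request parameters one worker would send (hypothetical helper)."""
    return {
        "url": target_url,  # assumed parameter name
        "asp": "true",
        "correlation_id": correlation_id,
    }


def plan_job(urls: List[str], workers: int = 4) -> List[Dict[str, str]]:
    """Prepare one request per URL, all sharing a single correlation_id."""
    correlation_id = uuid.uuid4().hex  # one identifier for the whole job
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # A real worker would send the parameters to the API at this point;
        # here each worker only builds them, so the sketch runs offline.
        return list(pool.map(lambda u: make_params(u, correlation_id), urls))
```

Generating the identifier once, outside the workers, is what keeps every thread or process of the job on the same value.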

ASP is not working!

Unfortunately, we cannot develop ASP for every site: some rely on well-known systems, others have custom ones. If ASP does not detect the protection on a website and you get blocked, or if you are unable to bypass the system, you can contact us via our service page. You can also contact us via the chat on bottom right corner.

Pricing

The following rules apply when ASP is activated:

  • If ASP does not detect any protection: no extra Scrape API calls are counted
  • If ASP needs to solve a challenge (Captcha / JavaScript) and generates a session for future use: 30 Scrape API calls are counted
  • If an ASP session already exists and is still valid: no extra Scrape API calls are counted
  • 404 error codes are not considered failed scrapes: extra Scrape API calls can be counted if the master shield is invoked
  • The ASP master shield cannot be reused: for each call using this shield, 30 Scrape API calls are counted
  • If ASP needs to switch to the residential network: network cost + ASP cost are counted

This way, if you are not sure whether a website is protected, you can enable ASP: if nothing is blocking, no extra calls are counted.

The API response contains the header X-Scrapfly-Api-Cost, which indicates the billed amount.
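
The rules above can be turned into a small estimator, shown here as a sketch. It covers only the ASP part of the cost (the residential network cost is left out), the function names are hypothetical, and reading the header value as an integer is an assumption about its format.

```python
from typing import Mapping, Optional

ASP_CHALLENGE_COST = 30  # extra Scrape API calls, per the pricing rules above


def extra_asp_calls(protection_detected: bool, session_valid: bool = False,
                    master_shield: bool = False) -> int:
    """Estimate the extra Scrape API calls counted for one ASP request."""
    if master_shield:
        # The master shield cannot be reused, so every call is charged.
        return ASP_CHALLENGE_COST
    if not protection_detected or session_valid:
        # Nothing to solve, or an existing session is reused.
        return 0
    return ASP_CHALLENGE_COST


def billed_cost(headers: Mapping[str, str]) -> Optional[int]:
    """Read X-Scrapfly-Api-Cost from response headers, ignoring case."""
    for name, value in headers.items():
        if name.lower() == "x-scrapfly-api-cost":
            return int(value)  # assumes the header carries an integer
    return None
```

Comparing the estimate with the header after each call is a simple way to spot requests that cost more than expected.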

Integration