Anti Scraping Protection (ASP)

All features described below require ASP to be enabled, either by passing asp=true to the API or by ticking ASP in the API Player.
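
As a minimal sketch of the API route, the snippet below builds a request URL with asp=true. Only the asp parameter comes from this page; the endpoint URL and the key and url parameter names are assumptions that you should check against the API reference.

```python
from urllib.parse import urlencode

# Assumed endpoint, not stated on this page; verify it in the API reference.
SCRAPE_ENDPOINT = "https://api.scrapfly.io/scrape"


def build_scrape_url(api_key: str, target_url: str, asp: bool = True) -> str:
    """Return a Scrape API URL with ASP switched on or off."""
    params = {
        "key": api_key,      # assumed parameter name for the API key
        "url": target_url,   # assumed parameter name for the page to scrape
        "asp": "true" if asp else "false",
    }
    # urlencode percent-encodes the target URL so it survives as one value.
    return f"{SCRAPE_ENDPOINT}?{urlencode(params)}"


print(build_scrape_url("YOUR_API_KEY", "https://www.google.com/recaptcha/api2/demo"))
```

The resulting URL can be fetched with any HTTP client.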

Introduction

ASP is a technology we developed to bypass anti-scraping protection.

When ASP is triggered, it takes control of resolving the challenge: it decides whether to enable or disable JS rendering, lets the session solve the captcha, and so forth.

Once the protection's challenge is resolved, ASP is triggered each time you revisit the site and injects the correct mechanisms to avoid a new challenge. The first request, which solves the challenge, can therefore take much longer than usual (from 30 to 120 seconds, depending on the challenge type). Once the first scrape is done, subsequent scrapes are as fast as usual.

You have nothing to do: our services fully manage ASP and start it automatically when a captcha or anti-bot solution is detected on the website.

We won't play cat and mouse games with anti-bot solutions, so we do not explicitly enumerate the services we can handle. We pass many solutions, from simple captchas to the most advanced anti-bot solutions on the market. We also develop specific solutions for some popular websites. If you want to know more, you can ask us via the chat at the bottom-left of the screen.

Each time ASP detects and resolves a challenge (captcha or anti-bot solution), a session is created, even if you have not enabled sessions. This ensures all cookies are applied correctly without you having to manage them. It is invisible from your point of view.

All ASP-related errors the API can return are listed in the Errors section.

Captcha

Scrapfly ASP resolves many captcha systems automatically.

The following captcha systems are currently supported:

  • Google reCAPTCHA
  • hCaptcha
  • GeeTest

More captcha types will be supported in future versions.

Example of the result

If ASP is enabled and a captcha is detected, the captcha's solution is included in the response as follows:

Captcha example result (JSON):
{
    ...
    "context": {
        "asp": {
            "identity": "baea954b95731c68ae6e45bd1e252eb4560cdc45",
            "session_identity": "3375a5d49895472125d73bd5c89032afd0a24909",
            "shield_name": "captcha",
            "success": true,
            "error": null,
            "challenge": {
                "done": true,
                "success": true,
                "state": "solved",
                "context": {
                    "site_key": "6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-",
                    "token": "03AGdBq25At22qXblbpHjyOPzhoaRbyAMbpfDk17DJcCegDDf4it8zP8X2_6AsHDebS3yAAXN9AtwmDfikBDbPZlFdHA1d1O08X6sLp3yN7a6-nnjQ1XxHerksQb-xJ41p8dfTnO1CE8xr6GsHL9Y0uTUmv_9xFcgnpi1zlkRYCYQlUDg7JJAcSJxFHCPnm0J_aKk3LAQOyP8Lgw_zeYRrY6bzjaYGh9_5Yi8F73Z7-qyQWijIWExJansQsArdHCR8e5HaMVe3Lfe07evFXRDw6_7NaIMtj8hyctRMD-GKvCZlwCC6vs-rJtQtuRnxJKahZmhHkXZvpvqo8tzcKIOSCDLrHUwdm8j5m11G0UbEIARqyeZukX19B9l8MnAhP05Qu3STrgD3R8Mkqn0RCLdPdzpKGTqyYm5GLuzuV9LLSsWVuS_aYcssGKSUbrZHOikfP_dC4erg6N6FsJ1Jt7d4UbiGIXgbAKzRQsFDvmYaEgwl1lVd3WhiMIEh6NCUabS3qnDOdPBy8ewaceb3Opw9brEngV0fan3A3Bn3K3vbperKk2wxmWqKfbk90ua-ErIt2ygopD8f5z8mdib6aJvp6SDStJrcmu2AnhdA3eL0NubUi4nsTGpjlmXotm1MXVHMzWKNHoh5W0XLrSc3nYIKbmmNS4XrHT_wan2Kudz9icCew2v7EZHTmjFlJmVg_RdeXDWkiUEf5KuQtrsrpv195OyyUv_ucmi36Bg04dF45e5-cgp8Svu-sU5q9LHxkcu4wzRM7bqOXqmPLaDx2feERrjPwx6zyYi-O0xAP5xzCwul-VMCR3es-pWfr6ovrd0YiXpZ2L-9KpXLlJD0Hq3y9kezRLSq_xhoRsU8IvptN5jI612G8LMuoorkZLnKsiZNEzmuUoNXPbvhaSlloDRAHPchuBnfzSUtEjTO3WBZ9Qto5xvkR-kYQS_V6KKeYdPUNV6Dvemg-XZX81M2gzt8O4pt2MlG2DyKzv_DEZlf65bzT9G2_sBGwLtUvIGC7cDWolBiQJxoEfOJXI68oXpLyZLx-HZX5BGSbvc29ShuccRLrzAx55N4x6-vjDSOybkz4ZsJURHPqdge6jDGSV-TIlDVcCnxDLdFBs3F52vnWl9WdpHhVGAwJQh590LmBJj4C2kny-1XRFOKkLflNLVZLUmiBH0rqXQUnT1ybNxBDHWXj2wlDp1UD4HVHDbnpXIvAG9JquXqhtelQmAysORyItGrJGy5HYOY_VM6ALWjn0behuy6yRD-KzWEVV_WE_mP3vJs_kaNyW9EswUKY0hirdbB2aZ8sQMGy2bSFekb5aKpcqOuvYWN6v-7BKN_6St_MsO0A-CMZW4-hpvK-AVvaydF3ljKLqMsd_hAyL2yhytpsRgVJVx6HfgZPvwkUQwz2FljeUhWCYLxzIJ9_Jvd2MEOWnj2neg69HG-gDkCJyKztOo16mo5Tew",
                    "url": "https://www.google.com/recaptcha/api2/demo",
                    "user_agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36"
                },
                "type": "recaptcha",
                "result": {
                    "solution": {
                        "token": "03AGdBq241V6gvHLko7LO8UdhqTPMADuN0-QqrD8fycF4F6h8ioJ9PMF4D2LTcNSeyukdLJ-qrBIrotjnwl3h9dDpp0GyzXpom_b2VQQYrBs-sjv9uUttqQ_wIG0P0yOygrmOc7iZRhnfdF4Nr1Rd_UXXuXFBBHzLxlcid4PsZu5rWF9R1LDsD1Fhyhy6F6gQKk49gpDa-5DOtZ_yNsJaLTjvR8sNPwCJCf_71OHKe6iir3QHAb256_EOE5QjWK36gpzjpMeLNNQz6eZ6sZ4y2wkyff3dGQZ7MuWP65OIoLQJmsdGSKNerGXNZe8z7YqWi_CT4h2x6nOpPbylCH11lyKjniCo6PSR3Ytwhqc2lLtQ5MdQcbdRMXk0SejTvihIloQr96YP7QVWyIRXvierh8Faxctw7j_OW9AblrBY3KxKsvpeK_n4zfIA0zp0HbDwYoKByUxjIt-qfNT0xmMXC4i4NWdziXm6fm36tuBgi-N35CrwwhbmKgXpB5mT4XyEtjG2wMkSS16Wg4nmCUFPo_F6obB3DfRoIjB7jQ-yswj-McwDIYuHpZgIBU1hYaIWyUTAjBn8PN9b8ZIx27ZVEI0L2e_EGvH174PNjAE5Lyn4OI6qaJLZJyXW0nyLdOxklkSkDgjJcT9stuRqGKlmhNohnTwEYoZ-gk3ijklyG1Jf6QK4CjyiXAirxLOtooDIQ0VpBHbMhbasNsej637UYXYYL9mpYeSNho9GNGPuanhqAfk3wAud5pmArfc0t8_qxkMcSyLc5ZICtEZWtTJGnEoKsniALvkNvIl9N-K2UOZg6JU7sPFyaypCxRLO2ybWPNoibXxJTuYdPH0KU8eGbBdyQUUk105XK2dfVBg4KEfvYJTm6NID-c-flixXGL697kyZOV_9Rn3RbGcGO1_AyhlOsmYJOg5r5FIvxn1hofmYJG99-rUYzFvj3p-3h09fNyEnTI2PGjwpXH9FezvZMCXMbvnEZ45zbxvJ7X8cMudooYADG7vViDQAF72WkGEIWR8lFdmAfgYzmRgzXwoBsRVeeiiVYczd9ImLTh4a7BCWg0TeLfO7ptCyXRokaSA8PhS1WhH_OR-ofgHng3zK3sh5MMvWIznm466mcYuyQxqbfzQHYxeyWlhjw3w7jFK_cpWxnZmXIon5waCCQwJWXDDFcqZDNspY5aJI1iANA7bgI2T1WUNfu2MaoICVgPO1Krn8j7cANz-f4na3S0pEYN02wo70wdY9Yof7P4fEz04OdvJsSq2ZtCs9LSCcm9gRNq8mVIIkdijHQyuL4AUAa4gq_ig_9tHu2BJ0J4EmAcJUy1jTWnwrZycSqDi9w3YEz8yLzIod3FuWCpcpU8aQ9eGGCQfWJXDMHwBjXv_CF-sWJ-R1WfzfcOo9j5XbHHH9sEYv80LFEBTfEtkklLlaymlmB2kBCvSWW3LQU1ZlUy0D-13mfvkWPdoc"
                    },
                    "error": null,
                    "duration": 102.1
                }
            }
        }
    },
    ...
}
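
The solved token can be read from a response shaped like the example above. The sketch below is illustrative: the helper name captcha_token is hypothetical, the token in the sample is a shortened placeholder, and the code assumes the response has already been decoded from JSON into a dict.

```python
from typing import Any, Mapping, Optional


def captcha_token(api_response: Mapping[str, Any]) -> Optional[str]:
    """Return the solved captcha token, or None if no captcha was solved."""
    asp = (api_response.get("context") or {}).get("asp") or {}
    challenge = asp.get("challenge") or {}
    # Only trust the token when the captcha shield reports a solved challenge.
    if asp.get("shield_name") != "captcha" or challenge.get("state") != "solved":
        return None
    solution = (challenge.get("result") or {}).get("solution") or {}
    return solution.get("token")


# Trimmed copy of the example result; the token is a shortened placeholder.
sample = {
    "context": {
        "asp": {
            "shield_name": "captcha",
            "success": True,
            "error": None,
            "challenge": {
                "done": True,
                "success": True,
                "state": "solved",
                "type": "recaptcha",
                "result": {
                    "solution": {"token": "03AGdBq2-placeholder"},
                    "error": None,
                    "duration": 102.1,
                },
            },
        }
    }
}

print(captcha_token(sample))
```

Returning None instead of raising keeps the helper usable on responses where ASP was never triggered.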

        

Anti Bot solution

Scrapfly detects and resolves challenges from well-known anti-scraping solutions on the market. Scrapfly also supports custom solutions on popular websites.

Keep in mind that anti-bot solutions evolve, and we may need to adapt our bypass techniques; this is why you should handle ASP errors correctly when relying on ASP.
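
One possible way to handle ASP errors is to inspect the success and error fields of the asp context (as they appear in the example result) and retry with a backoff. This is only a sketch: the names AspFailure, asp_error and scrape_with_retry are hypothetical, the scrape argument stands for any function of yours that performs the API call and returns the decoded JSON, and treating a missing asp context as "ASP was not triggered" is an assumption.

```python
import time
from typing import Any, Callable, Mapping, Optional


class AspFailure(Exception):
    """Raised when ASP still reports a failure after every attempt."""


def asp_error(api_response: Mapping[str, Any]) -> Optional[str]:
    """Return a description of the ASP failure, or None if ASP did not fail."""
    asp = (api_response.get("context") or {}).get("asp")
    if not asp:
        return None  # assumption: no "asp" context means ASP was not triggered
    if asp.get("success") and asp.get("error") is None:
        return None
    return str(asp.get("error") or "ASP reported success=false")


def scrape_with_retry(
    scrape: Callable[[], Mapping[str, Any]],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Mapping[str, Any]:
    """Call scrape() until ASP reports no failure, backing off between tries."""
    error = None
    for attempt in range(attempts):
        response = scrape()
        error = asp_error(response)
        if error is None:
            return response
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)  # exponential backoff
    raise AspFailure(f"ASP failed after {attempts} attempts: {error}")
```

Passing sleep as a parameter keeps the retry loop easy to test without real delays.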

ASP in Distributed application

In a real-world application, you might use ASP across distributed components (workers, multi-threading, multiprocessing). ASP relies on sessions and works the same way as a distributed session, based on correlation_id. See the documentation on how distributed sessions work.
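
As a rough illustration, the sketch below has several threads prepare request parameters that all carry the same correlation_id. It assumes correlation_id is passed as a request parameter next to asp; the url parameter name and the helpers make_params and plan_requests are hypothetical, and no network call is made.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Optional


def make_params(target_url: str, correlation_id: str) -> Dict[str, str]:
    """Build the parameters one worker would send for one page."""
    return {
        "url": target_url,                 # assumed parameter name
        "asp": "true",
        "correlation_id": correlation_id,  # assumed to be a request parameter
    }


def plan_requests(
    urls: List[str],
    correlation_id: Optional[str] = None,
    workers: int = 4,
) -> List[Dict[str, str]]:
    """Prepare parameters for every URL, sharing one correlation_id."""
    shared_id = correlation_id or uuid.uuid4().hex
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps the input order, so results line up with urls.
        return list(pool.map(lambda u: make_params(u, shared_id), urls))


jobs = plan_requests(
    ["https://example.com/page/1", "https://example.com/page/2"],
    correlation_id="job-42",
)
print(jobs)
```

In a real deployment the shared identifier would typically travel with the job message so that every worker process uses the same value.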

ASP is not working!

Unfortunately, we cannot develop ASP for every site; some rely on well-known systems, others on custom ones. If ASP does not detect the protection on a website, you get blocked, and you are unable to bypass the system yourself, you can contact us via our service page. You can also contact us via the chat in the bottom right corner.

Pricing

The following rules apply when ASP is activated:

  • If ASP does not detect protection: no extra Scrape API calls are counted
  • If ASP needs to solve a challenge (captcha/JavaScript) and generate a session for future usage: 30 Scrape API calls are counted
  • If an ASP session already exists and is still valid: no extra Scrape API calls are counted
  • A 404 error code is not considered a failed scrape: extra Scrape API calls can be counted if a master shield is invoked
  • An ASP master shield can't be reused. For each call using this shield, 30 Scrape API calls are counted
  • If ASP requires switching to the residential network, then network cost + ASP cost will be counted

Therefore, if you are not sure whether a website is protected, you can enable ASP: if nothing is blocking, no extra calls are counted.
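
The rules above can be sketched as a small estimator. This is one reading of the rules, not the actual billing logic: the function name asp_extra_calls is hypothetical, and the residential network cost is not given on this page, so it is left as a caller-supplied value.

```python
ASP_CHALLENGE_COST = 30  # Scrape API calls counted when a challenge is solved


def asp_extra_calls(
    protection_detected: bool,
    session_valid: bool = False,
    master_shield: bool = False,
    residential_cost: int = 0,
) -> int:
    """Estimate the extra Scrape API calls counted for one ASP-enabled scrape."""
    if not protection_detected:
        return 0  # nothing was blocking: no extra calls
    if master_shield:
        asp_cost = ASP_CHALLENGE_COST  # a master shield is never reused
    elif session_valid:
        asp_cost = 0  # an existing, valid ASP session is reused
    else:
        asp_cost = ASP_CHALLENGE_COST  # challenge solved, session generated
    # residential_cost is supplied by the caller; its value is not in this page.
    return asp_cost + residential_cost


print(asp_extra_calls(protection_detected=False))
print(asp_extra_calls(protection_detected=True))
print(asp_extra_calls(protection_detected=True, session_valid=True))
```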

The API response contains the X-Scrapfly-Api-Cost header, which indicates the billed amount.
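
A minimal sketch of reading that header from a response's headers mapping is shown below. The helper name billed_cost is hypothetical, and the code assumes the header value is an integer number of Scrape API calls.

```python
from typing import Mapping, Optional

COST_HEADER = "x-scrapfly-api-cost"


def billed_cost(headers: Mapping[str, str]) -> Optional[int]:
    """Return the billed amount from the response headers, if present."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive, so compare in lowercase.
        if name.lower() == COST_HEADER:
            try:
                return int(value)  # assumption: the value is an integer
            except ValueError:
                return None
    return None


print(billed_cost({"Content-Type": "application/json", "X-Scrapfly-Api-Cost": "31"}))
```

Logging this value per request makes it easy to spot scrapes where ASP had to solve a new challenge.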

Integration