Python SDK

Have a question?

Dev Support - Ask on Stack Overflow
Make sure to use the following tags: scrapfly, web-scraping, python


The Python SDK gives you a handy abstraction to interact with the Scrapfly API. Many details are automatically handled for you, such as:


  • Automatic base64 encoding of JavaScript snippets
  • Error handling
  • JSON-encoding the body if Content-Type: application/json is set
  • URL-encoding the body and setting Content-Type: application/x-www-form-urlencoded if no content type is specified
  • Converting binary responses into a Python BytesIO object
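As an illustration of the body-encoding rules above, here is a standalone sketch using only the standard library (it models the described behavior; it is not the SDK's actual code path, and the function name `encode_body` is hypothetical):

```python
import base64
import json
from urllib.parse import urlencode

def encode_body(body, content_type=None):
    # Illustrative model of the rules listed above (not SDK code)
    if content_type == 'application/json':
        return json.dumps(body), content_type
    # No content type given: URL-encode and set the form content type
    return urlencode(body), 'application/x-www-form-urlencoded'

# JS snippets are base64-encoded before being sent to the API
js_snippet = "return document.title"
encoded_js = base64.b64encode(js_snippet.encode()).decode()

payload, content_type = encode_body({'q': 'web scraping'})
# payload == 'q=web+scraping', content_type == 'application/x-www-form-urlencoded'
```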


The source code of the Python SDK is available on GitHub, and the scrapfly-sdk package is available through PyPI.

pip install 'scrapfly-sdk'

You can also install the package with the speedups extra to get brotli compression and msgpack serialization benefits.

pip install 'scrapfly-sdk[speedups]'


Step by step guide

Follow the step-by-step guide with practical examples covering Scrapfly's major features.

from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse

scrapfly = ScrapflyClient(key='')

api_response: ScrapeApiResponse = scrapfly.scrape(scrape_config=ScrapeConfig(url=''))

# Automatically retry errors marked "retryable", waiting the recommended delay before each retry
api_response: ScrapeApiResponse = scrapfly.resilient_scrape(scrape_config=ScrapeConfig(url=''))

# Automatically retry errors based on the HTTP status code
api_response: ScrapeApiResponse = scrapfly.resilient_scrape(scrape_config=ScrapeConfig(url=''), retry_on_status_code=[500])

# scrape result: content, iframes, response headers, response cookies states, screenshots, ssl, dns, etc.
api_response.scrape_result

# html content
api_response.scrape_result['content']

# Context of the scrape: session, webhook, asp, cache, debug
api_response.context

# raw api result
api_response.result

# True if the scrape responded with a 2xx http status
api_response.success

# Api status code /!\ Not the status code of the scraped website!
api_response.status_code

# Upstream website status code
api_response.upstream_status_code

# Convert the API scrape result into the well-known requests.Response object
api_response.upstream_result_into_response()

The full Python API specification is available in the Scrapfly documentation.

Using Context

from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse

scrapfly = ScrapflyClient(key='')

with scrapfly as scraper:
    response: ScrapeApiResponse = scraper.scrape(ScrapeConfig(url='', country='fr'))

Download Binary Response

from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse

scrapfly = ScrapflyClient(key='')

api_response: ScrapeApiResponse = scrapfly.scrape(scrape_config=ScrapeConfig(url=''))
scrapfly.sink(api_response)  # you can specify the path and name via named arguments

Error Handling

Error handling is a big part of scraping, so we designed a system that reflects what went wrong so you can handle it properly in your scraper. Here is a simple snippet to handle errors on your own:

from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse, UpstreamHttpClientError, \
    ScrapflyScrapeError, UpstreamHttpServerError

scrapfly = ScrapflyClient(key='')

try:
    api_response: ScrapeApiResponse = scrapfly.scrape(scrape_config=ScrapeConfig(url=''))
except UpstreamHttpClientError as e:  # HTTP 400 - 499
    raise e
except UpstreamHttpServerError as e:  # HTTP >= 500
    raise e
# UpstreamHttpError can be used to catch all errors related to the upstream website
except ScrapflyScrapeError as e:
    raise e

Errors, with their related codes and explanations, are documented and available here if you want to know more.

error.message             # Message
error.code                # Error code
error.retry_delay         # Recommended wait time before retrying, if retryable
error.retry_times         # Recommended number of retries, if retryable
error.resource            # Related resource: Proxy, ASP, Webhook, Spider
error.is_retryable        # True or False
error.documentation_url   # Documentation explaining the error in detail
error.api_response        # Api Response object
error.http_status_code    # Http status code
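These attributes can drive a retry loop. The sketch below uses a hypothetical stub error class (not a real SDK class) just to show the pattern; the SDK's resilient_scrape already does this for you:

```python
import time

class StubScrapeError(Exception):
    """Hypothetical stand-in exposing the attributes listed above."""
    is_retryable = True
    retry_times = 3
    retry_delay = 0.01  # seconds; kept tiny for the example

def scrape_with_retry(scrape_fn):
    # Retry while the error says it is retryable, up to retry_times attempts
    attempt = 0
    while True:
        try:
            return scrape_fn()
        except StubScrapeError as e:
            if not e.is_retryable or attempt >= e.retry_times:
                raise
            attempt += 1
            time.sleep(e.retry_delay)

# A fake scrape that fails twice, then succeeds:
calls = {'n': 0}
def flaky():
    calls['n'] += 1
    if calls['n'] < 3:
        raise StubScrapeError()
    return 'ok'

result = scrape_with_retry(flaky)  # 'ok' after two retries
```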

By default, if the upstream website that you scrape responds with a bad HTTP code, the SDK will raise UpstreamHttpClientError or UpstreamHttpServerError depending on the HTTP status code. You can disable this behavior by setting the raise_on_upstream_error attribute to False: ScrapeConfig(raise_on_upstream_error=False).
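The rule can be modeled as a small decision function (an illustrative sketch of the behavior described above, not the SDK's actual code):

```python
def should_raise(status_code, raise_on_upstream_error=True):
    # Returns the name of the error the SDK would raise, or None
    if not raise_on_upstream_error:
        return None
    if 400 <= status_code < 500:
        return 'UpstreamHttpClientError'
    if status_code >= 500:
        return 'UpstreamHttpServerError'
    return None

should_raise(404)         # 'UpstreamHttpClientError'
should_raise(503)         # 'UpstreamHttpServerError'
should_raise(503, False)  # None: raising disabled
```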

If you want to report errors to your app for monitoring / tracking purposes on your side, check out the reporter feature.


You can retrieve your account information:

from scrapfly import ScrapflyClient

scrapfly = ScrapflyClient(key='')

account_info = scrapfly.account()

Keep Alive HTTP Session

Take advantage of Keep-Alive connections:

from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse

scrapfly = ScrapflyClient(key='')

with scrapfly as client:
    api_response: ScrapeApiResponse = client.scrape(scrape_config=ScrapeConfig(
        url='',
        screenshots={
            'main': 'fullpage'
        }
    ))

    # more scrape calls

Concurrency out of the box

You can run scrapes concurrently out of the box. We use asyncio for that.

In Python, there are many ways to achieve concurrency.

First of all, ensure you have installed the concurrency extra:

pip install 'scrapfly-sdk[concurrency]'

import asyncio

import logging as logger
from sys import stdout

scrapfly_logger = logger.getLogger('scrapfly')

from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse

scrapfly = ScrapflyClient(key='', max_concurrency=2)

async def main():
    results = await scrapfly.concurrent_scrape(scrape_configs=[
        ScrapeConfig(url='', render_js=True),
        ScrapeConfig(url='', render_js=True),
        ScrapeConfig(url='', render_js=True),
        ScrapeConfig(url='', render_js=True),
        ScrapeConfig(url='', render_js=True),
        ScrapeConfig(url='', render_js=True),
        ScrapeConfig(url='', render_js=True),
        ScrapeConfig(url='', render_js=True),
    ])

asyncio.run(main())


Prevent Extra Usage

If you want to prevent extra usage beyond your subscription quota:

from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse

url = ''

scrapfly = ScrapflyClient(key='')

response = scrapfly.scrape(scrape_config=ScrapeConfig(url=url))

response.prevent_extra_usage() # raise scrapfly.errors.ExtraUsageForbidden