Session

overview page of web interface
Session tab of log inspection

Introduction

Sessions let you keep navigation consistent across scrapes. A session tracks visits and the referer, and persists cookies. If JavaScript rendering is activated, it also persists and restores window.localStorage and window.sessionStorage. By default, a session is stuck to the same proxy.

Sessions are useful for keeping navigation behavior consistent and staying under the bot detection radar. Anti Scraping Protection automatically activates and configures a session for you in order to persist cookies after a challenge.

When inspecting a log from the web interface, you can see all the details of a session, such as navigation history, stored cookies, referer and all metadata (creation date, last used date, expiration date).

Sharing Policy

Session sharing follows several rules in order to stay consistent and avoid misconceptions.

Sessions are not shared across projects and environments; they are isolated from each other.

Sessions are shared whether or not you render JavaScript, so you can mix both kinds of navigation and get better performance by skipping JavaScript rendering when it is not necessary.

You must avoid collisions in session names; otherwise you may unintentionally reuse a previous session.
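One way to avoid accidental name collisions is to derive each session name from a job-specific prefix plus a random suffix. The sketch below is only an illustration: make_session_name is a hypothetical helper, and the "<prefix>-<random hex>" naming scheme is an assumption, not a Scrapfly requirement.

```python
import uuid


def make_session_name(prefix: str) -> str:
    """Return a session name that is very unlikely to collide with an earlier one.

    Hypothetical helper: the "<prefix>-<random hex>" scheme is an assumption,
    not something the API mandates.
    """
    return f"{prefix}-{uuid.uuid4().hex[:12]}"


first = make_session_name("amazon-fr")
second = make_session_name("amazon-fr")
```

To deliberately continue an earlier navigation, store the generated name and pass it again instead of generating a new one.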

Eviction Policy

Sessions automatically expire after 7 days. Each time a session is reused, its expiration date is reset to 7 days from that use.
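The sliding expiration can be checked with a little date arithmetic. This is a minimal sketch of the rule as stated above, not the service's implementation; next_expiration and is_expired are hypothetical names. The sample timestamp is the last_used_at value from the session example later on this page, and the computed result matches its expire_at field.

```python
from datetime import datetime, timedelta, timezone

SESSION_TTL = timedelta(days=7)  # lifetime stated in the eviction policy


def next_expiration(last_used_at: datetime) -> datetime:
    """Expiration date of a session that was last (re)used at `last_used_at`."""
    return last_used_at + SESSION_TTL


def is_expired(last_used_at: datetime, now: datetime) -> bool:
    """True once `now` has reached the expiration date (boundary handling is an assumption)."""
    return now >= next_expiration(last_used_at)


last_used = datetime(2020, 9, 15, 12, 44, 50, tzinfo=timezone.utc)
expire_at = next_expiration(last_used)  # 2020-09-22 12:44:50 UTC
```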

Distributed session

Schema of distributed session explained

In real-world applications you will want to parallelize scrape calls, which is not compatible with session behavior. If your workers, processes or threads are isolated from each other (the behavior of one does not interfere with the others), distributed sessions will help you.

To use a distributed session, you simply specify a correlation_id set to the unique identity of the current worker. It could be a hostname for workers on separate machines, a pid for multiprocessing, or a thread id for a threaded application.

Sessions are now isolated per worker, each in its own world: other workers that use the same session do not affect yours, and each keeps a coherent navigation flow and referer.
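The following sketch shows how a worker might build its scrape URL with a per-worker correlation_id. It assumes correlation_id is passed as a query parameter next to session; worker_identity and build_scrape_url are hypothetical helpers, and combining hostname, pid and thread id into one string is just one possible choice of identity.

```python
import os
import socket
import threading
from urllib.parse import urlencode

API_ENDPOINT = "https://api.scrapfly.io/scrape"


def worker_identity() -> str:
    """Identify the current worker by hostname, process id and thread id."""
    return f"{socket.gethostname()}-{os.getpid()}-{threading.get_ident()}"


def build_scrape_url(api_key: str, url: str, session: str, correlation_id: str) -> str:
    """Build the scrape URL; passing correlation_id as a query parameter is an assumption."""
    params = {
        "key": api_key,
        "url": url,
        "session": session,
        "correlation_id": correlation_id,
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


scrape_url = build_scrape_url("__API_KEY__", "https://amazon.fr", "test", worker_identity())
```

Every worker calls build_scrape_url with the same session name but its own identity, so each one keeps a separate navigation state.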

By default, when sessions are used, a proxy is attached to each session. When a distributed session is created, a soft anti-affinity strategy is applied to proxy attachment: the proxy allocated to your session is not used by another session in your distributed pool. However, if no proxy fulfills the requirements to be attached, a proxy already used by another session may be attached. That is why the anti-affinity is "soft".
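To make the idea of soft anti-affinity concrete, here is a toy allocator. It is purely illustrative and is not how Scrapfly allocates proxies: allocate_proxy, the proxy names and the least-used fallback are all assumptions chosen for the sketch.

```python
from typing import Dict, List


def allocate_proxy(proxies: List[str], assigned: Dict[str, str], session_id: str) -> str:
    """Pick a proxy for `session_id`, preferring one no other session uses.

    `assigned` maps session id -> proxy for the sessions of one distributed pool.
    """
    if not proxies:
        raise ValueError("no proxy available")
    in_use = set(assigned.values())
    free = [p for p in proxies if p not in in_use]
    if free:
        choice = free[0]  # anti-affinity satisfied: nobody else uses this proxy
    else:
        # "Soft" fallback (assumed policy): every proxy is taken, share the least-used one.
        usage = {p: 0 for p in proxies}
        for p in assigned.values():
            if p in usage:
                usage[p] += 1
        choice = min(proxies, key=lambda p: usage[p])
    assigned[session_id] = choice
    return choice


pool = ["proxy-a", "proxy-b"]
assigned: Dict[str, str] = {}
picks = [allocate_proxy(pool, assigned, f"worker-{i}") for i in range(3)]
# picks == ["proxy-a", "proxy-b", "proxy-a"]: the third worker has to share.
```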

Scrapfly prevents concurrent access to a session; if it happens, the API returns the error ERR::SESSION::CONCURRENT_ACCESS.
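Because concurrent access to the same session is rejected, threads that share one session inside a single process may want to serialize their calls. The sketch below uses one lock per (session, correlation_id) pair. All names are hypothetical, fake_scrape stands in for a real HTTP call, and the lock only protects threads of one process; separate processes or machines should use distinct correlation_id values instead.

```python
import threading
import time
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

_registry_guard = threading.Lock()
_session_locks = defaultdict(threading.Lock)


def session_lock(session: str, correlation_id: str = "default") -> threading.Lock:
    """Return the lock guarding one (session, correlation_id) pair."""
    with _registry_guard:
        return _session_locks[(session, correlation_id)]


def scrape_serialized(session: str, do_scrape, correlation_id: str = "default"):
    """Run `do_scrape` while holding the session's lock, one caller at a time."""
    with session_lock(session, correlation_id):
        return do_scrape()


# Demo: count how many fake scrapes overlap on the same session.
state = {"active": 0, "max_active": 0}
state_guard = threading.Lock()


def fake_scrape() -> str:
    with state_guard:
        state["active"] += 1
        state["max_active"] = max(state["max_active"], state["active"])
    time.sleep(0.01)  # stand-in for the network round trip
    with state_guard:
        state["active"] -= 1
    return "ok"


with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(lambda _: scrape_serialized("test", fake_scrape), range(8)))
```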

Example

You can use session_sticky_proxy=false to renew the IP. Use it with caution; most of the time you should not use it. Since most hashed cookies and bot detection mechanisms are based on IP or location, it might not give good results.
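As a small illustration, the helper below builds the scrape URL with and without the sticky proxy. build_session_url is a hypothetical helper; the parameter names key, url, session and session_sticky_proxy are the ones used on this page.

```python
from urllib.parse import urlencode


def build_session_url(api_key: str, url: str, session: str, sticky_proxy: bool = True) -> str:
    """Build a scrape URL for a session; add session_sticky_proxy=false only on request."""
    params = {"key": api_key, "url": url, "session": session}
    if not sticky_proxy:
        params["session_sticky_proxy"] = "false"
    return "https://api.scrapfly.io/scrape?" + urlencode(params)


default_url = build_session_url("__API_KEY__", "https://amazon.fr", "test")
rotating_url = build_session_url("__API_KEY__", "https://amazon.fr", "test", sticky_proxy=False)
```

Leaving the flag out keeps the default behavior, where the session stays on the same proxy.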
Interactive Example: API Session Example
            curl -X GET "https://api.scrapfly.io/scrape?key=__API_KEY__&url=https%3A%2F%2Famazon.fr&session=test"
        
Session Example (JSON)
            ...
"context": {
    ...
    "session": {
        "name": "test",
        "state": "FREE",
        "lease": null,
        "correlation_id": "default",
        "identity": "929f7fb918367df788717d825a3d75391e148c76",
        "created_at": "2020-09-15 12:44:15 UTC",
        "cookie_jar": [
            {
                "name": "session-id",
                "value": "260-1721863-1512555",
                "expires": "2021-09-15 12:44:49 UTC",
                "path": "/",
                "comment": null,
                "domain": ".amazon.fr",
                "max_age": null,
                "secure": false,
                "http_only": false,
                "version": null,
                "size": 29
            },
            {
                "name": "i18n-prefs",
                "value": "EUR",
                "expires": "2021-09-15 12:44:49 UTC",
                "path": "/",
                "comment": null,
                "domain": ".amazon.fr",
                "max_age": null,
                "secure": false,
                "http_only": false,
                "version": null,
                "size": 13
            },
            {
                "name": "csm-hit",
                "value": "tb:s-QZYPV6S7SE33B4BZ9C4Q|1600173886804&t:1600173887212&adb:adblk_no",
                "expires": "2021-08-31 12:44:47 UTC",
                "path": "/",
                "comment": null,
                "domain": "www.amazon.fr",
                "max_age": null,
                "secure": false,
                "http_only": false,
                "version": null,
                "size": 75
            },
            {
                "name": "ubid-acbfr",
                "value": "257-3489518-3158810",
                "expires": "2021-09-15 12:44:49 UTC",
                "path": "/",
                "comment": null,
                "domain": ".amazon.fr",
                "max_age": null,
                "secure": false,
                "http_only": false,
                "version": null,
                "size": 29
            },
            {
                "name": "session-id-time",
                "value": "2082787201l",
                "expires": "2021-09-15 12:44:49 UTC",
                "path": "/",
                "comment": null,
                "domain": ".amazon.fr",
                "max_age": null,
                "secure": false,
                "http_only": false,
                "version": null,
                "size": 26
            },
            {
                "name": "session-token",
                "value": "\"Eb83GFBgcWB9a5j+7gEBHw8+vA9lxXwdxRWR6DYBWuX4pmLsPfwacoVoW2ChuDf6NiMHiPqZdxYfz9xb9VtLnTxYj7jZczNrNWbGi5PforjBW0TuWJ+iUsBl8k/C7NmcmPribBoO2CeP4AcYn5BTVPfbdfHHFfMKaSFLTT7egDSyITA2EyCWMn/rTvl/DgT0GRK+AI5DWRgCcZkEwjPGqMt9EFsE8lrOvOO9cs9gCLgqc5NnCGnTH4+WuOb4tERStnOSPN/Bdn0=\"",
                "expires": "2021-09-15 12:44:49 UTC",
                "path": "/",
                "comment": null,
                "domain": ".amazon.fr",
                "max_age": null,
                "secure": false,
                "http_only": false,
                "version": null,
                "size": 283
            },
            {
                "name": "ad-privacy",
                "value": "0",
                "expires": "2025-10-01 12:44:49 UTC",
                "path": "/",
                "comment": null,
                "domain": ".amazon-adsystem.com",
                "max_age": null,
                "secure": true,
                "http_only": true,
                "version": null,
                "size": 11
            }
        ],
        "last_used_at": "2020-09-15 12:44:50 UTC",
        "expire_at": "2020-09-22 12:44:50 UTC",
        "referer": "https://www.amazon.fr/"
    }
    ...
}
...
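The session object above can be consumed like any other JSON. The sketch below uses a shortened copy of it to pick the cookies that apply to a host and to check the session lifetime. cookies_for_domain and parse_timestamp are hypothetical helpers, the timestamp format is inferred from the example, and the domain matching is a simplification rather than full cookie matching rules.

```python
import json
from datetime import datetime, timezone

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S UTC"  # format inferred from the example above


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as "2020-09-15 12:44:50 UTC" into an aware datetime."""
    return datetime.strptime(value, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


def cookies_for_domain(session: dict, host: str) -> dict:
    """Return {name: value} for cookies whose domain matches `host` (simplified matching)."""
    matches = {}
    for cookie in session["cookie_jar"]:
        domain = cookie["domain"].lstrip(".")
        if host == domain or host.endswith("." + domain):
            matches[cookie["name"]] = cookie["value"]
    return matches


# Shortened copy of the "session" object shown above.
sample = json.loads("""
{
  "name": "test",
  "last_used_at": "2020-09-15 12:44:50 UTC",
  "expire_at": "2020-09-22 12:44:50 UTC",
  "cookie_jar": [
    {"name": "i18n-prefs", "value": "EUR", "domain": ".amazon.fr"},
    {"name": "ad-privacy", "value": "0", "domain": ".amazon-adsystem.com"}
  ]
}
""")

amazon_cookies = cookies_for_domain(sample, "www.amazon.fr")
lifetime = parse_timestamp(sample["expire_at"]) - parse_timestamp(sample["last_used_at"])
```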

        

Integration