Javascript Scenario BETA

Usage

This feature requires Javascript Rendering to be enabled.

Lets your app interact with the website by chaining multiple actions. A scenario is a sequence of one or more actions and has a budget of 25s to execute. If the estimated worst case (every action waiting for its maximum timeout) exceeds 25s, the scenario is rejected.
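The worst-case estimate can be approximated on the client before sending a request. The sketch below adds up the "timeout budget" figures listed per action in the Params Reference. It is an illustration only: the function names are hypothetical, the server-side calculation may differ, and because no default is documented for the fill timeout it is assumed to be 0 unless provided.

```python
# Rough client-side estimate of a scenario's worst-case budget (milliseconds).
# Figures mirror the "timeout budget" notes in the Params Reference;
# the real server-side calculation may differ.
BUDGET_LIMIT_MS = 25_000


def step_budget_ms(step):
    """Return the assumed worst-case cost of a single scenario step."""
    (action, config), = step.items()  # each step has exactly one key
    if action == "click":
        return 2500
    if action == "fill":
        # Assumption: the fill timeout default is undocumented, so use 0.
        return config.get("timeout", 0) + 500
    if action == "wait":
        return config  # the value itself is the pause in ms
    if action == "scroll":
        return 500
    if action == "execute":
        return config.get("timeout", 3000)
    if action == "wait_for_navigation":
        return config.get("timeout", 1000) + 1500
    if action == "wait_for_selector":
        return config.get("timeout", 5000)
    return 0  # e.g. "condition", which lists no budget


def estimate_budget_ms(scenario):
    """Sum the assumed worst-case cost of every step."""
    return sum(step_budget_ms(step) for step in scenario)


login = [
    {"fill": {"selector": "#username", "value": "demo"}},
    {"fill": {"selector": "#password", "value": "demo"}},
    {"click": {"selector": "form input[type='submit']"}},
    {"wait_for_navigation": {"timeout": 5000}},
]

total = estimate_budget_ms(login)
print(f"estimated {total} ms, within budget: {total <= BUDGET_LIMIT_MS}")
```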

The Javascript scenario must be base64 encoded with the URL-safe option.
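As a minimal sketch, the encoding can be done with Python's standard library. The helper name encode_scenario is hypothetical, and whether the API expects compact JSON or accepts padding variations is not stated here, so treat those choices as assumptions.

```python
import base64
import json

scenario = [
    {"fill": {"selector": "#username", "value": "demo"}},
    {"fill": {"selector": "#password", "value": "demo"}},
    {"click": {"selector": "form input[type='submit']"}},
    {"wait_for_navigation": {"timeout": 5000}},
]


def encode_scenario(steps):
    """Serialize the steps to JSON, then base64 encode with the URL-safe alphabet."""
    raw = json.dumps(steps, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


encoded = encode_scenario(scenario)
print(encoded)  # pass this value as the js_scenario parameter
```

The URL-safe alphabet replaces "+" and "/" with "-" and "_", so the value does not clash with characters that have a meaning in a query string.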

For long-running Javascript scenarios requiring more than 25s, check how the timeout works.
TL;DR: with retry=false, the request times out after 90s by default, and you can customize the timeout with retry=false&timeout=120000
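A hedged sketch of how these parameters could be added to a request URL. The helper build_scrape_url is hypothetical; the parameter names (key, url, render_js, js_scenario, retry, timeout) are the ones shown on this page, and "W10=" is simply the encoded form of an empty scenario used as a placeholder.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://api.scrapfly.io/scrape"


def build_scrape_url(api_key, target_url, encoded_scenario, timeout_ms=None):
    """Build the request URL; a custom timeout is paired with retry=false."""
    params = {
        "key": api_key,
        "url": target_url,
        "render_js": "true",
        "js_scenario": encoded_scenario,
    }
    if timeout_ms is not None:
        params["retry"] = "false"
        params["timeout"] = str(timeout_ms)
    return API_ENDPOINT + "?" + urlencode(params)


url = build_scrape_url(
    "__API_KEY__",
    "https://quotes.toscrape.com/login",
    "W10=",  # placeholder: base64 of "[]"
    timeout_ms=120000,
)
print(url)
```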

What a scenario looks like

[
    {"fill": {"selector": "#username", "value":"demo"}},
    {"fill": {"selector": "#password", "value":"demo"}},
    {"click": {"selector": "form input[type='submit']"}},
    {"wait_for_navigation": {"timeout": 5000}}
]

Full example with API Player

import requests

url = "https://api.scrapfly.io/scrape?key=__API_KEY__&url=https%3A%2F%2Fquotes.toscrape.com%2Flogin&tags=player%2Cproject%3Adefault&render_js=true&screenshots[test]=fullpage&js_scenario=W3siZmlsbCI6eyJzZWxlY3RvciI6IiN1c2VybmFtZSIsInZhbHVlIjoiZGVtbyJ9fSx7ImZpbGwiOnsic2VsZWN0b3IiOiIjcGFzc3dvcmQiLCJ2YWx1ZSI6ImRlbW8ifX0seyJjbGljayI6eyJzZWxlY3RvciI6ImZvcm0gaW5wdXRbdHlwZT0nc3VibWl0J10ifX0seyJ3YWl0X2Zvcl9uYXZpZ2F0aW9uIjpbXX1d"

response = requests.request("GET", url)

print(response.text)

# import json
# print(json.loads(response.text)['result']['content'])
# print(json.loads(response.text)['result']['status_code'])

"api.scrapfly.io"
"/scrape"

key                = "" 
url                = "https://quotes.toscrape.com/login" 
tags               = "player,project:default" 
render_js          = "true" 
screenshots[test]  = "fullpage" 
js_scenario        = "W3siZmlsbCI6eyJzZWxlY3RvciI6IiN1c2VybmFtZSIsInZhbHVlIjoiZGVtbyJ9fSx7ImZpbGwiOnsic2VsZWN0b3IiOiIjcGFzc3dvcmQiLCJ2YWx1ZSI6ImRlbW8ifX0seyJjbGljayI6eyJzZWxlY3RvciI6ImZvcm0gaW5wdXRbdHlwZT0nc3VibWl0J10ifX0seyJ3YWl0X2Zvcl9uYXZpZ2F0aW9uIjpbXX1d" 

Example of response with scenario

...
"result": {
    ...,
    "browser_data": {
        "xhr_call": [...],
        "local_storage_data": {
            "csm-hit": "tb:s-5B0K136YR4QK89MQ8RG0|1596420691120&t:1596420692684&adb:adblk_no",
            "csm:adb": "adblk_no",
            "csm-bf": "[\"5B0K136YR4QK89MQ8RG0\"]",
            "a-font-class": "a-ember"
        },
        "session_storage_data": {
            "csm-hit": "tb:s-5B0K136YR4QK89MQ8RG0|1596420691120&t:1596420692684&adb:adblk_no",
            "csm:adb": "adblk_no",
            "csm-bf": "[\"5B0K136YR4QK89MQ8RG0\"]",
            "a-font-class": "a-ember"
        },
        "websockets": [],
        "javascript_evaluation_result": null,
        "js_scenario": {
            "duration": 4.92,
            "executed": 5,
            "steps": [
                {
                    "action": "fill",
                    "config": {
                        "selector": "#username",
                        "value": "demo"
                    },
                    "duration": 1.11,
                    "executed": true,
                    "result": null,
                    "success": true
                },
                {
                    "action": "fill",
                    "config": {
                        "selector": "#password",
                        "value": "demo"
                    },
                    "duration": 0.47,
                    "executed": true,
                    "result": null,
                    "success": true
                },
                {
                    "action": "click",
                    "config": {
                        "ignore_if_not_visible": false,
                        "selector": "form input[type='submit']"
                    },
                    "duration": 0.52,
                    "executed": true,
                    "result": null,
                    "success": true
                },
                {
                    "action": "wait_for_navigation",
                    "config": {
                        "expect_url": null,
                        "timeout": 5000
                    },
                    "duration": 2.81,
                    "executed": true,
                    "result": null,
                    "success": true
                },
                {
                    "action": "execute",
                    "config": "return document.location.toString()",
                    "duration": 0.01,
                    "executed": true,
                    "result": "http://quotes.toscrape.com/",
                    "success": true
                }
            ]
        }
    },
    ...
}
...
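The scenario report can be inspected once the response is parsed as JSON. The sketch below assumes the structure shown above (result.browser_data.js_scenario.steps) and assumes that a failed step is reported with "success": false. The helper name and the trimmed sample are illustrative only.

```python
def summarize_scenario(api_response):
    """Collect failed actions and the results of "execute" steps."""
    report = api_response["result"]["browser_data"]["js_scenario"]
    steps = report["steps"]
    return {
        "duration": report["duration"],
        "executed": report["executed"],
        # Assumption: a failed step carries "success": false.
        "failed_actions": [s["action"] for s in steps if not s["success"]],
        "execute_results": [s["result"] for s in steps if s["action"] == "execute"],
    }


# Trimmed sample shaped like the response above (values are illustrative).
sample = {
    "result": {
        "browser_data": {
            "js_scenario": {
                "duration": 0.53,
                "executed": 2,
                "steps": [
                    {
                        "action": "click",
                        "config": {"selector": "form input[type='submit']"},
                        "duration": 0.52,
                        "executed": True,
                        "result": None,
                        "success": True,
                    },
                    {
                        "action": "execute",
                        "config": "return document.location.toString()",
                        "duration": 0.01,
                        "executed": True,
                        "result": "http://quotes.toscrape.com/",
                        "success": True,
                    },
                ],
            }
        }
    }
}

summary = summarize_scenario(sample)
print(summary)
```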

Params Reference

  • [MANDATORY] param_name:type
  • [OPTIONAL] param_name:type

Click

selector:string ignore_if_not_visible:bool=false timeout budget (ms): +2500

Click on a visible element. This is a native click that emits a trusted event; it is not simulated with JavaScript.

Internal Workflow

  • Wait for the element to be visible
  • Move to the element (mouse and scroll) like a human
  • Trigger focus on the element
  • Left click

Parameters

  • selector:string Accepts a CSS selector or an XPath selector
  • ignore_if_not_visible:bool Wait for the element to become visible; if it does not, skip the click instead of failing

Usage

{"click": {"selector": ".cookie-gdpr-consent", "ignore_if_not_visible": true}}
{"click": {"selector": "submit.btn"}}

Fill

selector:string value:string timeout budget (ms): +${timeout} +500

Type the provided value into the targeted element. The typing is not simulated with JavaScript; it comes from real keyboard input.

Internal Workflow

  • Wait for the element to be visible
  • Move to the element (mouse and scroll) like a human
  • Trigger focus on the element
  • Type the value into the input like a human

Parameters

  • selector:string Accepts a CSS selector or an XPath selector
  • value:string Value to type into the element
  • clear:boolean Clear the input field before typing

Usage

{"fill": {"selector": "#name", "value": "John Do"}}

Condition

status_code:int

Play the scenario only if the condition is met

Internal Workflow

  • Check that the given status code equals the response status code

Parameters

  • status_code:int Any integer

Usage

{"condition": {"status_code": 200}}

Wait

timeout budget (ms): +${wait}

Pause the scenario. The whole pause time is added to the scenario budget.

Parameters

There are no named parameters; pass the pause duration directly, expressed in milliseconds.

Usage

{"wait": 2000}

Scroll

element:string=body selector:string=bottom timeout budget (ms): +500

Scroll to the selector (if no selector is given, scroll to the bottom). If the element parameter is a valid selector, the scroll happens within that element. The scroll is not simulated with JavaScript; it is performed with real mouse input.

Internal Workflow

  • Wait for the element to be visible
  • Wait for the selector to be visible
  • Scroll like a human

Parameters

  • element:string=body a valid CSS selector, XPath, or "body"
  • selector:string a valid CSS selector, XPath, or "bottom"
  • infinite:int=0 infinite scroll: number of scroll iterations

Usage

{"scroll": {"selector": "bottom"}}
{"scroll": {"selector": "#pricing"}}
{"scroll": {"element": "#scrollable-list", "selector": "bottom", "infinite": 2}}

Execute

timeout:int=3000 timeout budget (ms): +${timeout}

Execute a JavaScript script and store its result if it returns one.

Internal Workflow

  • The JavaScript code is executed
  • If the code returns a value, it is stored and available in the API response under result.browser_data.js_scenario.steps; every "execute" step has a result entry
  • Async/await functions are supported

Parameters

  • script:string Script to execute; it can return a serializable value
  • timeout:int Maximum time to wait once the script execution has started, expressed in milliseconds

Usage

{"execute": {"script": "document.querySelector(\"body\").style.backgroundColor = \"red\";"}}
{"execute": {"script": "return navigator.userAgent", "timeout": 1000}}
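Escaping quotes by hand inside a script string is error-prone. As a sketch, the step can be built as a Python dict and serialized with json.dumps, which handles the escaping. The helper name execute_step is hypothetical.

```python
import json


def execute_step(script, timeout_ms=None):
    """Build an "execute" step; the timeout is sent as an integer (ms)."""
    config = {"script": script}
    if timeout_ms is not None:
        config["timeout"] = int(timeout_ms)
    return {"execute": config}


steps = [
    execute_step('document.querySelector("body").style.backgroundColor = "red";'),
    execute_step("return navigator.userAgent", timeout_ms=1000),
]

for step in steps:
    print(json.dumps(step))  # inner double quotes are escaped automatically
```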

Wait For Navigation

timeout:int=1000 timeout budget (ms): +${timeout} + 1500

Time to wait for a navigation (page change) to be detected. The given timeout plus 1500 (1.5s) is added to the scenario budget; the additional time represents the average duration of a standard page load (with assets, XHR, etc.). For example, if you set a timeout of 1000, 2500 is counted.

Parameters

  • timeout:int Maximum time to wait for a navigation, expressed in milliseconds

Usage

{"wait_for_navigation": {}}
{"wait_for_navigation": {"timeout": 5000}}

Wait For Selector

selector:string=body state:string=visible timeout budget (ms): +${timeout}

Wait until the element is visible in the page (state=visible) or until it disappears (state=hidden). If the selector does not reach the desired state before the timeout, the step fails and the scenario is aborted. The timeout is added to the scenario budget.

Parameters

  • selector:string=body a valid CSS selector, XPath, or "body"
  • state:string=visible state of the element in the page: "visible" or "hidden"
  • timeout:int=5000 Time to wait before failing, expressed in milliseconds

Usage

{"wait_for_selector": {"selector": "#pricing"}}
{"wait_for_selector": {"selector": "#loading", "state": "hidden", "timeout": 10000}}
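A hypothetical scenario combining the steps above: dismiss an optional cookie banner, wait for a loader to disappear, wait for the content, then read it. The selectors are reused from the usage examples on this page, and the check_wait_steps helper is only an illustrative client-side sanity check, not part of the API.

```python
import json

ALLOWED_STATES = {"visible", "hidden"}

scenario = [
    {"click": {"selector": ".cookie-gdpr-consent", "ignore_if_not_visible": True}},
    {"wait_for_selector": {"selector": "#loading", "state": "hidden", "timeout": 10000}},
    {"wait_for_selector": {"selector": "#pricing"}},
    {"execute": {"script": "return document.querySelector('#pricing').innerText"}},
]


def check_wait_steps(steps):
    """Reject unknown states early; return the number of wait steps checked."""
    checked = 0
    for step in steps:
        config = step.get("wait_for_selector")
        if config is None:
            continue
        state = config.get("state", "visible")  # documented default
        if state not in ALLOWED_STATES:
            raise ValueError(f"unsupported state: {state!r}")
        checked += 1
    return checked


print(check_wait_steps(scenario), "wait_for_selector steps look valid")
print(json.dumps(scenario, indent=2))
```

Since a failed wait_for_selector aborts the scenario, placing it before the execute step means the script is not run against a page that is still loading.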