Javascript Scenario BETA
Usage
This feature requires Javascript Rendering to be enabled
Allows your app to interact with the website and chain multiple actions. A scenario is a sequence of one or more actions. A scenario has a budget of 25s to execute. If the estimated "worst" case (every action waiting for its maximum timeout) exceeds 25s, the scenario is rejected.
The Javascript scenario must be base64 encoded with the URL-safe option
For long-running Javascript scenarios requiring more than 25s, see how the timeout works.
TL;DR: with retry=false, the request times out after 90s by default, and you can customize the timeout with retry=false&timeout=120000
What a scenario looks like
[
{"fill": {"selector": "#username", "value":"demo"}},
{"fill": {"selector": "#password", "value":"demo"}},
{"click": {"selector": "form input[type='submit']"}},
{"wait_for_navigation": {"timeout": 5000}}
]
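The encoding step can be sketched in Python using only the standard library. The helper names encode_scenario and decode_scenario are illustrative, not part of any SDK, and keeping the trailing "=" padding is an assumption, since the padding rule is not stated here.

```python
import base64
import json

# The scenario shown above, written as plain Python data.
scenario = [
    {"fill": {"selector": "#username", "value": "demo"}},
    {"fill": {"selector": "#password", "value": "demo"}},
    {"click": {"selector": "form input[type='submit']"}},
    {"wait_for_navigation": {"timeout": 5000}},
]


def encode_scenario(steps):
    """Serialize the steps to compact JSON, then base64 encode with the URL-safe alphabet."""
    raw = json.dumps(steps, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_scenario(encoded):
    """Reverse of encode_scenario, handy for checking what was actually sent."""
    return json.loads(base64.urlsafe_b64decode(encoded))


encoded = encode_scenario(scenario)
print(encoded)
```

Decoding the value before sending it is a cheap way to confirm the JSON survived the round trip.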
Full example with API Player
curl -G \
--request "GET" \
--url "https://api.scrapfly.io/scrape" \
--data-urlencode "key=__API_KEY__" \
--data-urlencode "url=https://quotes.toscrape.com/login" \
--data-urlencode "tags=player,project:default" \
--data-urlencode "render_js=true" \
--data-urlencode "screenshots[test]=fullpage" \
--data-urlencode "js_scenario=W3siZmlsbCI6eyJzZWxlY3RvciI6IiN1c2VybmFtZSIsInZhbHVlIjoiZGVtbyJ9fSx7ImZpbGwiOnsic2VsZWN0b3IiOiIjcGFzc3dvcmQiLCJ2YWx1ZSI6ImRlbW8ifX0seyJjbGljayI6eyJzZWxlY3RvciI6ImZvcm0gaW5wdXRbdHlwZT0nc3VibWl0J10ifX0seyJ3YWl0X2Zvcl9uYXZpZ2F0aW9uIjpbXX1d"
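The same request URL can be assembled in Python. This is a minimal sketch that only builds the URL and sends nothing; build_scrape_url and its extra argument are hypothetical names chosen for illustration, and __API_KEY__ is the placeholder used in the curl example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

API_ENDPOINT = "https://api.scrapfly.io/scrape"

# Encoded scenario copied from the curl example.
JS_SCENARIO_B64 = "W3siZmlsbCI6eyJzZWxlY3RvciI6IiN1c2VybmFtZSIsInZhbHVlIjoiZGVtbyJ9fSx7ImZpbGwiOnsic2VsZWN0b3IiOiIjcGFzc3dvcmQiLCJ2YWx1ZSI6ImRlbW8ifX0seyJjbGljayI6eyJzZWxlY3RvciI6ImZvcm0gaW5wdXRbdHlwZT0nc3VibWl0J10ifX0seyJ3YWl0X2Zvcl9uYXZpZ2F0aW9uIjpbXX1d"


def build_scrape_url(api_key, target_url, js_scenario_b64, extra=None):
    """Return the GET URL for a scrape with Javascript rendering and a scenario."""
    params = {
        "key": api_key,
        "url": target_url,
        "render_js": "true",  # scenarios require Javascript Rendering
        "js_scenario": js_scenario_b64,
    }
    params.update(extra or {})
    # urlencode percent-encodes every value, like --data-urlencode does in curl.
    return API_ENDPOINT + "?" + urlencode(params)


request_url = build_scrape_url(
    "__API_KEY__",
    "https://quotes.toscrape.com/login",
    JS_SCENARIO_B64,
    extra={"tags": "player,project:default", "screenshots[test]": "fullpage"},
)
print(request_url)
```

For a long-running scenario, the retry and timeout parameters mentioned earlier could be passed the same way, for example extra={"retry": "false", "timeout": "120000"}.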
Example of response with scenario
...
"result": {
...,
"browser_data": {
"xhr_call": [...],
"local_storage_data": {
"csm-hit": "tb:s-5B0K136YR4QK89MQ8RG0|1596420691120&t:1596420692684&adb:adblk_no",
"csm:adb": "adblk_no",
"csm-bf": "[\"5B0K136YR4QK89MQ8RG0\"]",
"a-font-class": "a-ember"
},
"session_storage_data": {
"csm-hit": "tb:s-5B0K136YR4QK89MQ8RG0|1596420691120&t:1596420692684&adb:adblk_no",
"csm:adb": "adblk_no",
"csm-bf": "[\"5B0K136YR4QK89MQ8RG0\"]",
"a-font-class": "a-ember"
},
"websockets": [],
"javascript_evaluation_result": null,
"js_scenario": {
"duration": 4.92,
"executed": 5,
"steps": [
{
"action": "fill",
"config": {
"selector": "#username",
"value": "demo"
},
"duration": 1.11,
"executed": true,
"result": null,
"success": true
},
{
"action": "fill",
"config": {
"selector": "#password",
"value": "demo"
},
"duration": 0.47,
"executed": true,
"result": null,
"success": true
},
{
"action": "click",
"config": {
"ignore_if_not_visible": false,
"selector": "form input[type='submit']"
},
"duration": 0.52,
"executed": true,
"result": null,
"success": true
},
{
"action": "wait_for_navigation",
"config": {
"expect_url": null,
"timeout": 5000
},
"duration": 2.81,
"executed": true,
"result": null,
"success": true
},
{
"action": "execute",
"config": "return document.location.toString()",
"duration": 0.01,
"executed": true,
"result": "http://quotes.toscrape.com/",
"success": true
}
]
},
},
...
}
...
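Reading the scenario report from such a response could look like the sketch below. It assumes the response body has already been parsed into a Python dict; summarize_scenario is a hypothetical helper, and the sample is a trimmed copy of the response above, not real output from a new request.

```python
def summarize_scenario(api_response):
    """Collect the outcome of a scenario from result.browser_data.js_scenario."""
    report = api_response["result"]["browser_data"]["js_scenario"]
    steps = report["steps"]
    return {
        "duration": report["duration"],
        "executed": report["executed"],
        # Steps that ran but did not succeed.
        "failed": [s for s in steps if s["executed"] and not s["success"]],
        # Only "execute" steps carry a meaningful result value.
        "execute_results": [s["result"] for s in steps if s["action"] == "execute"],
    }


# Trimmed copy of the example response (only two of the five steps are kept).
sample_response = {
    "result": {
        "browser_data": {
            "js_scenario": {
                "duration": 4.92,
                "executed": 5,
                "steps": [
                    {
                        "action": "wait_for_navigation",
                        "config": {"expect_url": None, "timeout": 5000},
                        "duration": 2.81,
                        "executed": True,
                        "result": None,
                        "success": True,
                    },
                    {
                        "action": "execute",
                        "config": "return document.location.toString()",
                        "duration": 0.01,
                        "executed": True,
                        "result": "http://quotes.toscrape.com/",
                        "success": True,
                    },
                ],
            }
        }
    }
}

summary = summarize_scenario(sample_response)
print(summary["execute_results"])
```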
Params Reference
- [MANDATORY] param_name:type
- [OPTIONAL] param_name:type
Click
selector:string ignore_if_not_visible:bool=false timeout budget (ms): +2500
Click on a visible element. It is a native click that emits a trusted event; it is not simulated with Javascript.
Internal Workflow
- Wait for the element to be visible
- Move to the element (mouse and scroll) like a human
- Trigger the focus of the element
- Left click
Parameters
- selector:string Accepts a CSS selector or an XPath selector
- ignore_if_not_visible:bool Wait for the element to be visible and skip the step if it is not
Usage
{"click": {"selector": ".cookie-gdpr-consent", "ignore_if_not_visible": true}}
{"click": {"selector": "submit.btn"}}
Fill
selector:string value:string timeout budget (ms): +${timeout} +500
Type the provided value into the targeted element. The typing is not simulated in Javascript; it comes from real keyboard input.
Internal Workflow
- Wait for the element to be visible
- Move to the element (mouse and scroll) like a human
- Trigger the focus of the element
- Type the value in the input like a human
Parameters
- selector:string Accepts a CSS selector or an XPath selector
- value:string Value to type into the element
- clear:boolean Clear the input field before writing
Usage
{"fill": {"selector": "#name", "value": "John Do"}}
Condition
status_code:int
Play the scenario only if the condition is met
Internal Workflow
- Check the equality of the status code with the response status code
Parameters
- status_code:int Any integer
Usage
{"condition": {"status_code": 200}}
Wait
timeout budget (ms): +${wait}
Pause during the scenario. The whole pause time is added to the scenario budget.
Parameters
There is no parameter; pass the value directly, expressed in milliseconds
Usage
{"wait": 2000}
Scroll
element:string=body selector:string=bottom timeout budget (ms): +500
Scroll to the selector (if no selector is given, scroll to the bottom). If the element parameter is a valid selector, the scroll happens within that element. The scroll is not simulated with Javascript; it uses real mouse input.
Internal Workflow
- Wait until the element is visible
- Wait until the selector is visible
- Scroll like a human
Parameters
- element:string=body A valid CSS selector, XPath, or "body"
- selector:string A valid CSS selector, XPath, or "bottom"
- infinite:int=0 Infinite scroll: number of scroll iterations
Usage
{"scroll": {"selector": "bottom"}}
{"scroll": {"selector": "#pricing"}}
{"scroll": {"element": "#scrollable-list", "selector": "bottom", "infinite": 2}}
Execute
timeout:int=3000 timeout budget (ms): +${timeout}
Execute a Javascript script and store the result if one is returned
Internal Workflow
- The Javascript code is executed
- If the Javascript code returns a value, it is stored and available in the API response under result.browser_data.js_scenario.steps; every "execute" step has a result entry
- Async/await functions are supported
Parameters
- script:string Script to execute; it can return a serializable value
- timeout:int Maximum time to wait once the script execution has started, expressed in milliseconds
Usage
{"execute": {"script": "document.querySelector(\"body\").style.backgroundColor = \"red\";"}}
{"execute": {"script": "return navigator.userAgent", "timeout": 1000}}
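Escaping quotes by hand inside the script string is error-prone, so one option is to let a JSON serializer do it. A minimal sketch, assuming Python on the client side; execute_step is a hypothetical helper and 3000 mirrors the documented default timeout.

```python
import json


def execute_step(script, timeout_ms=3000):
    """Build an "execute" step; json.dumps takes care of escaping the script."""
    return {"execute": {"script": script, "timeout": timeout_ms}}


# Double quotes inside the script are written naturally here.
step = execute_step('document.querySelector("body").style.backgroundColor = "red";')

serialized = json.dumps(step)
print(serialized)
```

The serialized step can then be placed in the scenario array before base64 encoding.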
Wait For Navigation
timeout:int=1000 timeout budget (ms): +${timeout} + 1500
Time to wait to detect a navigation (page change). The given timeout + 1500 (1.5s) is added to the scenario budget; this additional time represents the average duration of a standard page load (with assets, XHR, etc.). For example, if you set a timeout of 1000, 2500 is counted.
Parameters
- timeout:int Maximum time to wait for a navigation, expressed in milliseconds
Usage
{"wait_for_navigation": {}}
{"wait_for_navigation": {"timeout": 5000}}
Wait For Selector
selector:string=body state:string=visible timeout budget (ms): +${timeout}
Wait until the element is visible in the page (state=visible) or until it disappears (state=hidden). If the selector does not reach the desired state before the timeout, this step fails and the scenario is aborted. The timeout is added to the scenario budget.
Parameters
- selector:string=body A valid CSS selector, XPath, or "body"
- state:string=visible State of the element in the page: "visible" or "hidden"
- timeout:int=5000 Time to wait before failing, expressed in milliseconds
Usage
{"wait_for_selector": {"selector": "#pricing"}}
{"wait_for_selector": {"selector": "#loading", "state": "hidden", "timeout": 10000}}
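The per-action timeout budgets listed in this reference can be added up client-side to check a scenario against the 25s limit before sending it. This is a rough sketch, not the service's actual estimator. Assumptions: fill counts 0 for ${timeout} when no timeout is given (no default is documented), condition costs nothing, and infinite scroll iterations are not counted. step_budget_ms and scenario_budget_ms are hypothetical names.

```python
SCENARIO_BUDGET_MS = 25_000  # 25s budget per scenario


def step_budget_ms(step):
    """Worst-case budget of a single step, in milliseconds."""
    action, config = next(iter(step.items()))
    # Some steps take a plain value ("wait") or an empty config.
    options = config if isinstance(config, dict) else {}
    if action == "click":
        return 2500
    if action == "fill":
        return options.get("timeout", 0) + 500  # assumption: 0 when unset
    if action == "condition":
        return 0  # assumption: no budget is listed for this action
    if action == "wait":
        return config  # the whole pause is counted
    if action == "scroll":
        return 500  # assumption: "infinite" iterations are ignored
    if action == "execute":
        return options.get("timeout", 3000)
    if action == "wait_for_navigation":
        return options.get("timeout", 1000) + 1500
    if action == "wait_for_selector":
        return options.get("timeout", 5000)
    raise ValueError(f"unknown action: {action}")


def scenario_budget_ms(steps):
    """Sum of the worst-case budgets of every step."""
    return sum(step_budget_ms(step) for step in steps)


login = [
    {"fill": {"selector": "#username", "value": "demo"}},
    {"fill": {"selector": "#password", "value": "demo"}},
    {"click": {"selector": "form input[type='submit']"}},
    {"wait_for_navigation": {"timeout": 5000}},
]

total = scenario_budget_ms(login)
print(total, total <= SCENARIO_BUDGET_MS)
```

Under these assumptions the login scenario counts 500 + 500 + 2500 + 6500 = 10000 ms, well within the budget.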