JavaScript Rendering
Scrapfly's headless browser feature is built for scraping pages whose content is rendered with JavaScript. On our cloud-based platform, each scrape runs on a dedicated browser instance that is optimized to respond quickly and reliably.
Resource caching is powered by a global private CDN, and the platform is designed to handle proxy peering. Scraping JavaScript-rendered content is slower and requires more resources, including proxy load and page rendering with JavaScript execution. The scraping time depends on factors such as the proxy location, the distance to the website's hosting, content size, number of resources, and amount of JavaScript.
When JavaScript Rendering is enabled, we track several resources:
- Intermediate HTTP queries (request/response)
- Local Storage
- Session Storage
- Screenshot (on demand)
- Remote javascript execution result (on demand)
- Websockets (Upgrade request and dataframe)
Rendering delay
You might need to wait before we extract the content of the page. We provide two ways to achieve that: a fixed delay with the rendering_wait parameter, or waiting for an element with the wait_for_selector parameter.
In the example below, Scrapfly waits 5s before extracting the content of the page. The rendering_wait parameter is expressed in milliseconds. The maximum allowed wait is 25s.
var request = require('request');
var options = {
  'method': 'GET',
  // render_js=true enables the browser; rendering_wait=5000 waits 5s before extraction
  'url': 'https://api.scrapfly.io/scrape?key=__API_KEY__&url=https%3A%2F%2Fhttpbin.dev%2Fanything&render_js=true&rendering_wait=5000'
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});
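The delay limit described above can be checked before a request is sent. The following is a minimal sketch, not part of any Scrapfly SDK: buildScrapeUrl and MAX_RENDERING_WAIT_MS are hypothetical names, and the 25000 ms cap is simply the documented 25s limit. The code only builds the URL and does not send a request.

```javascript
// Hypothetical helper (not part of any Scrapfly SDK): builds a scrape URL
// with a fixed rendering delay and rejects values above the documented 25s cap.
const MAX_RENDERING_WAIT_MS = 25000;

function buildScrapeUrl(apiKey, targetUrl, renderingWaitMs) {
  if (!Number.isInteger(renderingWaitMs) || renderingWaitMs < 0) {
    throw new RangeError('rendering_wait must be a non-negative integer (milliseconds)');
  }
  if (renderingWaitMs > MAX_RENDERING_WAIT_MS) {
    throw new RangeError('rendering_wait cannot exceed ' + MAX_RENDERING_WAIT_MS + ' ms');
  }
  const url = new URL('https://api.scrapfly.io/scrape');
  url.searchParams.set('key', apiKey);
  url.searchParams.set('url', targetUrl); // percent-encoded automatically
  url.searchParams.set('render_js', 'true');
  url.searchParams.set('rendering_wait', String(renderingWaitMs));
  return url.toString();
}

console.log(buildScrapeUrl('__API_KEY__', 'https://httpbin.dev/anything', 5000));
```

Validating on the client side avoids spending a request on a value the API would not accept.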
- CSS selectors and XPath expressions are case-sensitive
- Characters such as ~, : and / in a CSS selector need to be escaped with \\. Example: #selector:1234 becomes #selector\\:1234
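As an illustration of the escaping rule, here is a small sketch. escapeSelectorName is a hypothetical helper, and it follows the #selector\\:1234 example literally by emitting two backslash characters; whether one or two backslashes are expected on the wire is not stated here, so treat that as an assumption.

```javascript
// Hypothetical helper: escapes ~, : and / inside an id or class NAME.
// It must not be applied to a whole selector, since that would also
// escape the ':' of pseudo-classes such as :not(...).
function escapeSelectorName(name) {
  // '\\\\' in a JavaScript string is two backslash characters, which follows
  // the '#selector\\:1234' example literally (an assumption); '$&' is the match.
  return name.replace(/[~:\/]/g, '\\\\$&');
}

const selector = '#' + escapeSelectorName('selector:1234');
console.log(selector);
```

The escaped selector still has to be URL-encoded before it is placed in the query string.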
We will wait until an element matching .quote:not([style*="display:none"]) is present and visible. The selector watcher times out after 15s.
Selectors are case-sensitive and need to be URL-encoded.
var request = require('request');
var options = {
  'method': 'GET',
  // wait_for_selector carries the URL-encoded CSS selector
  'url': 'https://api.scrapfly.io/scrape?key=__API_KEY__&url=https%3A%2F%2Fquotes.toscrape.com%2Fjs%2F&render_js=true&wait_for_selector=.quote%3Anot%28%5Bstyle%2A%3D%22display%3Anone%22%5D%29'
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});
A more robust solution is to use XPath. The following expression achieves the same result as the previous example:
//*[contains(concat(" ",normalize-space(@class)," ")," quote ") and not(contains(@style,'display:none'))]
var request = require('request');
var options = {
  'method': 'GET',
  // wait_for_selector carries the URL-encoded XPath expression
  'url': 'https://api.scrapfly.io/scrape?key=__API_KEY__&url=https%3A%2F%2Fquotes.toscrape.com%2Fjs%2F&render_js=true&wait_for_selector=%2F%2F%2A%5Bcontains%28concat%28%22%20%22%2Cnormalize-space%28%40class%29%2C%22%20%22%29%2C%22%20quote%20%22%29%20and%20not%28contains%28%40style%2C%27display%3Anone%27%29%29%5D'
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});
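Hand-encoding a long selector is error-prone, so it may be easier to let Node's built-in URL class do it. This is a sketch under that assumption: it builds the same kind of request URL as the example above and checks that the XPath expression survives the encode/decode round trip. No request is sent.

```javascript
// Sketch: URL-encode an XPath for wait_for_selector and confirm it round-trips.
const xpath = '//*[contains(concat(" ",normalize-space(@class)," ")," quote ") ' +
  "and not(contains(@style,'display:none'))]";

const url = new URL('https://api.scrapfly.io/scrape');
url.searchParams.set('key', '__API_KEY__');
url.searchParams.set('url', 'https://quotes.toscrape.com/js/');
url.searchParams.set('render_js', 'true');
url.searchParams.set('wait_for_selector', xpath); // encoded automatically

// Parsing the URL again yields the original, unencoded expression.
const decoded = new URL(url.toString()).searchParams.get('wait_for_selector');
console.log(decoded === xpath);
```

Note that URLSearchParams writes spaces as + rather than %20; both forms decode to the same value under standard form decoding.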
JavaScript Execution
We provide a way to inject your own JavaScript to be executed on the web page.
You must encode your script in base64, which every language supports. Your JavaScript will be executed after the rendering delay and before the awaited selector (if defined).
You can return any serializable value and retrieve it through our API response under
response['result']['browser_data']['javascript_evaluation_result']
We will execute this script on news.ycombinator.com to retrieve article titles:
return Array.from(document.querySelectorAll('td.title > a')).map((el) => el.textContent)
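We encode this script in base64:
cmV0dXJuIEFycmF5LmZyb20oZG9jdW1lbnQucXVlcnlTZWxlY3RvckFsbCgndGQudGl0bGUgPiBhJykpLm1hcCgoZWwpID0+IGVsLnRleHRDb250ZW50KQ==
Then call our API and provide your script as in this example. Note that the js value in the URL is the URL-safe form of the base64 string: + is replaced by - and the trailing = padding is removed.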
var request = require('request');
var options = {
  'method': 'GET',
  // js carries the base64-encoded script
  'url': 'https://api.scrapfly.io/scrape?key=__API_KEY__&url=https%3A%2F%2Fnews.ycombinator.com&render_js=true&js=cmV0dXJuIEFycmF5LmZyb20oZG9jdW1lbnQucXVlcnlTZWxlY3RvckFsbCgndGQudGl0bGUgPiBhJykpLm1hcCgoZWwpID0-IGVsLnRleHRDb250ZW50KQ'
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});
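The encoding step can be reproduced with Node's Buffer. This is a minimal sketch: the URL-safe conversion (+ to -, / to _, padding removed) is inferred from the js value used in the example call, not from a stated rule, so treat it as an assumption.

```javascript
// Sketch: encode a script for the js parameter using Node's Buffer.
const script = "return Array.from(document.querySelectorAll('td.title > a')).map((el) => el.textContent)";

// Standard base64 encoding of the script.
const standard = Buffer.from(script, 'utf8').toString('base64');

// URL-safe variant (assumption based on the example URL):
// '+' -> '-', '/' -> '_', trailing '=' padding removed.
const urlSafe = standard.replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');

console.log(standard);
console.log(urlSafe);
```

The same function works for any snippet, for example a script that scrolls the page before extraction.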
Snippets
Scroll to the bottom of the page. Some websites only render their full HTML once the page has been scrolled:
window.scrollTo(0,document.body.scrollHeight);
Example of Result
The value under response['result']['browser_data']['javascript_evaluation_result'] contains what the script returned:
[
"US Travel firm $4.5m ransom negotiation open chat",
"Laws of UX",
"Briar Project",
"Pleroma: A Mastodon-compatible open and federated social networking server",
"Mastodon 3.2",
"A philosophical difference between Haskell and Lisp",
"Show HN: High performance X11 animated wallpapers",
"When I raised my B2B SaaS\u2019s prices",
"Illustrated Self-Guided Course On How To Use The Slide Rule",
"Facebook hate-speech boycott had little effect on revenue",
"SpaceX Crew Dragon Splashes Down in the Gulf of Mexico",
"Why Can't We All Just Get Along? Uncertain Biological Basis of Morality (2013)",
"\u201cZombie cicadas\u201d infected with mind-controlling fungus return to West Virginia",
"What is a Product Roadmap?",
"Brain-Gut Circuit Lets Microbiota Directly Affect the Sympathetic Nervous System",
"How to Run Turing Machines on Encrypted Data [pdf]",
"A collection of books, talks, and papers on security engineering",
"Rethinking the Science of Skin",
"GITenberg is an open source community for publishing ebooks in the public domain",
"How real are real numbers? (2004)",
"Beyond Bitswap",
"What I Learned About Failing from My 5 Year Indie Game Dev Project",
"The Architecture of the Medieval Page (2018)",
"I Still Use an Old PowerPC Mac in 2020",
"Microsoft to continue discussions on potential TikTok purchase in the US",
"Lord and Taylor, Oldest U.S. Department Store, Files Bankruptcy",
"\u03bcPlot v1.1 \u2013 now with log scales support",
"OCaml for the Skeptical: OCaml in a Nutshell (2006)",
"GPU Accelerated JavaScript",
"Show HN: Create beautiful landing pages by copy-paste",
"More"
]
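Reading that value from a response body might look like the sketch below. getEvaluationResult is a hypothetical helper and sampleBody is a hand-written, trimmed stand-in for a real API response; only the key path comes from the documentation above.

```javascript
// Sketch: read the value returned by the injected script from an API response body.
// sampleBody is a hand-written, trimmed stand-in for a real response.
const sampleBody = JSON.stringify({
  result: {
    browser_data: {
      javascript_evaluation_result: ['Laws of UX', 'Briar Project', 'More']
    }
  }
});

function getEvaluationResult(body) {
  const data = JSON.parse(body);
  const browserData = data.result && data.result.browser_data;
  // Returns null when browser data is missing; the field itself may also be null.
  return browserData ? browserData.javascript_evaluation_result : null;
}

const titles = getEvaluationResult(sampleBody);
console.log(titles.length, titles[0]);
```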
Screenshot
You can take multiple screenshots when you use a browser. The screenshot feature is covered on its own dedicated documentation page.
Resource Tracking
Intermediate requests made by JavaScript through XHR or Fetch are tracked. We provide the request with its headers, method, and body, and the same for the response. You can also retrieve the content of Local Storage and Session Storage. This data is available in the result section of our API response, under response['result']['browser_data']:
...
"result": {
...,
"browser_data": {
"xhr_call": [
{
"url": "https://aan.amazon.fr/cem",
"headers": {
"user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36",
"content-type": "application/json",
"accept": "*/*",
"referer": "https://images-eu.ssl-images-amazon.com/images/G/08/ape/sf/whitelisted/desktop/sf-1.50.628cb61._V408130105_.html"
},
"method": "POST",
"body": "{\"render_id\":\"4a7152f0-cb58-4de8-b152-f0cb58cde8a2\",\"event_type\":\"impression\",\"dimensions\":{\"subtype\":\"impression\",\"value\":1,\"template_name\":\"Dynamic eCommerce - universal\"}}"
},
{
"url": "https://www.amazon.fr/gp/customer-reviews/aj/private/reviewsGallery/get-image-gallery-assets",
"headers": {
"rtt": "0",
"accept": "text/html,*/*",
"x-requested-with": "XMLHttpRequest",
"downlink": "10",
"ect": "4g",
"user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36",
"content-type": "application/x-www-form-urlencoded",
"referer": "https://www.amazon.fr/gp/product/B008AVQXDO?pf_rd_r=APG7NKFQ8DTBPK2TEN8R&pf_rd_p=70373c30-7461-4a24-bb1f-f3fde4f2df3a",
"cookie": "session-id=261-7851197-2783504; i18n-prefs=EUR; ubid-acbfr=262-5387700-5547500; session-id-time=2082754801l; x-wl-uid=145H5Y5j+m7oe7NpElaItmpA5YWGFqUy34ZvPnc+Yd8m+UIZC49+YTzyieSn/K4Kfq162NF1AbZo=; session-token=aLl1Sgktrzq+wYbYCVAKoXJA+3aIAhtP36mNtxkpZORbiSqd3ur/uaU6W1aHycEtUy4LpAJrcV2YmGqNHYb4trXCj3Wt4Vxc5W/aCaww5HctUNsijeRB2Dxp/ca1gtYdEEpTJGBprLlnrFg85RsOkfiWb9nysakwy54GjF9aOjksmN0ip3XCgDbO9uIZ7/X8lgM7pTDy7tTVBJtRvK79S/k9PbfDxEjXULIpNE8iYBdTvm95Xevgmgr1nouA1frzwUFYYzhCg1k=; csm-hit=tb:s-5B0K136YR4QK89MQ8RG0|1596420691120&t:1596420692684&adb:adblk_no"
},
"method": "POST",
"body": null
},
{
"url": "https://www.amazon.fr/gp/customer-reviews/aj/private/reviewsGallery/get-application-resources-for-reviews-gallery",
"headers": {
"rtt": "0",
"accept": "*/*",
"x-requested-with": "XMLHttpRequest",
"downlink": "10",
"ect": "4g",
"user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36",
"content-type": "application/x-www-form-urlencoded",
"referer": "https://www.amazon.fr/gp/product/B008AVQXDO?pf_rd_r=APG7NKFQ8DTBPK2TEN8R&pf_rd_p=70373c30-7461-4a24-bb1f-f3fde4f2df3a",
"cookie": "session-id=261-7851197-2783504; i18n-prefs=EUR; ubid-acbfr=262-5387700-5547500; session-id-time=2082754801l; x-wl-uid=145H5Y5j+m7oe7NpElaItmpA5YWGFqUy34ZvPnc+Yd8m+UIZC49+YTzyieSn/K4Kfq162NF1AbZo=; session-token=aLl1Sgktrzq+wYbYCVAKoXJA+3aIAhtP36mNtxkpZORbiSqd3ur/uaU6W1aHycEtUy4LpAJrcV2YmGqNHYb4trXCj3Wt4Vxc5W/aCaww5HctUNsijeRB2Dxp/ca1gtYdEEpTJGBprLlnrFg85RsOkfiWb9nysakwy54GjF9aOjksmN0ip3XCgDbO9uIZ7/X8lgM7pTDy7tTVBJtRvK79S/k9PbfDxEjXULIpNE8iYBdTvm95Xevgmgr1nouA1frzwUFYYzhCg1k=; csm-hit=tb:s-5B0K136YR4QK89MQ8RG0|1596420691120&t:1596420692684&adb:adblk_no"
},
"method": "POST",
"body": "noCache=1596420693002"
},
{
"url": "https://www.amazon.fr/gp/cerberus/gv",
"headers": {
"rtt": "0",
"user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36",
"content-type": "application/x-www-form-urlencoded",
"accept": "*/*",
"cache-control": "no-cache",
"x-requested-with": "XMLHttpRequest",
"downlink": "10",
"ect": "4g",
"referer": "https://www.amazon.fr/gp/product/B008AVQXDO?pf_rd_r=APG7NKFQ8DTBPK2TEN8R&pf_rd_p=70373c30-7461-4a24-bb1f-f3fde4f2df3a",
"cookie": "session-id=261-7851197-2783504; i18n-prefs=EUR; ubid-acbfr=262-5387700-5547500; session-id-time=2082754801l; x-wl-uid=145H5Y5j+m7oe7NpElaItmpA5YWGFqUy34ZvPnc+Yd8m+UIZC49+YTzyieSn/K4Kfq162NF1AbZo=; session-token=aLl1Sgktrzq+wYbYCVAKoXJA+3aIAhtP36mNtxkpZORbiSqd3ur/uaU6W1aHycEtUy4LpAJrcV2YmGqNHYb4trXCj3Wt4Vxc5W/aCaww5HctUNsijeRB2Dxp/ca1gtYdEEpTJGBprLlnrFg85RsOkfiWb9nysakwy54GjF9aOjksmN0ip3XCgDbO9uIZ7/X8lgM7pTDy7tTVBJtRvK79S/k9PbfDxEjXULIpNE8iYBdTvm95Xevgmgr1nouA1frzwUFYYzhCg1k=; csm-hit=tb:s-5B0K136YR4QK89MQ8RG0|1596420691120&t:1596420692684&adb:adblk_no"
},
"method": "POST",
"body": "payload=%7B%22producerId%22%3A%22detail-page%22%2C%22asin%22%3A%22B008AVQXDO%22%2C%22asin_price%22%3A%229.49%22%2C%22asin_shipping_price%22%3A%220%22%2C%22asin_currency_code%22%3A%22EUR%22%2C%22device_type%22%3A%22WEB%22%2C%22display_code%22%3A%22Asin+is+not+eligible+because+it+has+a+retail+offer%22%2C%22substitute_count%22%3A%22-1%22%7D"
}
],
"local_storage_data": {
"csm-hit": "tb:s-5B0K136YR4QK89MQ8RG0|1596420691120&t:1596420692684&adb:adblk_no",
"csm:adb": "adblk_no",
"csm-bf": "[\"5B0K136YR4QK89MQ8RG0\"]",
"a-font-class": "a-ember"
},
"session_storage_data": {
"csm-hit": "tb:s-5B0K136YR4QK89MQ8RG0|1596420691120&t:1596420692684&adb:adblk_no",
"csm:adb": "adblk_no",
"csm-bf": "[\"5B0K136YR4QK89MQ8RG0\"]",
"a-font-class": "a-ember"
},
"websockets": [],
"javascript_evaluation_result": null
},
...
}
...
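Once the response is parsed, the tracked data can be walked like any other object. The sketch below uses a hand-written, trimmed sample that mirrors the key names shown above; summariseXhrCalls is a hypothetical helper name.

```javascript
// Sketch: summarise tracked XHR/Fetch calls and read a storage value.
// sampleResponse is a trimmed, hand-written stand-in for a real response.
const sampleResponse = {
  result: {
    browser_data: {
      xhr_call: [
        { url: 'https://aan.amazon.fr/cem', method: 'POST', headers: { accept: '*/*' }, body: '{}' },
        { url: 'https://www.amazon.fr/gp/cerberus/gv', method: 'POST', headers: {}, body: null }
      ],
      local_storage_data: { 'csm:adb': 'adblk_no' },
      session_storage_data: { 'a-font-class': 'a-ember' },
      websockets: [],
      javascript_evaluation_result: null
    }
  }
};

function summariseXhrCalls(response) {
  const calls = response.result.browser_data.xhr_call || [];
  return calls.map((call) => call.method + ' ' + call.url);
}

console.log(summariseXhrCalls(sampleResponse));
console.log(sampleResponse.result.browser_data.local_storage_data['csm:adb']);
```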
Limitations
The JavaScript rendering feature is only available with the GET method. You can't use a browser to send POST, PATCH, PUT, or HEAD requests.
The following XHR / Fetch resources are not tracked:
- Fonts: .woff, .woff2, .otf, .ttf
- Media: .webm, .oga, .aac, .m4a, .mp3, .wav, .mp4
- Image: .svg, .png, .gif, .jpg, .jpeg, .ico
- Style: .css
- Other: .pbf
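To predict whether a given resource will be missing from the tracked calls, the extension list can be turned into a lookup. This is a sketch only: isLikelyUntracked is a hypothetical helper, and matching on the file extension of the URL path is an assumption, since the documentation does not say how resources are classified.

```javascript
// Sketch: predict whether a resource URL falls in the untracked list.
// Matching on the path's file extension is an assumption.
const UNTRACKED_EXTENSIONS = new Set([
  '.woff', '.woff2', '.otf', '.ttf',                       // fonts
  '.webm', '.oga', '.aac', '.m4a', '.mp3', '.wav', '.mp4', // media
  '.svg', '.png', '.gif', '.jpg', '.jpeg', '.ico',         // images
  '.css',                                                  // style
  '.pbf'                                                   // other
]);

function isLikelyUntracked(resourceUrl) {
  // Use only the path, so query strings such as ?v=3 are ignored.
  const pathname = new URL(resourceUrl).pathname.toLowerCase();
  const dot = pathname.lastIndexOf('.');
  return dot !== -1 && UNTRACKED_EXTENSIONS.has(pathname.slice(dot));
}

console.log(isLikelyUntracked('https://example.com/static/app.css?v=3'));
console.log(isLikelyUntracked('https://example.com/api/items'));
```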
You can retrieve an emitted XHR call with its associated URL, headers, body, and method. We do not attach the response. If you need the response content, you can call the XHR URL directly.
It is not possible to download media or image content directly with a browser: the browser loads the image URL in an HTML document and displays it in an img tag. Without the browser, you retrieve the binary content encoded in base64.
Related Errors
All related errors are listed below. You can find the full description and example error responses in the Errors section.
- 422 - ERR::SCRAPE::DRIVER_CRASHED
- 422 - ERR::SCRAPE::DRIVER_TIMEOUT
- 422 - ERR::SCRAPE::JAVASCRIPT_EXECUTION
- 422 - ERR::SCRAPE::NO_BROWSER_AVAILABLE
Pricing
Using JavaScript rendering costs 5 Scrape API Credits against your quota. Keep in mind that JavaScript Rendering is slow and uses a lot of data and resources. For maximum performance, avoid it when it is not required.
The API response contains the header X-Scrapfly-Api-Cost, which indicates the billed amount.
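Reading the cost header in Node could look like the sketch below. getApiCost is a hypothetical helper; it relies on Node exposing incoming header names in lowercase, and it assumes the header value is a plain number. The headers object is hand-written and stands in for response.headers.

```javascript
// Sketch: read the billed credits from response headers.
// Node lowercases incoming header names, so the lookup key is lowercase.
// Treating the value as a number is an assumption.
function getApiCost(headers) {
  const raw = headers['x-scrapfly-api-cost'];
  if (raw === undefined) return null;
  const cost = Number(raw);
  return Number.isNaN(cost) ? null : cost;
}

// Hand-written headers object standing in for response.headers.
console.log(getApiCost({ 'x-scrapfly-api-cost': '5' }));
```

Logging this value per request makes it easy to see which scrapes consume the most credits.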