Javascript Rendering

Some websites render their content through JavaScript, using intermediate calls known as XHR (XMLHttpRequest). To gather that data and get the page in the same state you see in your own browser, a real browser has to render it. Managing headless browsers at scale is complex and time-consuming: it takes a lot of resources and experience to reduce loading time, waste as few resources as possible, and limit the bandwidth going through proxies. If you have already tried this exercise, you certainly remember that long night. We handle thousands of browsers in our cluster and auto-scale on demand to provide all the resources you need; our browsers are optimized to respond as fast as possible and to be reliable (no freezes, no hanging proxies, etc.).

Our browsers use advanced technologies to be unique and undetectable; they look like real browsers in order to defeat the most advanced anti-bot vendors. Besides that, we also provide Anti-Scraping Protection (ASP) to go further at scale, regardless of the settings you provide. You don't have to wonder whether you will be detected: you will not. Your scrapes are isolated per browser: one browser instance = one scrape. We dedicate a whole browser instance to your scrape and never reuse it afterwards; we destroy the instance and recreate it from scratch.

When Javascript Rendering is enabled, we track multiple resources like:

  • Intermediate HTTP queries (request/response)
  • Local Storage
  • Session Storage
  • Screenshot (on demand)
  • Remote javascript execution result (on demand)
  • Websockets (Upgrade request and dataframe)

Scrapes that require JavaScript rendering are slower than usual: far more resources are loaded through proxies, and the page has to be rendered and its JavaScript executed.

The browser is backed by an advanced caching solution and a hop-by-hop peering solution to be as fast as possible, but it is still not as fast as a simple HTTP call.

The time required to scrape depends mostly on the distance between the proxy location and the website's hosting, the size of the content, the number of resources, and the amount of JavaScript to execute.

Rendering delay

You might need to wait before we extract the content of the page. We provide two ways to achieve that: a fixed delay (rendering_wait) and waiting for a selector (wait_for_selector).

In the example below, Scrapfly waits 5s before extracting the content of the page. The rendering_wait parameter is expressed in milliseconds. The maximum allowed wait is 25s.

curl -G \
--request "GET" \
--url "https://api.scrapfly.io/scrape" \
--data-urlencode "key=__API_KEY__" \
--data-urlencode "url=https://httpbin.dev/anything" \
--data-urlencode "proxy_pool=public_datacenter" \
--data-urlencode "render_js=true" \
--data-urlencode "rendering_wait=5000"
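
For readers calling the API from a script rather than from curl, the same request can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name build_scrape_url is hypothetical, and the 25000 ms cap simply mirrors the limit stated above.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://api.scrapfly.io/scrape"
MAX_RENDERING_WAIT_MS = 25_000  # the 25s limit stated above, in milliseconds


def build_scrape_url(api_key: str, target_url: str, rendering_wait_ms: int) -> str:
    """Return a GET URL that enables render_js with a fixed rendering delay.

    Hypothetical helper for illustration only.
    """
    if not 0 <= rendering_wait_ms <= MAX_RENDERING_WAIT_MS:
        raise ValueError("rendering_wait must be between 0 and 25000 ms")
    params = {
        "key": api_key,
        "url": target_url,
        "render_js": "true",
        "rendering_wait": str(rendering_wait_ms),
    }
    # urlencode percent-encodes every value, like curl's --data-urlencode.
    return API_ENDPOINT + "?" + urlencode(params)


print(build_scrape_url("__API_KEY__", "https://httpbin.dev/anything", 5000))
```

Validating the delay on the client side avoids spending a request on a value the API would not accept.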

  • CSS selectors and XPath expressions are case-sensitive
  • Characters like ~, :, / in a CSS selector need to be escaped with \\. Example: #selector:1234 becomes #selector\\:1234
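
The escaping rule above can be applied mechanically. This is a small Python sketch under two assumptions: the helper name escape_css_selector is hypothetical, and the doubled backslash in the note is taken literally (two backslash characters before each special character).

```python
SPECIAL_CHARS = "~:/"  # the characters named in the note above


def escape_css_selector(selector: str) -> str:
    """Prefix each special character with two backslashes (hypothetical helper)."""
    return "".join("\\\\" + ch if ch in SPECIAL_CHARS else ch for ch in selector)


print(escape_css_selector("#selector:1234"))  # prints: #selector\\:1234
```

Selectors without any of these characters, such as .quote, pass through unchanged.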

We will wait until an element matching .quote:not([style*="display:none"]) is present and visible. The selector watcher times out after 15s. Selectors are case-sensitive and need to be URL-encoded.

curl -G \
--request "GET" \
--url "https://api.scrapfly.io/scrape" \
--data-urlencode "key=__API_KEY__" \
--data-urlencode "url=https://quotes.toscrape.com/js/" \
--data-urlencode "render_js=true" \
--data-urlencode "wait_for_selector=.quote:not([style*=\"display:none\"])"

A more robust solution is to use XPath. The following expression achieves the same result as the previous example: //*[contains(concat(" ",normalize-space(@class)," ")," quote ") and not(contains(@style,'display:none'))]

curl -G \
--request "GET" \
--url "https://api.scrapfly.io/scrape" \
--data-urlencode "key=__API_KEY__" \
--data-urlencode "url=https://quotes.toscrape.com/js/" \
--data-urlencode "render_js=true" \
--data-urlencode "wait_for_selector=//*[contains(concat(\" \",normalize-space(@class),\" \"),\" quote \") and not(contains(@style,'display:none'))]"
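
Selectors with quotes, spaces, and brackets are easy to break when URL-encoding by hand. The sketch below shows one way to let Python's standard library do the encoding; build_wait_url is a hypothetical helper name, and the round trip at the end only checks that the encoding is lossless, not how the API interprets the selector.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# The XPath expression from the example above, split over two literals
# so that both quote styles can be written without backslashes.
XPATH = (
    '//*[contains(concat(" ",normalize-space(@class)," ")," quote ") '
    "and not(contains(@style,'display:none'))]"
)


def build_wait_url(api_key: str, target_url: str, selector: str) -> str:
    """Return a GET URL that waits for `selector` (hypothetical helper)."""
    params = {
        "key": api_key,
        "url": target_url,
        "render_js": "true",
        "wait_for_selector": selector,  # CSS or XPath, passed unmodified
    }
    return "https://api.scrapfly.io/scrape?" + urlencode(params)


url = build_wait_url("__API_KEY__", "https://quotes.toscrape.com/js/", XPATH)

# Decoding the query string gives back the selector exactly as written.
decoded = parse_qs(urlsplit(url).query)["wait_for_selector"][0]
print(decoded == XPATH)
```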

Javascript Execution

We provide a way to inject your own JavaScript to be executed on the web page.

You must encode your script in base64, which every language supports. Your JavaScript is executed after the rendering delay and before the awaited selector (if defined).

You can return any serializable value and retrieve it from our API response under response['result']['browser_data']['javascript_evaluation_result']

We will execute this script on Hacker News to retrieve article titles:

return Array.from(document.querySelectorAll('td.title > a')).map((el) => el.textContent)

We encode this script in base64:

cmV0dXJuIEFycmF5LmZyb20oZG9jdW1lbnQucXVlcnlTZWxlY3RvckFsbCgndGQudGl0bGUgPiBhJykpLm1hcCgoZWwpID0+IGVsLnRleHRDb250ZW50KQ==
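
As an illustration of the encoding step, the following Python sketch produces the base64 string shown above from the script. The helper name encode_script is hypothetical; it uses the standard base64 alphabet, which matches the encoded value in this example.

```python
import base64

SCRIPT = "return Array.from(document.querySelectorAll('td.title > a')).map((el) => el.textContent)"


def encode_script(script: str) -> str:
    """Base64-encode a JavaScript snippet for the js parameter (hypothetical helper)."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


encoded = encode_script(SCRIPT)
print(encoded)
```

The encoded value can contain + and = characters, so it still has to be URL-encoded when placed in the query string; curl's --data-urlencode takes care of that.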

Then call our API and provide your script, as in this example:

curl -G \
--request "GET" \
--url "https://api.scrapfly.io/scrape" \
--data-urlencode "key=__API_KEY__" \
--data-urlencode "url=https://news.ycombinator.com" \
--data-urlencode "render_js=true" \
--data-urlencode "js=cmV0dXJuIEFycmF5LmZyb20oZG9jdW1lbnQucXVlcnlTZWxlY3RvckFsbCgndGQudGl0bGUgPiBhJykpLm1hcCgoZWwpID0+IGVsLnRleHRDb250ZW50KQ=="

Snippets

Scroll to the bottom of the page, which some websites need in order to fully render their HTML:

window.scrollTo(0,document.body.scrollHeight);

Example of Result

The result under response['result']['browser_data']['javascript_evaluation_result'] contains what the script returned:

[
    "US Travel firm $4.5m ransom negotiation open chat",
    "Laws of UX",
    "Briar Project",
    "Pleroma: A Mastodon-compatible open and federated social networking server",
    "Mastodon 3.2",
    "A philosophical difference between Haskell and Lisp",
    "Show HN: High performance X11 animated wallpapers",
    "When I raised my B2B SaaS\u2019s prices",
    "Illustrated Self-Guided Course On How To Use The Slide Rule",
    "Facebook hate-speech boycott had little effect on revenue",
    "SpaceX Crew Dragon Splashes Down in the Gulf of Mexico",
    "Why Can't We All Just Get Along? Uncertain Biological Basis of Morality (2013)",
    "\u201cZombie cicadas\u201d infected with mind-controlling fungus return to West Virginia",
    "What is a Product Roadmap?",
    "Brain-Gut Circuit Lets Microbiota Directly Affect the Sympathetic Nervous System",
    "How to Run Turing Machines on Encrypted Data [pdf]",
    "A collection of books, talks, and papers on security engineering",
    "Rethinking the Science of Skin",
    "GITenberg is an open source community for publishing ebooks in the public domain",
    "How real are real numbers? (2004)",
    "Beyond Bitswap",
    "What I Learned About Failing from My 5 Year Indie Game Dev Project",
    "The Architecture of the Medieval Page (2018)",
    "I Still Use an Old PowerPC Mac in 2020",
    "Microsoft to continue discussions on potential TikTok purchase in the US",
    "Lord and Taylor, Oldest U.S. Department Store, Files Bankruptcy",
    "\u03bcPlot v1.1 \u2013 now with log scales support",
    "OCaml for the Skeptical: OCaml in a Nutshell (2006)",
    "GPU Accelerated JavaScript",
    "Show HN: Create beautiful landing pages by copy-paste",
    "More"
]
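
Once the API response has been parsed as JSON, the returned value can be read from the path given above. The sketch below uses a hand-trimmed stand-in for the response body (three titles taken from the example result); the function name evaluation_result is hypothetical.

```python
import json

# Trimmed-down stand-in for an API response body (illustrative subset only).
raw_body = """
{
  "result": {
    "browser_data": {
      "javascript_evaluation_result": ["Laws of UX", "Briar Project", "More"]
    }
  }
}
"""


def evaluation_result(response: dict):
    """Return whatever the injected script returned, or None if absent."""
    browser_data = response.get("result", {}).get("browser_data") or {}
    return browser_data.get("javascript_evaluation_result")


titles = evaluation_result(json.loads(raw_body))
# Drop the trailing "More" entry seen at the end of the example result.
articles = [title for title in titles if title != "More"]
print(articles)  # prints: ['Laws of UX', 'Briar Project']
```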

Screenshot

You can take multiple screenshots when a browser is used. The screenshot feature is documented on its own dedicated page.

Resource Tracking

Intermediate requests made from JavaScript through XHR or Fetch are tracked. We provide you the request with headers, method, body, and the same for the response. You can also retrieve the content of Local Storage and Session Storage. This data is available in the result section of our API response, under response['result']['browser_data']

...
"result": {
    ...,
    "browser_data": {
        "xhr_call": [
            {
                "url": "https://aan.amazon.fr/cem",
                "headers": {
                    "user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36",
                    "content-type": "application/json",
                    "accept": "*/*",
                    "referer": "https://images-eu.ssl-images-amazon.com/images/G/08/ape/sf/whitelisted/desktop/sf-1.50.628cb61._V408130105_.html"
                },
                "method": "POST",
                "body": "{\"render_id\":\"4a7152f0-cb58-4de8-b152-f0cb58cde8a2\",\"event_type\":\"impression\",\"dimensions\":{\"subtype\":\"impression\",\"value\":1,\"template_name\":\"Dynamic eCommerce - universal\"}}"
            },
            {
                "url": "https://www.amazon.fr/gp/customer-reviews/aj/private/reviewsGallery/get-image-gallery-assets",
                "headers": {
                    "rtt": "0",
                    "accept": "text/html,*/*",
                    "x-requested-with": "XMLHttpRequest",
                    "downlink": "10",
                    "ect": "4g",
                    "user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36",
                    "content-type": "application/x-www-form-urlencoded",
                    "referer": "https://www.amazon.fr/gp/product/B008AVQXDO?pf_rd_r=APG7NKFQ8DTBPK2TEN8R&pf_rd_p=70373c30-7461-4a24-bb1f-f3fde4f2df3a",
                    "cookie": "session-id=261-7851197-2783504; i18n-prefs=EUR; ubid-acbfr=262-5387700-5547500; session-id-time=2082754801l; x-wl-uid=145H5Y5j+m7oe7NpElaItmpA5YWGFqUy34ZvPnc+Yd8m+UIZC49+YTzyieSn/K4Kfq162NF1AbZo=; session-token=aLl1Sgktrzq+wYbYCVAKoXJA+3aIAhtP36mNtxkpZORbiSqd3ur/uaU6W1aHycEtUy4LpAJrcV2YmGqNHYb4trXCj3Wt4Vxc5W/aCaww5HctUNsijeRB2Dxp/ca1gtYdEEpTJGBprLlnrFg85RsOkfiWb9nysakwy54GjF9aOjksmN0ip3XCgDbO9uIZ7/X8lgM7pTDy7tTVBJtRvK79S/k9PbfDxEjXULIpNE8iYBdTvm95Xevgmgr1nouA1frzwUFYYzhCg1k=; csm-hit=tb:s-5B0K136YR4QK89MQ8RG0|1596420691120&t:1596420692684&adb:adblk_no"
                },
                "method": "POST",
                "body": null
            },
            {
                "url": "https://www.amazon.fr/gp/customer-reviews/aj/private/reviewsGallery/get-application-resources-for-reviews-gallery",
                "headers": {
                    "rtt": "0",
                    "accept": "*/*",
                    "x-requested-with": "XMLHttpRequest",
                    "downlink": "10",
                    "ect": "4g",
                    "user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36",
                    "content-type": "application/x-www-form-urlencoded",
                    "referer": "https://www.amazon.fr/gp/product/B008AVQXDO?pf_rd_r=APG7NKFQ8DTBPK2TEN8R&pf_rd_p=70373c30-7461-4a24-bb1f-f3fde4f2df3a",
                    "cookie": "session-id=261-7851197-2783504; i18n-prefs=EUR; ubid-acbfr=262-5387700-5547500; session-id-time=2082754801l; x-wl-uid=145H5Y5j+m7oe7NpElaItmpA5YWGFqUy34ZvPnc+Yd8m+UIZC49+YTzyieSn/K4Kfq162NF1AbZo=; session-token=aLl1Sgktrzq+wYbYCVAKoXJA+3aIAhtP36mNtxkpZORbiSqd3ur/uaU6W1aHycEtUy4LpAJrcV2YmGqNHYb4trXCj3Wt4Vxc5W/aCaww5HctUNsijeRB2Dxp/ca1gtYdEEpTJGBprLlnrFg85RsOkfiWb9nysakwy54GjF9aOjksmN0ip3XCgDbO9uIZ7/X8lgM7pTDy7tTVBJtRvK79S/k9PbfDxEjXULIpNE8iYBdTvm95Xevgmgr1nouA1frzwUFYYzhCg1k=; csm-hit=tb:s-5B0K136YR4QK89MQ8RG0|1596420691120&t:1596420692684&adb:adblk_no"
                },
                "method": "POST",
                "body": "noCache=1596420693002"
            },
            {
                "url": "https://www.amazon.fr/gp/cerberus/gv",
                "headers": {
                    "rtt": "0",
                    "user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36",
                    "content-type": "application/x-www-form-urlencoded",
                    "accept": "*/*",
                    "cache-control": "no-cache",
                    "x-requested-with": "XMLHttpRequest",
                    "downlink": "10",
                    "ect": "4g",
                    "referer": "https://www.amazon.fr/gp/product/B008AVQXDO?pf_rd_r=APG7NKFQ8DTBPK2TEN8R&pf_rd_p=70373c30-7461-4a24-bb1f-f3fde4f2df3a",
                    "cookie": "session-id=261-7851197-2783504; i18n-prefs=EUR; ubid-acbfr=262-5387700-5547500; session-id-time=2082754801l; x-wl-uid=145H5Y5j+m7oe7NpElaItmpA5YWGFqUy34ZvPnc+Yd8m+UIZC49+YTzyieSn/K4Kfq162NF1AbZo=; session-token=aLl1Sgktrzq+wYbYCVAKoXJA+3aIAhtP36mNtxkpZORbiSqd3ur/uaU6W1aHycEtUy4LpAJrcV2YmGqNHYb4trXCj3Wt4Vxc5W/aCaww5HctUNsijeRB2Dxp/ca1gtYdEEpTJGBprLlnrFg85RsOkfiWb9nysakwy54GjF9aOjksmN0ip3XCgDbO9uIZ7/X8lgM7pTDy7tTVBJtRvK79S/k9PbfDxEjXULIpNE8iYBdTvm95Xevgmgr1nouA1frzwUFYYzhCg1k=; csm-hit=tb:s-5B0K136YR4QK89MQ8RG0|1596420691120&t:1596420692684&adb:adblk_no"
                },
                "method": "POST",
                "body": "payload=%7B%22producerId%22%3A%22detail-page%22%2C%22asin%22%3A%22B008AVQXDO%22%2C%22asin_price%22%3A%229.49%22%2C%22asin_shipping_price%22%3A%220%22%2C%22asin_currency_code%22%3A%22EUR%22%2C%22device_type%22%3A%22WEB%22%2C%22display_code%22%3A%22Asin+is+not+eligible+because+it+has+a+retail+offer%22%2C%22substitute_count%22%3A%22-1%22%7D"
            }
        ],
        "local_storage_data": {
            "csm-hit": "tb:s-5B0K136YR4QK89MQ8RG0|1596420691120&t:1596420692684&adb:adblk_no",
            "csm:adb": "adblk_no",
            "csm-bf": "[\"5B0K136YR4QK89MQ8RG0\"]",
            "a-font-class": "a-ember"
        },
        "session_storage_data": {
            "csm-hit": "tb:s-5B0K136YR4QK89MQ8RG0|1596420691120&t:1596420692684&adb:adblk_no",
            "csm:adb": "adblk_no",
            "csm-bf": "[\"5B0K136YR4QK89MQ8RG0\"]",
            "a-font-class": "a-ember"
        },
        "websockets": [],
        "javascript_evaluation_result": null
    },
    ...
}
...
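
The tracked calls are plain JSON, so they can be inspected with a few lines of code. The Python sketch below works on a hand-trimmed sample shaped like the browser_data section above (bodies shortened for readability); decode_body is a hypothetical helper that picks a decoder from the request's content-type header.

```python
import json
from urllib.parse import parse_qs

# Hand-trimmed sample shaped like the browser_data section above.
browser_data = {
    "xhr_call": [
        {"url": "https://aan.amazon.fr/cem", "method": "POST",
         "headers": {"content-type": "application/json"},
         "body": '{"event_type":"impression"}'},
        {"url": "https://www.amazon.fr/gp/cerberus/gv", "method": "POST",
         "headers": {"content-type": "application/x-www-form-urlencoded"},
         "body": "payload=%7B%22asin%22%3A%22B008AVQXDO%22%7D"},
    ],
    "local_storage_data": {"csm:adb": "adblk_no"},
    "session_storage_data": {"a-font-class": "a-ember"},
}


def decode_body(call: dict):
    """Decode a tracked request body according to its content-type header."""
    body = call.get("body")
    if body is None:
        return None
    content_type = call.get("headers", {}).get("content-type", "")
    if content_type.startswith("application/json"):
        return json.loads(body)
    if content_type.startswith("application/x-www-form-urlencoded"):
        # parse_qs returns lists; keep the first value of each field.
        return {key: values[0] for key, values in parse_qs(body).items()}
    return body  # unknown type: hand back the raw string


for call in browser_data["xhr_call"]:
    print(call["method"], call["url"], decode_body(call))

print(browser_data["local_storage_data"].get("csm:adb"))
```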

Limitations

The JavaScript rendering feature is only available with the GET method. You can't use a browser to send POST, PATCH, PUT, or HEAD requests.

The following XHR / Fetch resources are not tracked:

  • Fonts: .woff, .woff2, .otf, .ttf
  • Media: .webm, .oga, .aac, .m4a, .mp3, .wav, .mp4
  • Image: .svg, .png, .gif, .jpg, .jpeg, .ico
  • Style: .css
  • Other: .pbf
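
To predict on the client side whether a given resource is likely to show up among the tracked calls, the list above can be turned into a simple check. This Python sketch is an assumption-laden illustration: the name is_probably_tracked is hypothetical, and matching on the lowercase extension of the URL path is a guess, since the exact matching rule is not specified here.

```python
import posixpath
from urllib.parse import urlsplit

UNTRACKED_EXTENSIONS = {
    ".woff", ".woff2", ".otf", ".ttf",                        # fonts
    ".webm", ".oga", ".aac", ".m4a", ".mp3", ".wav", ".mp4",  # media
    ".svg", ".png", ".gif", ".jpg", ".jpeg", ".ico",          # images
    ".css",                                                   # style
    ".pbf",                                                   # other
}


def is_probably_tracked(url: str) -> bool:
    """Guess whether a resource URL would appear in the tracked calls.

    Assumption: filtering is based on the extension of the URL path.
    """
    path = urlsplit(url).path.lower()
    extension = posixpath.splitext(path)[1]
    return extension not in UNTRACKED_EXTENSIONS


print(is_probably_tracked("https://www.amazon.fr/gp/cerberus/gv"))
print(is_probably_tracked("https://example.com/fonts/a.woff2?v=1"))
```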

You can retrieve an emitted XHR call with its associated URL, headers, body, and method. We do not attach the response. If you need the response content, you can call the XHR URL directly.

It's not possible to directly download media/image content with a browser: it loads the image URL in an HTML document with an img tag to display it. Without the browser, you retrieve the base64 of the binary content.

The full description of related errors, with examples of error responses, is available in the Errors section.

Pricing

Using JavaScript rendering costs 5 Scrape API Credits against your quota. Keep in mind that JavaScript rendering is slow and uses a lot of data and resources. For maximum performance, avoid it when it's not required.

The API response contains the X-Scrapfly-Api-Cost header, which indicates the billed amount.
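
Reading the billed amount from the response headers can be sketched as follows. The helper name billed_credits is hypothetical, and the sketch assumes the header value is a plain integer, which is not confirmed here.

```python
from typing import Mapping, Optional

COST_HEADER = "X-Scrapfly-Api-Cost"


def billed_credits(headers: Mapping[str, str]) -> Optional[int]:
    """Return the billed amount from response headers, or None if missing.

    Assumption: the header carries a plain integer value.
    """
    # HTTP header names are case-insensitive, so compare in lowercase.
    for name, value in headers.items():
        if name.lower() == COST_HEADER.lower():
            return int(value)
    return None


print(billed_credits({"content-type": "application/json", "x-scrapfly-api-cost": "5"}))
```

Logging this value per request makes it easy to see how much of the quota is spent on rendered scrapes.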