# Scrapfly Documentation

## Table of Contents

### Dashboard

- [Intro](https://scrapfly.io/docs)
- [Project](https://scrapfly.io/docs/project)
- [Account](https://scrapfly.io/docs/account)
- [Workspace & Team](https://scrapfly.io/docs/workspace-and-team)
- [Billing](https://scrapfly.io/docs/billing)

### Products

#### MCP Server

- [Getting Started](https://scrapfly.io/docs/mcp/getting-started)
- [Tools & API Spec](https://scrapfly.io/docs/mcp/tools)
- [Authentication](https://scrapfly.io/docs/mcp/authentication)
- [Examples & Use Cases](https://scrapfly.io/docs/mcp/examples)
- [FAQ](https://scrapfly.io/docs/mcp/faq)

##### Integrations

- [Overview](https://scrapfly.io/docs/mcp/integrations)
- [Claude Desktop](https://scrapfly.io/docs/mcp/integrations/claude-desktop)
- [Claude Code](https://scrapfly.io/docs/mcp/integrations/claude-code)
- [ChatGPT](https://scrapfly.io/docs/mcp/integrations/chatgpt)
- [Cursor](https://scrapfly.io/docs/mcp/integrations/cursor)
- [Cline](https://scrapfly.io/docs/mcp/integrations/cline)
- [Windsurf](https://scrapfly.io/docs/mcp/integrations/windsurf)
- [Zed](https://scrapfly.io/docs/mcp/integrations/zed)
- [Roo Code](https://scrapfly.io/docs/mcp/integrations/roo-code)
- [VS Code](https://scrapfly.io/docs/mcp/integrations/vscode)
- [LangChain](https://scrapfly.io/docs/mcp/integrations/langchain)
- [LlamaIndex](https://scrapfly.io/docs/mcp/integrations/llamaindex)
- [CrewAI](https://scrapfly.io/docs/mcp/integrations/crewai)
- [OpenAI](https://scrapfly.io/docs/mcp/integrations/openai)
- [n8n](https://scrapfly.io/docs/mcp/integrations/n8n)
- [Make](https://scrapfly.io/docs/mcp/integrations/make)
- [Zapier](https://scrapfly.io/docs/mcp/integrations/zapier)
- [Vapi AI](https://scrapfly.io/docs/mcp/integrations/vapi)
- [Agent Builder](https://scrapfly.io/docs/mcp/integrations/agent-builder)
- [Custom Client](https://scrapfly.io/docs/mcp/integrations/custom-client)


#### Web Scraping API

- [Getting Started](https://scrapfly.io/docs/scrape-api/getting-started)
- [API Specification]()
- [Monitoring](https://scrapfly.io/docs/monitoring)
- [Customize Request](https://scrapfly.io/docs/scrape-api/custom)
- [Debug](https://scrapfly.io/docs/scrape-api/debug)
- [Anti Scraping Protection](https://scrapfly.io/docs/scrape-api/anti-scraping-protection)
- [Proxy](https://scrapfly.io/docs/scrape-api/proxy)
- [Proxy Mode](https://scrapfly.io/docs/scrape-api/proxy-mode)
- [Proxy Mode - Screaming Frog](https://scrapfly.io/docs/scrape-api/proxy-mode/screaming-frog)
- [Proxy Mode - Apify](https://scrapfly.io/docs/scrape-api/proxy-mode/apify)
- [(Auto) Data Extraction](https://scrapfly.io/docs/scrape-api/extraction)
- [Javascript Rendering](https://scrapfly.io/docs/scrape-api/javascript-rendering)
- [Javascript Scenario](https://scrapfly.io/docs/scrape-api/javascript-scenario)
- [SSL](https://scrapfly.io/docs/scrape-api/ssl)
- [DNS](https://scrapfly.io/docs/scrape-api/dns)
- [Cache](https://scrapfly.io/docs/scrape-api/cache)
- [Session](https://scrapfly.io/docs/scrape-api/session)
- [Webhook](https://scrapfly.io/docs/scrape-api/webhook)
- [Screenshot](https://scrapfly.io/docs/scrape-api/screenshot)
- [Errors](https://scrapfly.io/docs/scrape-api/errors)
- [Timeout](https://scrapfly.io/docs/scrape-api/understand-timeout)
- [Throttling](https://scrapfly.io/docs/throttling)
- [Troubleshoot](https://scrapfly.io/docs/scrape-api/troubleshoot)
- [Billing](https://scrapfly.io/docs/scrape-api/billing)
- [FAQ](https://scrapfly.io/docs/scrape-api/faq)

#### Crawler API

- [Getting Started](https://scrapfly.io/docs/crawler-api/getting-started)
- [API Specification]()
- [Retrieving Results](https://scrapfly.io/docs/crawler-api/results)
- [WARC Format](https://scrapfly.io/docs/crawler-api/warc-format)
- [Data Extraction](https://scrapfly.io/docs/crawler-api/extraction-rules)
- [Webhook](https://scrapfly.io/docs/crawler-api/webhook)
- [Billing](https://scrapfly.io/docs/crawler-api/billing)
- [Errors](https://scrapfly.io/docs/crawler-api/errors)
- [Troubleshoot](https://scrapfly.io/docs/crawler-api/troubleshoot)
- [FAQ](https://scrapfly.io/docs/crawler-api/faq)

#### Screenshot API

- [Getting Started](https://scrapfly.io/docs/screenshot-api/getting-started)
- [API Specification]()
- [Accessibility Testing](https://scrapfly.io/docs/screenshot-api/accessibility)
- [Webhook](https://scrapfly.io/docs/screenshot-api/webhook)
- [Billing](https://scrapfly.io/docs/screenshot-api/billing)
- [Errors](https://scrapfly.io/docs/screenshot-api/errors)

#### Extraction API

- [Getting Started](https://scrapfly.io/docs/extraction-api/getting-started)
- [API Specification]()
- [Rules Template](https://scrapfly.io/docs/extraction-api/rules-and-template)
- [LLM Extraction](https://scrapfly.io/docs/extraction-api/llm-prompt)
- [AI Auto Extraction](https://scrapfly.io/docs/extraction-api/automatic-ai)
- [Webhook](https://scrapfly.io/docs/extraction-api/webhook)
- [Billing](https://scrapfly.io/docs/extraction-api/billing)
- [Errors](https://scrapfly.io/docs/extraction-api/errors)
- [FAQ](https://scrapfly.io/docs/extraction-api/faq)

#### Proxy Saver

- [Getting Started](https://scrapfly.io/docs/proxy-saver/getting-started)
- [Fingerprints](https://scrapfly.io/docs/proxy-saver/fingerprints)
- [Optimizations](https://scrapfly.io/docs/proxy-saver/optimizations)
- [SSL Certificates](https://scrapfly.io/docs/proxy-saver/certificates)
- [Protocols](https://scrapfly.io/docs/proxy-saver/protocols)
- [Pacfile](https://scrapfly.io/docs/proxy-saver/pacfile)
- [Secure Credentials](https://scrapfly.io/docs/proxy-saver/security)
- [Billing](https://scrapfly.io/docs/proxy-saver/billing)

#### Cloud Browser API

- [Getting Started](https://scrapfly.io/docs/cloud-browser-api/getting-started)
- [Proxy & Geo-Targeting](https://scrapfly.io/docs/cloud-browser-api/proxy)
- [Unblock API](https://scrapfly.io/docs/cloud-browser-api/unblock)
- [File Downloads](https://scrapfly.io/docs/cloud-browser-api/file-downloads)
- [Session Resume](https://scrapfly.io/docs/cloud-browser-api/session-resume)
- [Human-in-the-Loop](https://scrapfly.io/docs/cloud-browser-api/human-in-the-loop)
- [Debug Mode](https://scrapfly.io/docs/cloud-browser-api/debug-mode)
- [Bring Your Own Proxy](https://scrapfly.io/docs/cloud-browser-api/bring-your-own-proxy)
- [Browser Extensions](https://scrapfly.io/docs/cloud-browser-api/extensions)

##### Integrations

- [Puppeteer](https://scrapfly.io/docs/cloud-browser-api/puppeteer)
- [Playwright](https://scrapfly.io/docs/cloud-browser-api/playwright)
- [Selenium](https://scrapfly.io/docs/cloud-browser-api/selenium)
- [Vercel Agent Browser](https://scrapfly.io/docs/cloud-browser-api/agent-browser)
- [Browser Use](https://scrapfly.io/docs/cloud-browser-api/browser-use)
- [Stagehand](https://scrapfly.io/docs/cloud-browser-api/stagehand)

- [Billing](https://scrapfly.io/docs/cloud-browser-api/billing)
- [Errors](https://scrapfly.io/docs/cloud-browser-api/errors)


### Tools

- [Antibot Detector](https://scrapfly.io/docs/tools/antibot-detector)

### SDK

- [Golang](https://scrapfly.io/docs/sdk/golang)
- [Python](https://scrapfly.io/docs/sdk/python)
- [Rust](https://scrapfly.io/docs/sdk/rust)
- [TypeScript](https://scrapfly.io/docs/sdk/typescript)
- [Scrapy](https://scrapfly.io/docs/sdk/scrapy)

### Integrations

- [Getting Started](https://scrapfly.io/docs/integration/getting-started)
- [LangChain](https://scrapfly.io/docs/integration/langchain)
- [LlamaIndex](https://scrapfly.io/docs/integration/llamaindex)
- [CrewAI](https://scrapfly.io/docs/integration/crewai)
- [Zapier](https://scrapfly.io/docs/integration/zapier)
- [Make](https://scrapfly.io/docs/integration/make)
- [n8n](https://scrapfly.io/docs/integration/n8n)

### Academy

- [Overview](https://scrapfly.io/academy)
- [Web Scraping Overview](https://scrapfly.io/academy/scraping-overview)
- [Tools](https://scrapfly.io/academy/tools-overview)
- [Reverse Engineering](https://scrapfly.io/academy/reverse-engineering)
- [Static Scraping](https://scrapfly.io/academy/static-scraping)
- [HTML Parsing](https://scrapfly.io/academy/html-parsing)
- [Dynamic Scraping](https://scrapfly.io/academy/dynamic-scraping)
- [Hidden API Scraping](https://scrapfly.io/academy/hidden-api-scraping)
- [Headless Browsers](https://scrapfly.io/academy/headless-browsers)
- [Hidden Web Data](https://scrapfly.io/academy/hidden-web-data)
- [JSON Parsing](https://scrapfly.io/academy/json-parsing)
- [Data Processing](https://scrapfly.io/academy/data-processing)
- [Scaling](https://scrapfly.io/academy/scaling)
- [Walkthrough Summary](https://scrapfly.io/academy/walkthrough-summary)
- [Scraper Blocking](https://scrapfly.io/academy/scraper-blocking)
- [Proxies](https://scrapfly.io/academy/proxies)

---

# Web Scraping with Scrapfly and Typescript


 

 

The Scrapfly TypeScript SDK is powerful yet intuitive. On this onboarding page we'll take a look at how to install it, how to use it, and some examples.

 To start, take a look at our introduction and overview video:

  

> If you're not ready to code yet check out Scrapfly's [Visual API Player](https://scrapfly.io/dashboard/playground/web-scraper) or the no-code [Zapier integration](https://scrapfly.io/docs/integration/zapier).

## SDK Setup

The source code of the TypeScript SDK is available on [GitHub](https://github.com/scrapfly/typescript-scrapfly) and the **scrapfly-sdk** package is available for all JavaScript and TypeScript runtimes:

- [Deno](#deno)
- [Bun](#bun)
- [NodeJS](#nodejs)
- [Serverless](#serverless)
 
 

 [Deno](https://deno.com/) is a modern and secure runtime for JavaScript and TypeScript that uses V8 and is built in Rust. It's incredibly easy to use and runs Typescript natively as well as being backwards compatible with NodeJS. This makes Deno a great option for web-scraping related development.

 To setup Scrapfly SDK with Deno, first install the SDK through [jsr.io package index](https://jsr.io/@scrapfly/scrapfly-sdk):

 ```
$ deno add jsr:@scrapfly/scrapfly-sdk
```

 

   

 

 Try out the following code snippet for Web Scraping API to get started:

 ```
import {
  ScrapflyClient, ScrapeConfig,
} from 'jsr:@scrapfly/scrapfly-sdk';

const client = new ScrapflyClient({ key: "" });

let scrape_result = await client.scrape(
  new ScrapeConfig({
    url: 'https://httpbin.dev/html',
  }),
);
console.log(scrape_result.result.log_url);
console.log(scrape_result.result.content);
```

 

   

 

 [ See Examples on Github](https://github.com/scrapfly/typescript-scrapfly/tree/main/examples/deno) 

 

 [Bun](https://bun.sh/) is a modern runtime for JavaScript and TypeScript that is fully interchangeable with NodeJS. It's incredibly easy to use and runs Typescript natively which makes it a great option for web-scraping related development.

 To setup Scrapfly SDK with Bun, first install the SDK through [jsr.io package index](https://jsr.io/@scrapfly/scrapfly-sdk):

 ```
$ bunx jsr add @scrapfly/scrapfly-sdk
```

 

   

 

 Try out the following code snippet for Web Scraping API to get started:

 ```
import {
  ScrapflyClient, ScrapeConfig,
} from '@scrapfly/scrapfly-sdk';

const client = new ScrapflyClient({ key: "" });

let scrape_result = await client.scrape(
  new ScrapeConfig({
    url: 'https://httpbin.dev/html',
  }),
);
console.log(scrape_result.result.log_url);
console.log(scrape_result.result.content);
```

 

   

 

 [ See Examples on Github](https://github.com/scrapfly/typescript-scrapfly/tree/main/examples/bun) 

 

[NodeJS](https://nodejs.org/) is the classic JavaScript server runtime and is supported by the SDK through both CommonJS and ESM modules.

 To setup Scrapfly SDK with Node, first install the SDK through [NPM package index](https://www.npmjs.com/package/scrapfly-sdk):

 ```
$ npm install scrapfly-sdk
```

 

   

 

 Try out the following code snippet for Web Scraping API to get started:

 ```
import {
  ScrapflyClient, ScrapeConfig, ScreenshotConfig, ExtractionConfig
} from 'scrapfly-sdk';


const client = new ScrapflyClient({ key: "" });

let scrape_result = await client.scrape(
  new ScrapeConfig({
    url: 'https://httpbin.dev/html',
  }),
);

console.log(scrape_result.result.log_url);
console.log(scrape_result.result.content);
```

 

   

 

 [ See Examples on Github](https://github.com/scrapfly/typescript-scrapfly/tree/main/examples/node_esm) 

 

Serverless platforms like Cloudflare Workers, AWS Lambda, etc. are also supported by the Scrapfly SDK.

Most serverless platforms can run full NodeJS, Python, or other runtimes, though there are a few exceptions and differences in runtime implementations.

For the best experience, see our recommended use through Denoflare 👇

 [ See Examples on Github](https://github.com/scrapfly/typescript-scrapfly/tree/main/examples/cloudflareworker_awslambda_supabase) 
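As a rough illustration, here's what a minimal Cloudflare Worker using the SDK could look like. This is an untested sketch, not a verified deployment: it assumes you bundle the npm `scrapfly-sdk` package into your worker build and store your key in an environment binding we've named `SCRAPFLY_KEY` for this example.

```typescript
// Hypothetical Cloudflare Worker sketch (module syntax, standard fetch handler).
// Assumes scrapfly-sdk is bundled into the worker and SCRAPFLY_KEY is bound.
import { ScrapflyClient, ScrapeConfig } from 'scrapfly-sdk';

export default {
  async fetch(request: Request, env: { SCRAPFLY_KEY: string }): Promise<Response> {
    // read the target URL from the worker's query string, e.g. ?url=https://...
    const target = new URL(request.url).searchParams.get('url');
    if (!target) {
      return new Response('missing ?url= parameter', { status: 400 });
    }
    const client = new ScrapflyClient({ key: env.SCRAPFLY_KEY });
    const result = await client.scrape(new ScrapeConfig({ url: target }));
    // return the scraped HTML content to the worker's caller
    return new Response(result.result.content, {
      headers: { 'content-type': 'text/html' },
    });
  },
};
```

See the serverless examples linked above for tested, platform-specific setups.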

 

 



> All SDK examples can be found on SDK's Github repository: 
>  [ github.com/scrapfly/typescript-scrapfly/tree/main/examples ](https://github.com/scrapfly/typescript-scrapfly/tree/main/examples)

## Web Scraping API

In this section, we'll walk through the most important web scraping features step by step. After completing this walkthrough you should be proficient enough to scrape almost any website with Scrapfly, so let's dive in!

### First Scrape

 To start let's take a look at a basic scrape of this simple product page [web-scraping.dev/product/1](https://web-scraping.dev/product/1).

    

  

  We'll scrape the page, see some optional parameters and then extract the product details using CSS selectors.

 
 ```
import { 
  ScrapflyClient, ScrapeConfig, ScreenshotConfig, ExtractionConfig, log 
} from 'jsr:@scrapfly/scrapfly-sdk';
// You can enable debug logs to see more details
log.setLevel('DEBUG');

// Create client
const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });

// Send a request
let api_result = await client.scrape(
  new ScrapeConfig({
    url: 'https://web-scraping.dev/product/1',
  })
);

// See the results
console.log(api_result.context);  // metadata (top-level on ScrapeResult)
console.log(api_result.config);   // request data (top-level on ScrapeResult)
console.log(api_result.result.content);  // result html content (nested under .result)

// Parse content
let product = {
  "title": api_result.selector(
     "h3.product-title"
  ).text(),
  "price": api_result.selector(
      ".product-price"
  ).text(),
  "description": api_result.selector(
      ".product-description"
  ).text(),
}
console.log(product);
```

 

   

 

 Above, we first requested Scrapfly API to scrape the product page for us. Then, we used the `selector` attribute to parse the product details using CSS Selectors.

 ```
{
  title: "Box of Chocolate Candy",
  price: "$9.99 ",
  description: "Indulge your sweet tooth with our Box of Chocolate Candy. Each box contains an assortment of rich, flavorful chocolates with a smooth, creamy filling. Choose from a variety of flavors including zesty orange and sweet cherry. Whether you're looking for the perfect gift or just want to treat yourself, our Box of Chocolate Candy is sure to satisfy."
}
```

 

   

 

This example was straightforward, but what if we need more complex request configurations? Next, let's take a look at the available scraping request options.

### Request Customization

All SDK requests are configured through `ScrapeConfig` object attributes, most of which mirror API parameters. For more information see [request customization](https://scrapfly.io/docs/scrape-api/custom).

 Here's a quick demo example:

 ```
import { ScrapflyClient, ScrapeConfig, ScreenshotConfig, ExtractionConfig, log } from 'jsr:@scrapfly/scrapfly-sdk';
// You can enable debug logs to see more details
log.setLevel('DEBUG');


const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });

let api_result = await client.scrape(
  new ScrapeConfig({
    url: 'https://httpbin.dev/post',
    // select request method can be GET (default), POST, HEAD etc.
    method: 'POST',  
    // attach request body — pass a structured object; the SDK serializes it
    // to JSON automatically when Content-Type is application/json.
    data: { name: "scrapfly typescript" },
    // attach custom headers
    headers: {
      "Content-Type": "application/json",
      "Authorization": "Bearer 123",
    },
  })
);

```

 

   

 

 Using `ScrapeConfig` we can not only configure the outgoing scrape requests but we can also enable Scrapfly specific features.

### Developer Features

 There are a few important developer features that can be enabled to make the onboarding process a bit easier.

 The [debug](https://scrapfly.io/docs/scrape-api/getting-started#api_param_debug) parameter can be enabled to produce more details in the web log output and the [cache](https://scrapfly.io/docs/scrape-api/getting-started#api_param_cache) parameters are great for exploring the APIs while onboarding:

 ```
import { 
    ScrapflyClient, ScrapeConfig, ScreenshotConfig, ExtractionConfig, log 
} from 'jsr:@scrapfly/scrapfly-sdk';
// You can enable debug logs to see more details
log.setLevel('DEBUG');

const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });
let api_result = await client.scrape(
  new ScrapeConfig({
    url: 'https://httpbin.dev/post',
    // when debug is set, scrapfly will record and save more details
    debug: true,  
    render_js: true,  // with debug and render_js it'll save a screenshot
    // cache can be enabled to save bandwidth and speed up the process and save credits
    cache: true,
    cache_ttl: 3600,  // cache time to live in seconds
    cache_clear: false,  // set to true to clear cache at any time.
  })
);

```

 

   

 

 By enabling `debug` we can see that the monitoring dashboard produces more details and even captures screenshots for reviewing!

    

  

See the debug tab in your [Monitoring Dashboard](https://scrapfly.io/dashboard/monitoring).

The next feature set allows us to supercharge our scrapers with web browsers, so let's take a look.

### Using Web Browsers

Scrapfly can scrape using real web browsers, enabled through the [render\_js](https://scrapfly.io/docs/scrape-api/getting-started#api_param_render_js) parameter. When enabled, instead of making a plain HTTP request, Scrapfly will:

1. Start a real web browser
2. Load the page
3. Optionally wait for page to load through [rendering\_wait](https://scrapfly.io/docs/scrape-api/getting-started#api_param_rendering_wait) or [wait\_for\_selector](https://scrapfly.io/docs/scrape-api/getting-started#api_param_wait_for_selector) options
4. Optionally execute custom Javascript code through [js](https://scrapfly.io/docs/scrape-api/getting-started#api_param_js) or [javascript\_scenario](https://scrapfly.io/docs/scrape-api/getting-started#api_param_javascript_scenario)
5. Return the rendered page content and browser data like captured background requests and database contents.
 
 This makes Scrapfly scrapers incredibly powerful and customizable! Let's take a look at some examples.

 To illustrate this let's take a look at this example page [web-scraping.dev/reviews](https://web-scraping.dev/reviews) which requires javascript to load:

    

  

(Screenshots: with JavaScript disabled the reviews are missing; with JavaScript enabled they load.)

 

To scrape this we can use Scrapfly's web browsers, and we can approach it in two ways:

#### Rendering Javascript

 The first approach is to simply wait for the page to load and scrape the content:

 ```
import { 
  ScrapflyClient, ScrapeConfig, ScreenshotConfig, ExtractionConfig, log 
} from 'jsr:@scrapfly/scrapfly-sdk';

const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });
let api_result = await client.scrape(
  new ScrapeConfig({
    url: 'https://web-scraping.dev/reviews',
    render_js: true,
    // wait for page element to appear
    wait_for_selector: ".review",
    // or wait for a specific time
    // rendering_wait: 3000,  // 3 seconds
  }),
);

const reviews: { date: string, text: string, stars: number }[] = [];
let sel = api_result.selector;
for (let review of sel(".review")) {
  reviews.push({
    date: sel(review).find("span").eq(0).text(),
    text: sel(review).find("p").text(),
    stars: sel(review).find("svg").length,
  });
}
console.log(reviews);
// prints:
// [
//   { date: "2022-07-22", text: "Absolutely delicious! The orange flavor is my favorite.", stars: 5 },
//   { date: "2022-08-16", text: "I bought these as a gift, and they were well received. Will definitely purchase again.", stars: 4 },
//   { date: "2022-09-10", text: "Nice variety of flavors. The chocolate is rich and smooth.", stars: 5 },
//   { date: "2022-10-02", text: "The cherry flavor is amazing. Will be buying more.", stars: 5 },
//   { date: "2022-11-05", text: "A bit pricey, but the quality of the chocolate is worth it.", stars: 4 }
// ]
```

 

   

 

 This approach is quite simple as we get exactly what we see in our own web browser making the development process easier.

#### XHR Capture

 The second approach is to capture the background requests that generate this data on load directly:

 ```
import { 
  ScrapflyClient, ScrapeConfig, ScreenshotConfig, ExtractionConfig, log 
} from 'jsr:@scrapfly/scrapfly-sdk';

const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });
let api_result = await client.scrape(
  new ScrapeConfig({
    url: 'https://web-scraping.dev/reviews',
    // wait for page element to appear or explicit time
    render_js: true,
    wait_for_selector: ".review",
  }),
);

// the browser captures all xhr calls as an array
const all_xhr_calls = api_result.result.browser_data?.xhr_call ?? [];
// find the right call by inspecting request['body'] or request['url']
const reviews_call = all_xhr_calls.filter((call: any) => call['body'].includes('GetReviews'))[0];
// find the results by inspecting the call response['body']
const reviews = JSON.parse(reviews_call.response['body'])['data']['reviews']['edges'].map((el: any) => el['node']);
console.log(reviews);
// prints:
// [
//   {
//     rid: "teal-potion-4",
//     text: "Unique flavor and great energy boost. It's the perfect gamer's drink!",
//     rating: 5,
//     date: "2023-05-18"
//   },
//   {
//     rid: "red-potion-4",
//     text: "Good flavor and keeps me energized. The bottle design is really fun.",
//     rating: 5,
//     date: "2023-05-17"
//   },
//   // ... 18 more items
// ]

```

 

   

 

The advantage of this approach is that we capture the JSON data directly and don't need to parse any HTML, though it is a bit more complex and requires some web development knowledge.

#### Browser Control

 Finally, we can fully control the entire browser. For example, we can use [Javascript Scenarios](https://scrapfly.io/docs/scrape-api/javascript-scenario) to enter username and password and click the login button to authenticate on [web-scraping.dev/login](https://web-scraping.dev/login):

1. We'll go to web-scraping.dev/login
2. Wait for page to load
3. Enter username to Username input
4. Enter password to Password input
5. click login
6. Wait for page to load
 
Here's how that looks visually:

    

  

To achieve this using JavaScript scenarios, all we have to do is describe these steps as a JSON template:

 ```
import { 
  ScrapflyClient, ScrapeConfig, ScreenshotConfig, ExtractionConfig, log 
} from 'jsr:@scrapfly/scrapfly-sdk';

const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });
let scenario = [
    {"fill": {"selector": "input[name=username]", "value":"user123"}},
    {"fill": {"selector": "input[name=password]", "value":"password"}},
    {"click": {"selector": "form button[type='submit']"}},
    {"wait_for_navigation": {"timeout": 5000}}
]
let api_result = await client.scrape(
  new ScrapeConfig({
    url: 'https://web-scraping.dev/login',
    render_js: true,
    js_scenario: scenario,
  }),
);
console.log(api_result.result.log_url);
console.log(api_result.selector("#secret-message").text());
// prints: "🤫"
```

 

   

 

 Javascript scenarios really simplify the browser automation process though we can take this even further!

#### Javascript Execution

For more experienced web developers, full JavaScript environment access is available through the [js](https://scrapfly.io/docs/scrape-api/getting-started#api_param_js) parameter. For example, let's execute some JavaScript parsing code using the `querySelectorAll()` method:

 ```
import { 
  ScrapflyClient, ScrapeConfig, ScreenshotConfig, ExtractionConfig, log 
} from 'jsr:@scrapfly/scrapfly-sdk';

const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });
let js = `
return Array.from(
  document.querySelectorAll('.review > p')
).map(
  (el) => el.textContent
)
`
let api_result = await client.scrape(
  new ScrapeConfig({
    url: 'https://web-scraping.dev/reviews',
    render_js: true,
    js: js,
  }),
);
console.log(api_result.result.browser_data?.javascript_evaluation_result);
// will print:
// [
//   "Unique flavor and great energy boost. It's the perfect gamer's drink!",
//   "Good flavor and keeps me energized. The bottle design is really fun.",
//   "Excellent energy drink for gamers. The tropical flavor is refreshing.",
//   ...
// ]
```

 

   

 

 Here the browser executed the requested snippet of javascript and returned the results.

---

With custom request options and cloud browsers, you're really in control of every web scraping step! Next, let's look at the features that let you access any web page without being blocked: proxies and ASP.

### Bypass Blocking

Scraper blocking can be very difficult to understand, so Scrapfly provides a single setting that simplifies the bypass. The Anti Scraping Protection ([asp](https://scrapfly.io/docs/scrape-api/getting-started#api_param_asp)) parameter will automatically configure requests and bypass most anti-scraping protection systems:

 ```
import { 
  ScrapflyClient, ScrapeConfig, ScreenshotConfig, ExtractionConfig, log 
} from 'jsr:@scrapfly/scrapfly-sdk';
const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });
let api_result = await client.scrape(
  new ScrapeConfig({
    url: 'https://web-scraping.dev/product/1',
    // Enable Anti Scraping Protection bypass:
    asp: true,
  })
);


```

 

   

 

While ASP can bypass most anti-scraping protection systems like Cloudflare, Datadome, etc., some blocking techniques are based on geographic location or proxy type.

### Proxy Country

All Scrapfly requests go through a proxy, drawn from millions of IPs across 50+ countries. Some websites, however, are only available in specific regions or simply block connections from certain countries.

For that, the [country](https://scrapfly.io/docs/scrape-api/getting-started#api_param_country) parameter can be used to define which countries' proxies are used.

 ```
import { 
  ScrapflyClient, ScrapeConfig, ScreenshotConfig, ExtractionConfig, log 
} from 'jsr:@scrapfly/scrapfly-sdk';

const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });
let api_result = await client.scrape(
  new ScrapeConfig({
    url: 'https://tools.scrapfly.io/api/info/ip',
    // Set which proxy countries can be used for this request:
    country: "US,CA",
  })
);
console.log(api_result.result.content);
// will print:
// {"country":"us","ip":"1.14.131.41"}
```

 

   

 

Here we can see which proxy country Scrapfly used when querying Scrapfly's IP analysis API tool.

### Proxy Type

Further, Scrapfly offers two types of IPs: datacenter and residential. For harder-to-reach targets, residential proxies can perform much better. By setting the [proxy\_pool](https://scrapfly.io/docs/scrape-api/getting-started#api_param_proxy_pool) parameter to the residential pool, we can switch to these stronger proxies:

 ```
import { 
  ScrapflyClient, ScrapeConfig, ScreenshotConfig, ExtractionConfig, log 
} from 'jsr:@scrapfly/scrapfly-sdk';

const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });
let api_result = await client.scrape(
  new ScrapeConfig({
    url: 'https://tools.scrapfly.io/api/info/ip',
    // See for available pools: https://scrapfly.io/dashboard/proxy
    proxy_pool: "public_residential_pool",
  })
);
```

 

   

 

 [See Your Proxy Dashboard](https://scrapfly.io/dashboard/proxy)

### Concurrency Helper

The TypeScript SDK is asynchronous, so each API call can run concurrently and be batched using native tools like `Promise.all()`. However, there's an additional concurrency helper that simplifies scrape batching.

The `concurrentScrape()` method is an async generator that takes a list of scrape configs and yields each result (or error) as it completes.

See this example implementation:

 ```
import { 
  ScrapflyClient, ScrapeConfig, ScrapeResult
} from 'jsr:@scrapfly/scrapfly-sdk';

const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });

// create 5 scrape configs
let configs: ScrapeConfig[] = []
for (let i = 1; i <= 5; i++) {
    configs.push(new ScrapeConfig({
        url: `https://web-scraping.dev/product/${i}`,
    }));
}

// scrape all configs concurrently using async generator
const results: ScrapeResult[] = [];
const errors: Error[] = [];
for await (const resultOrError of client.concurrentScrape(configs, 5)) {
    if (resultOrError instanceof Error) {
        errors.push(resultOrError);
    } else {
        results.push(resultOrError);
    }
}
console.log(results);
console.log(errors);
```

 

   

 

Here we used the asynchronous generator to scrape multiple pages concurrently. We can either set the `concurrency` parameter to a desired limit (here, 5) or omit it to use your account's max concurrency limit.

---

 This covers the core functionalities of Scrapfly's Web Scraping API though there are many more features available. For more see [ the full API specification ](https://scrapfly.io/docs/scrape-api/getting-started)

 If you're having any issues see the [ FAQ](https://scrapfly.io/docs/scrape-api/faq) and [ Troubleshoot](https://scrapfly.io/docs/scrape-api/troubleshoot) pages.
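When troubleshooting, it also helps to catch SDK errors explicitly around the `scrape()` call. Below is a hedged sketch; the `ScrapflyError` class name is an assumption on our part, so check the SDK's exports on GitHub for the exact error hierarchy:

```typescript
// Sketch: ScrapflyError is assumed to be the SDK's base error export;
// verify the exact name against the SDK's errors module.
import { ScrapflyClient, ScrapeConfig, ScrapflyError } from 'jsr:@scrapfly/scrapfly-sdk';

const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });

try {
  const result = await client.scrape(
    new ScrapeConfig({ url: 'https://web-scraping.dev/product/1' }),
  );
  console.log(result.result.content);
} catch (err) {
  if (err instanceof ScrapflyError) {
    // SDK-level failure: inspect the message and the monitoring dashboard
    console.error('scrape failed:', err.message);
  } else {
    throw err; // not an SDK error — re-raise
  }
}
```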

## Extraction API

Now that we know how to scrape data using Scrapfly's Web Scraping API, we can start parsing it for information, and for that Scrapfly's [Extraction API](https://scrapfly.io/docs/extraction-api/getting-started) is an ideal choice.

The Extraction API offers three ways to parse data: LLM prompts, AI auto extraction, and custom extraction rules, all available through the `extract()` method and `ExtractionConfig` object of the TypeScript SDK. Let's take a look at some examples.

### LLM Prompts

The Extraction API allows prompting any text content [using LLM prompts](https://scrapfly.io/docs/extraction-api/llm-prompt). The prompts can be used to summarize content, answer questions about the content, or generate structured data like JSON or CSV.

As an example, see this freeform prompt used with the TypeScript SDK:

 ```
import { 
  ScrapflyClient, ScrapeConfig, ScreenshotConfig, ExtractionConfig, log 
} from 'jsr:@scrapfly/scrapfly-sdk';

const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });

// First retrieve your HTML or scrape it using web Scraping API
const html = (await client.scrape(
  new ScrapeConfig({
    url: 'https://web-scraping.dev/product/1',
  })
)).result.content;

// Then, extract data using extraction_prompt parameter:
let api_result = await client.extract(
  new ExtractionConfig({
    body: html,
    content_type: "text/html",
    extraction_prompt: "extract product price only",
  })
)

console.log(api_result);
// will print:
// {
//   content_type: "text/html",
//   data: "19.99",
// }
```
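
Note that the prompt answer comes back as text in `data`, so numeric values need to be coerced before use. A tiny sketch, assuming the string shape printed above:

```typescript
// LLM prompt output is plain text, so numeric answers arrive as strings.
function parsePrice(data: string): number {
  const price = Number.parseFloat(data);
  if (Number.isNaN(price)) throw new Error(`not a numeric price: ${data}`);
  return price;
}

console.log(parsePrice("19.99")); // → 19.99
```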


LLMs are great for freeform or creative questions, but for extracting known data types like products, reviews, etc. there's a better option: AI Auto Extraction. Let's take a look at that next.

### Auto Extraction

 Scrapfly's Extraction API also includes a number of predefined models that can be used to [automatically extract common objects](https://scrapfly.io/docs/extraction-api/automatic-ai) like products, reviews, articles etc. without the need to write custom extraction rules.

 The predefined models are available through the `extraction_model` parameter of the `ExtractionConfig` object. For example, let's use the `product` model:

 ```
import { 
  ScrapflyClient, ScrapeConfig, ScreenshotConfig, ExtractionConfig, log 
} from 'jsr:@scrapfly/scrapfly-sdk';

const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });

// First retrieve your HTML or scrape it using web Scraping API
const html = (await client.scrape(
  new ScrapeConfig({
    url: 'https://web-scraping.dev/product/1',
  })
)).result.content;

// Then, extract data using extraction_model parameter:
let api_result = await client.extract(
  new ExtractionConfig({
    body: html,
    content_type: "text/html",
    extraction_model: "product",
  })
)

console.log(JSON.stringify(api_result));
// will print

```

See the result:

```
{
    "data": {
        "aggregate_rating": null,
        "brand": "ChocoDelight",
        "breadcrumbs": null,
        "canonical_url": null,
        "color": null,
        "description": "Indulge your sweet tooth with our Box of Chocolate Candy. Each box contains an assortment of rich, flavorful chocolates with a smooth, creamy filling. Choose from a variety of flavors including zesty orange and sweet cherry. Whether you're looking for the perfect gift or just want to treat yourself, our Box of Chocolate Candy is sure to satisfy.",
        "identifiers": {
            "ean13": null,
            "gtin14": null,
            "gtin8": null,
            "isbn10": null,
            "isbn13": null,
            "ismn": null,
            "issn": null,
            "mpn": null,
            "sku": null,
            "upc": null
        },
        "images": [
            "https://www.web-scraping.dev/assets/products/orange-chocolate-box-small-1.webp",
            "https://www.web-scraping.dev/assets/products/orange-chocolate-box-small-2.webp",
            "https://www.web-scraping.dev/assets/products/orange-chocolate-box-small-3.webp",
            "https://www.web-scraping.dev/assets/products/orange-chocolate-box-small-4.webp"
        ],
        "main_category": "Products",
        "main_image": "https://www.web-scraping.dev/assets/products/orange-chocolate-box-small-1.webp",
        "name": "Box of Chocolate Candy",
        "offers": [
            {
                "availability": "available",
                "currency": "$",
                "price": 9.99,
                "regular_price": 12.99
            }
        ],
        "related_products": [
            {
                "availability": "available",
                "description": null,
                "images": [
                    {
                        "url": "https://www.web-scraping.dev/assets/products/dragon-potion.webp"
                    }
                ],
                "link": "https://web-scraping.dev/product/6",
                "name": "Dragon Energy Potion",
                "price": {
                    "amount": 4.99,
                    "currency": "$",
                    "raw": "4.99"
                }
            },
            {
                "availability": "available",
                "description": null,
                "images": [
                    {
                        "url": "https://www.web-scraping.dev/assets/products/men-running-shoes.webp"
                    }
                ],
                "link": "https://web-scraping.dev/product/9",
                "name": "Running Shoes for Men",
                "price": {
                    "amount": 49.99,
                    "currency": "$",
                    "raw": "49.99"
                }
            },
            {
                "availability": "available",
                "description": null,
                "images": [
                    {
                        "url": "https://www.web-scraping.dev/assets/products/women-sandals-beige-1.webp"
                    }
                ],
                "link": "https://web-scraping.dev/product/20",
                "name": "Women's High Heel Sandals",
                "price": {
                    "amount": 59.99,
                    "currency": "$",
                    "raw": "59.99"
                }
            },
            {
                "availability": "available",
                "description": null,
                "images": [
                    {
                        "url": "https://www.web-scraping.dev/assets/products/cat-ear-beanie-grey.webp"
                    }
                ],
                "link": "https://web-scraping.dev/product/12",
                "name": "Cat-Ear Beanie",
                "price": {
                    "amount": 14.99,
                    "currency": "$",
                    "raw": "14.99"
                }
            }
        ],
        "secondary_category": null,
        "size": null,
        "specifications": [
            {
                "name": "material",
                "value": "Premium quality chocolate"
            },
            {
                "name": "flavors",
                "value": "Available in Orange and Cherry flavors"
            },
            {
                "name": "sizes",
                "value": "Available in small, medium, and large boxes"
            },
            {
                "name": "brand",
                "value": "ChocoDelight"
            },
            {
                "name": "care instructions",
                "value": "Store in a cool, dry place"
            },
            {
                "name": "purpose",
                "value": "Ideal for gifting or self-indulgence"
            }
        ],
        "style": null,
        "url": "https://web-scraping.dev/",
        "variants": [
            {
                "color": "orange",
                "offers": [
                    {
                        "availability": "available",
                        "price": {
                            "amount": null,
                            "currency": null,
                            "raw": null
                        }
                    }
                ],
                "sku": null,
                "url": "https://web-scraping.dev/product/1?variant=orange-small"
            },
            {
                "color": "orange",
                "offers": [
                    {
                        "availability": "available",
                        "price": {
                            "amount": null,
                            "currency": null,
                            "raw": null
                        }
                    }
                ],
                "sku": null,
                "url": "https://web-scraping.dev/product/1?variant=orange-medium"
            },
            {
                "color": "orange",
                "offers": [
                    {
                        "availability": "available",
                        "price": {
                            "amount": null,
                            "currency": null,
                            "raw": null
                        }
                    }
                ],
                "sku": null,
                "url": "https://web-scraping.dev/product/1?variant=orange-large"
            },
            {
                "color": "cherry",
                "offers": [
                    {
                        "availability": "available",
                        "price": {
                            "amount": null,
                            "currency": null,
                            "raw": null
                        }
                    }
                ],
                "sku": null,
                "url": "https://web-scraping.dev/product/1?variant=cherry-small"
            },
            {
                "color": "cherry",
                "offers": [
                    {
                        "availability": "available",
                        "price": {
                            "amount": null,
                            "currency": null,
                            "raw": null
                        }
                    }
                ],
                "sku": null,
                "url": "https://web-scraping.dev/product/1?variant=cherry-medium"
            },
            {
                "color": "cherry",
                "offers": [
                    {
                        "availability": "available",
                        "price": {
                            "amount": null,
                            "currency": null,
                            "raw": null
                        }
                    }
                ],
                "sku": null,
                "url": "https://web-scraping.dev/product/1?variant=cherry-large"
            }
        ]
    },
    "content_type": "application/json"
}
```


 > For all available types see [Auto Extract Models](https://scrapfly.io/docs/extraction-api/automatic-ai#models) documentation.
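
The structured result can then be post-processed like any JSON. For instance, a small sketch (using only the offer shape shown in the result above) to compute the current discount:

```typescript
// Subset of the "product" model's offer shape from the result above
interface Offer {
  availability: string;
  currency: string;
  price: number;
  regular_price: number;
}

// Percentage saved versus the regular price, rounded to a whole percent
function discountPercent(offer: Offer): number {
  return Math.round((1 - offer.price / offer.regular_price) * 100);
}

const offer: Offer = { availability: "available", currency: "$", price: 9.99, regular_price: 12.99 };
console.log(discountPercent(offer)); // → 23
```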

Auto Extraction is powerful but can fall short in unique niche scenarios where manual extraction is a better fit. For that, let's take a look at Extraction Templates next, which let you define your own extraction rules through a JSON schema.

### Extraction Templates

For more specific data extraction, the Scrapfly Extraction API lets you define custom extraction rules.

This is done through a JSON schema that defines how data is selected (via XPath or CSS selectors) and how it is processed (via predefined processors and formatters).

 This is a great tool for developers who are already familiar with data parsing in web scraping. See this example:

 ```
import { 
  ScrapflyClient, ScrapeConfig, ScreenshotConfig, ExtractionConfig, log 
} from 'jsr:@scrapfly/scrapfly-sdk';

const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });

// First retrieve your HTML or scrape it using web Scraping API
const html = (await client.scrape(
  new ScrapeConfig({
    url: 'https://web-scraping.dev/reviews',
    render_js: true,
    wait_for_selector: ".review",
  })
)).result.content;


// Then create your extraction template
let template = {  
  "source": "html",
  "selectors": [
    {
      "name": "date_posted",
      // use css selectors
      "type": "css",
      "query": "[data-testid='review-date']::text",
      "multiple": true,  // one or multiple?
      // post process results with formatters
      "formatters": [ {
        "name": "datetime",
        "args": {"format": "%Y, %b %d - %A"}
      } ]
    }
  ]
}

let api_result = await client.extract(
  new ExtractionConfig({
    body: html,
    content_type: "text/html",
    ephemeral_template: template,
  })
)



console.log(JSON.stringify(api_result));
// will print
// {
//   "data": {
//     "date_posted": [
//       "2023, May 18 - Thursday",
//       "2023, May 17 - Wednesday",
//       "2023, May 16 - Tuesday",
//       "2023, May 15 - Monday",
//       "2023, May 15 - Monday",
//       "2023, May 12 - Friday",
//       "2023, May 10 - Wednesday",
//       "2023, May 01 - Monday",
//       "2023, May 01 - Monday",
//       "2023, Apr 25 - Tuesday",
//       "2023, Apr 25 - Tuesday",
//       "2023, Apr 18 - Tuesday",
//       "2023, Apr 12 - Wednesday",
//       "2023, Apr 11 - Tuesday",
//       "2023, Apr 10 - Monday",
//       "2023, Apr 10 - Monday",
//       "2023, Apr 09 - Sunday",
//       "2023, Apr 07 - Friday",
//       "2023, Apr 07 - Friday",
//       "2023, Apr 05 - Wednesday"
//     ]
//   },
//   "content_type": "application/json"
// }
```


> For all available selectors, formatters and extractors see [Templates](https://scrapfly.io/docs/extraction-api/rules-and-template#models) documentation.

Above we defined a template that selects review dates using CSS selectors and then re-formats them into a new date format using the datetime formatter.
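
Templates can hold multiple selectors side by side. Extrapolating from the schema above (field names taken from that example, and the `review-text` test id is an assumption; check the Templates docs for the full reference), a template grabbing both dates and review texts might look like:

```typescript
// Hypothetical two-selector template, reusing only fields shown in the
// example above; the second selector's query assumes a
// data-testid='review-text' attribute exists on the page.
const reviewTemplate = {
  source: "html",
  selectors: [
    {
      name: "date_posted",
      type: "css",
      query: "[data-testid='review-date']::text",
      multiple: true,
      formatters: [{ name: "datetime", args: { format: "%Y, %b %d - %A" } }],
    },
    {
      name: "text",
      type: "css",
      query: "[data-testid='review-text']::text",
      multiple: true,
    },
  ],
};

// Pass it as before:
// new ExtractionConfig({ body: html, content_type: "text/html", ephemeral_template: reviewTemplate })
console.log(reviewTemplate.selectors.map((s) => s.name)); // → [ "date_posted", "text" ]
```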

---

With this we can now scrape any page and extract any data we need! To wrap up, let's take a look at another data capture format: the Screenshot API.

## Screenshot API

While it's possible to capture screenshots with the Web Scraping API, Scrapfly also includes a dedicated [Screenshot API](https://scrapfly.io/docs/screenshot-api/getting-started) that significantly streamlines the process.

 The Screenshot API can be accessed through the SDK's `screenshot()` method and configured through the `ScreenshotConfig` configuration object. Here's a basic example:

 ```
import { 
  ScrapflyClient, ScrapeConfig, ScreenshotConfig, ExtractionConfig, log 
} from 'jsr:@scrapfly/scrapfly-sdk';

const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });
let api_result = await client.screenshot(
  new ScreenshotConfig({
    url:"https://web-scraping.dev/product/1"
  })
)

console.log(api_result.image);  // binary image
console.log(api_result.metadata);  // json metadata
```


> The Screenshot API also inherits many features from the Web Scraping API, such as `cache` and `webhook`, which are fully functional.

Here all we did was provide a URL to capture, and the API returned a screenshot.

### Resolution

Next, we can heavily customize how the screenshot is captured. For example, we can change the viewport size from the default `1920x1080` to any other resolution, like `540x1200`, to simulate mobile views:

 ```
import { 
  ScrapflyClient, ScrapeConfig, ScreenshotConfig, ExtractionConfig, log 
} from 'jsr:@scrapfly/scrapfly-sdk';

const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });
let api_result = await client.screenshot(
  new ScreenshotConfig({
    url:"https://web-scraping.dev/product/1",
    resolution: "540x1200",
  })
)

console.log(api_result.image);  // binary image
console.log(api_result.metadata);  // json metadata
```


  Further, we can tell Scrapfly to capture the entire page rather than just the viewport.

### Full Page

Using the [capture](https://scrapfly.io/docs/screenshot-api/getting-started#api_param_capture) parameter we can tell Scrapfly to capture `fullpage`, which captures everything visible on the page.

 ```
import { 
  ScrapflyClient, ScrapeConfig, ScreenshotConfig, ExtractionConfig, log 
} from 'jsr:@scrapfly/scrapfly-sdk';

const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });
let api_result = await client.screenshot(
  new ScreenshotConfig({
    url:"https://web-scraping.dev/product/1",
    capture: "fullpage",
  })
)

console.log(api_result.image);  // binary image
console.log(api_result.metadata);  // json metadata
```


Here, by setting the `capture` parameter to `fullpage`, we've captured the entire page. If the page requires scrolling to load more content, we can capture that too using another parameter.

### Auto Scroll

Just like with the Web Scraping API, we can force automatic scrolling on the page to load dynamic elements that appear on scroll. In this example, we're capturing a screenshot of [web-scraping.dev/testimonials](https://web-scraping.dev/testimonials), which loads new testimonial entries as the user scrolls the page:

 ```
import { 
  ScrapflyClient, ScrapeConfig, ScreenshotConfig, ExtractionConfig, log 
} from 'jsr:@scrapfly/scrapfly-sdk';

const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });
let api_result = await client.screenshot(
  new ScreenshotConfig({
    url:"https://web-scraping.dev/testimonials",
    capture: "fullpage",
    auto_scroll: true,  // scroll to the bottom
  })
)

console.log(api_result.image);  // binary image
console.log(api_result.metadata);  // json metadata
```


Here the API auto-scrolled to the very bottom and loaded all of the testimonials before capturing the screenshot.

Next, we can capture only specific areas of the page. Let's take a look at how.

### Capture Areas

To capture specific areas, we can use XPath or CSS selectors to define what to capture. For this, the `capture` parameter takes a selector for the element to capture.

 For example, we can capture only the reviews section of [web-scraping.dev/product/1](https://web-scraping.dev/product/1) page:

 ```
import { 
  ScrapflyClient, ScrapeConfig, ScreenshotConfig, ExtractionConfig, log 
} from 'jsr:@scrapfly/scrapfly-sdk';

const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });
let api_result = await client.screenshot(
  new ScreenshotConfig({
    url:"https://web-scraping.dev/product/1",
    capture: "#review",  // selector for <div id="review">...</div>
    wait_for_selector: "#review",   
  })
)

console.log(api_result.image);  // binary image
console.log(api_result.metadata);  // json metadata
```


Here, using a CSS selector, we restricted the capture to only the areas relevant to us.

Finally, for more capture configurations we can use screenshot options. Let's take a look at that next.

### Capture Options

Capture options apply various page modifications to capture the page in a specific way. For example, the `block_banners` option blocks cookie banners, and the `dark_mode` option applies a dark theme to the scraped page.

 ```
import { 
  ScrapflyClient, ScrapeConfig, ScreenshotConfig, ExtractionConfig, log, ScreenshotOptions
} from 'jsr:@scrapfly/scrapfly-sdk';

const client = new ScrapflyClient({ key: "{{ YOUR_API_KEY }}" });
let api_result = await client.screenshot(
  new ScreenshotConfig({
    url:"https://web-scraping.dev/login?cookies",
    options: [
      ScreenshotOptions.BLOCK_BANNERS,
      ScreenshotOptions.DARK_MODE,
    ],
  })
)

console.log(api_result.image);  // binary image
console.log(api_result.metadata);  // json metadata
```


In this example we capture the [web-scraping.dev/login?cookies](https://web-scraping.dev/login?cookies) page and block the cookie popup while also applying a dark theme.

---

## What's next?

This concludes our onboarding tutorial, though Scrapfly has many more features and options available. To explore them, see the getting started pages and API specification of each API; all of these features are available in every Scrapfly SDK and package!

For more web scraping techniques and educational material, see the [Scrapfly Web Scraping Academy](https://scrapfly.io/academy).