# Scrapfly Documentation

## Table of Contents

### Dashboard

- [Intro](https://scrapfly.io/docs)
- [Project](https://scrapfly.io/docs/project)
- [Account](https://scrapfly.io/docs/account)
- [Workspace & Team](https://scrapfly.io/docs/workspace-and-team)
- [Billing](https://scrapfly.io/docs/billing)

### Products

#### MCP Server

- [Getting Started](https://scrapfly.io/docs/mcp/getting-started)
- [Tools & API Spec](https://scrapfly.io/docs/mcp/tools)
- [Authentication](https://scrapfly.io/docs/mcp/authentication)
- [Examples & Use Cases](https://scrapfly.io/docs/mcp/examples)
- [FAQ](https://scrapfly.io/docs/mcp/faq)
##### Integrations

- [Overview](https://scrapfly.io/docs/mcp/integrations)
- [Claude Desktop](https://scrapfly.io/docs/mcp/integrations/claude-desktop)
- [Claude Code](https://scrapfly.io/docs/mcp/integrations/claude-code)
- [ChatGPT](https://scrapfly.io/docs/mcp/integrations/chatgpt)
- [Cursor](https://scrapfly.io/docs/mcp/integrations/cursor)
- [Cline](https://scrapfly.io/docs/mcp/integrations/cline)
- [Windsurf](https://scrapfly.io/docs/mcp/integrations/windsurf)
- [Zed](https://scrapfly.io/docs/mcp/integrations/zed)
- [Roo Code](https://scrapfly.io/docs/mcp/integrations/roo-code)
- [VS Code](https://scrapfly.io/docs/mcp/integrations/vscode)
- [LangChain](https://scrapfly.io/docs/mcp/integrations/langchain)
- [LlamaIndex](https://scrapfly.io/docs/mcp/integrations/llamaindex)
- [CrewAI](https://scrapfly.io/docs/mcp/integrations/crewai)
- [OpenAI](https://scrapfly.io/docs/mcp/integrations/openai)
- [n8n](https://scrapfly.io/docs/mcp/integrations/n8n)
- [Make](https://scrapfly.io/docs/mcp/integrations/make)
- [Zapier](https://scrapfly.io/docs/mcp/integrations/zapier)
- [Vapi AI](https://scrapfly.io/docs/mcp/integrations/vapi)
- [Agent Builder](https://scrapfly.io/docs/mcp/integrations/agent-builder)
- [Custom Client](https://scrapfly.io/docs/mcp/integrations/custom-client)


#### Web Scraping API

- [Getting Started](https://scrapfly.io/docs/scrape-api/getting-started)
- [API Specification]()
- [Monitoring](https://scrapfly.io/docs/monitoring)
- [Customize Request](https://scrapfly.io/docs/scrape-api/custom)
- [Debug](https://scrapfly.io/docs/scrape-api/debug)
- [Anti Scraping Protection](https://scrapfly.io/docs/scrape-api/anti-scraping-protection)
- [Proxy](https://scrapfly.io/docs/scrape-api/proxy)
- [Proxy Mode](https://scrapfly.io/docs/scrape-api/proxy-mode)
- [Proxy Mode - Screaming Frog](https://scrapfly.io/docs/scrape-api/proxy-mode/screaming-frog)
- [Proxy Mode - Apify](https://scrapfly.io/docs/scrape-api/proxy-mode/apify)
- [(Auto) Data Extraction](https://scrapfly.io/docs/scrape-api/extraction)
- [Javascript Rendering](https://scrapfly.io/docs/scrape-api/javascript-rendering)
- [Javascript Scenario](https://scrapfly.io/docs/scrape-api/javascript-scenario)
- [SSL](https://scrapfly.io/docs/scrape-api/ssl)
- [DNS](https://scrapfly.io/docs/scrape-api/dns)
- [Cache](https://scrapfly.io/docs/scrape-api/cache)
- [Session](https://scrapfly.io/docs/scrape-api/session)
- [Webhook](https://scrapfly.io/docs/scrape-api/webhook)
- [Screenshot](https://scrapfly.io/docs/scrape-api/screenshot)
- [Errors](https://scrapfly.io/docs/scrape-api/errors)
- [Timeout](https://scrapfly.io/docs/scrape-api/understand-timeout)
- [Throttling](https://scrapfly.io/docs/throttling)
- [Troubleshoot](https://scrapfly.io/docs/scrape-api/troubleshoot)
- [Billing](https://scrapfly.io/docs/scrape-api/billing)
- [FAQ](https://scrapfly.io/docs/scrape-api/faq)

#### Crawler API

- [Getting Started](https://scrapfly.io/docs/crawler-api/getting-started)
- [API Specification]()
- [Retrieving Results](https://scrapfly.io/docs/crawler-api/results)
- [WARC Format](https://scrapfly.io/docs/crawler-api/warc-format)
- [Data Extraction](https://scrapfly.io/docs/crawler-api/extraction-rules)
- [Webhook](https://scrapfly.io/docs/crawler-api/webhook)
- [Billing](https://scrapfly.io/docs/crawler-api/billing)
- [Errors](https://scrapfly.io/docs/crawler-api/errors)
- [Troubleshoot](https://scrapfly.io/docs/crawler-api/troubleshoot)
- [FAQ](https://scrapfly.io/docs/crawler-api/faq)

#### Screenshot API

- [Getting Started](https://scrapfly.io/docs/screenshot-api/getting-started)
- [API Specification]()
- [Accessibility Testing](https://scrapfly.io/docs/screenshot-api/accessibility)
- [Webhook](https://scrapfly.io/docs/screenshot-api/webhook)
- [Billing](https://scrapfly.io/docs/screenshot-api/billing)
- [Errors](https://scrapfly.io/docs/screenshot-api/errors)

#### Extraction API

- [Getting Started](https://scrapfly.io/docs/extraction-api/getting-started)
- [API Specification]()
- [Rules Template](https://scrapfly.io/docs/extraction-api/rules-and-template)
- [LLM Extraction](https://scrapfly.io/docs/extraction-api/llm-prompt)
- [AI Auto Extraction](https://scrapfly.io/docs/extraction-api/automatic-ai)
- [Webhook](https://scrapfly.io/docs/extraction-api/webhook)
- [Billing](https://scrapfly.io/docs/extraction-api/billing)
- [Errors](https://scrapfly.io/docs/extraction-api/errors)
- [FAQ](https://scrapfly.io/docs/extraction-api/faq)

#### Proxy Saver

- [Getting Started](https://scrapfly.io/docs/proxy-saver/getting-started)
- [Fingerprints](https://scrapfly.io/docs/proxy-saver/fingerprints)
- [Optimizations](https://scrapfly.io/docs/proxy-saver/optimizations)
- [SSL Certificates](https://scrapfly.io/docs/proxy-saver/certificates)
- [Protocols](https://scrapfly.io/docs/proxy-saver/protocols)
- [Pacfile](https://scrapfly.io/docs/proxy-saver/pacfile)
- [Secure Credentials](https://scrapfly.io/docs/proxy-saver/security)
- [Billing](https://scrapfly.io/docs/proxy-saver/billing)

#### Cloud Browser API

- [Getting Started](https://scrapfly.io/docs/cloud-browser-api/getting-started)
- [Proxy & Geo-Targeting](https://scrapfly.io/docs/cloud-browser-api/proxy)
- [Unblock API](https://scrapfly.io/docs/cloud-browser-api/unblock)
- [File Downloads](https://scrapfly.io/docs/cloud-browser-api/file-downloads)
- [Session Resume](https://scrapfly.io/docs/cloud-browser-api/session-resume)
- [Human-in-the-Loop](https://scrapfly.io/docs/cloud-browser-api/human-in-the-loop)
- [Debug Mode](https://scrapfly.io/docs/cloud-browser-api/debug-mode)
- [Bring Your Own Proxy](https://scrapfly.io/docs/cloud-browser-api/bring-your-own-proxy)
- [Browser Extensions](https://scrapfly.io/docs/cloud-browser-api/extensions)
##### Integrations

- [Puppeteer](https://scrapfly.io/docs/cloud-browser-api/puppeteer)
- [Playwright](https://scrapfly.io/docs/cloud-browser-api/playwright)
- [Selenium](https://scrapfly.io/docs/cloud-browser-api/selenium)
- [Vercel Agent Browser](https://scrapfly.io/docs/cloud-browser-api/agent-browser)
- [Browser Use](https://scrapfly.io/docs/cloud-browser-api/browser-use)
- [Stagehand](https://scrapfly.io/docs/cloud-browser-api/stagehand)

- [Billing](https://scrapfly.io/docs/cloud-browser-api/billing)
- [Errors](https://scrapfly.io/docs/cloud-browser-api/errors)


### Tools

- [Antibot Detector](https://scrapfly.io/docs/tools/antibot-detector)

### SDK

- [Golang](https://scrapfly.io/docs/sdk/golang)
- [Python](https://scrapfly.io/docs/sdk/python)
- [Rust](https://scrapfly.io/docs/sdk/rust)
- [TypeScript](https://scrapfly.io/docs/sdk/typescript)
- [Scrapy](https://scrapfly.io/docs/sdk/scrapy)

### Integrations

- [Getting Started](https://scrapfly.io/docs/integration/getting-started)
- [LangChain](https://scrapfly.io/docs/integration/langchain)
- [LlamaIndex](https://scrapfly.io/docs/integration/llamaindex)
- [CrewAI](https://scrapfly.io/docs/integration/crewai)
- [Zapier](https://scrapfly.io/docs/integration/zapier)
- [Make](https://scrapfly.io/docs/integration/make)
- [n8n](https://scrapfly.io/docs/integration/n8n)

### Academy

- [Overview](https://scrapfly.io/academy)
- [Web Scraping Overview](https://scrapfly.io/academy/scraping-overview)
- [Tools](https://scrapfly.io/academy/tools-overview)
- [Reverse Engineering](https://scrapfly.io/academy/reverse-engineering)
- [Static Scraping](https://scrapfly.io/academy/static-scraping)
- [HTML Parsing](https://scrapfly.io/academy/html-parsing)
- [Dynamic Scraping](https://scrapfly.io/academy/dynamic-scraping)
- [Hidden API Scraping](https://scrapfly.io/academy/hidden-api-scraping)
- [Headless Browsers](https://scrapfly.io/academy/headless-browsers)
- [Hidden Web Data](https://scrapfly.io/academy/hidden-web-data)
- [JSON Parsing](https://scrapfly.io/academy/json-parsing)
- [Data Processing](https://scrapfly.io/academy/data-processing)
- [Scaling](https://scrapfly.io/academy/scaling)
- [Walkthrough Summary](https://scrapfly.io/academy/walkthrough-summary)
- [Scraper Blocking](https://scrapfly.io/academy/scraper-blocking)
- [Proxies](https://scrapfly.io/academy/proxies)

---

# Rust SDK

 The Rust SDK is an async-first client for the Scrapfly [Web Scraping API](https://scrapfly.io/docs/scrape-api/getting-started), [Screenshot API](https://scrapfly.io/docs/screenshot-api/getting-started), [Extraction API](https://scrapfly.io/docs/extraction-api/getting-started) and [Crawler API](https://scrapfly.io/docs/crawler-api/getting-started). It mirrors the shape of the official Python, TypeScript and Go SDKs so you can reuse the same mental model across your stack.

 The SDK is built on `tokio` and `reqwest` with the following design goals:

- Typed builders for every config (`ScrapeConfig`, `ScreenshotConfig`, `ExtractionConfig`, `CrawlerConfig`)
- Single shared `reqwest::Client` with `rustls` TLS — no OpenSSL dependency
- Categorized `ScrapflyError` enum with sentinel variants for upstream 4xx/5xx, rate-limit, quota, crawler cancel/timeout, etc.
- Zero `unwrap()` / `expect()` in library code — every fallible path returns `Result`
- High-level `Crawl` wrapper with `start` / `wait` / `urls` / `read` / `warc` / `har`
- `concurrent_scrape` returns a `Stream` powered by `buffer_unordered`
- No HTML parser bundled — bring your own (`scraper`, `kuchiki`, ...)
 
> Source code and examples live on [GitHub](https://github.com/scrapfly/rust-scrapfly). Full API reference is published on [docs.rs](https://docs.rs/scrapfly-sdk).

## Installation

 Add the crate to your project with `cargo add`. The SDK is async-only and requires a `tokio` runtime.

 ```
cargo add scrapfly-sdk
cargo add tokio --features full
```

 

   

 

## Quick Use

 Here's a minimal end-to-end example:

 ```
use scrapfly_sdk::{Client, ScrapeConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::builder()
        .api_key("{{ YOUR_API_KEY }}")
        .build()?;

    let cfg = ScrapeConfig::builder("https://web-scraping.dev/product/1")
        .asp(true)             // enable scraper blocking bypass
        .country("us")         // set proxy country
        .render_js(true)       // enable headless browser
        .build()?;

    let result = client.scrape(&cfg).await?;

    // 1) access scraped HTML content
    println!("{}", result.result.content);
    // 2) inspect status code, metadata, response headers
    println!("status = {}", result.result.status_code);
    Ok(())
}
```

 

   

 

 Create a client with `Client::builder().api_key(...).build()?`, then call `client.scrape(&cfg).await?` with a `ScrapeConfig` built from the fluent builder. The returned `ScrapeResult` contains the page content, status code, response headers, browser data (when `render_js` is on) and full request metadata.

 The Rust SDK deliberately does **not** bundle an HTML parser. Pair it with [`scraper`](https://crates.io/crates/scraper) or any crate you already use for CSS/XPath selection.
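
 As a sketch of that pairing, here is the Quick Use result fed into the [`scraper`](https://crates.io/crates/scraper) crate (add it with `cargo add scraper`). The CSS selector is illustrative; adjust it to the page you scrape.

 ```
use scraper::{Html, Selector};

let result = client.scrape(&cfg).await?;

// Parse the scraped HTML returned by Scrapfly.
let doc = Html::parse_document(&result.result.content);

// Select elements with a CSS selector (illustrative; pick one for your page).
let selector = Selector::parse("h3").expect("valid CSS selector");
for heading in doc.select(&selector) {
    println!("{}", heading.text().collect::<String>().trim());
}
```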

## Configuring Scrape

 `ScrapeConfig` uses a fluent builder that exposes every feature of the Scrapfly Web Scraping API. Chain the methods you need and call `.build()?` at the end:

> When scraping websites protected against web scraping, **make sure to enable [Anti Scraping Protection bypass](https://scrapfly.io/docs/onboarding#asp)** using `.asp(true)`.

 ```
use scrapfly_sdk::{Client, ScrapeConfig, HttpMethod};
use std::collections::BTreeMap;

let mut headers = BTreeMap::new();
headers.insert("X-Csrf-Token".to_string(), "1234".to_string());

let cfg = ScrapeConfig::builder("https://web-scraping.dev/product/1")
    // Request details
    .method(HttpMethod::Get)
    .headers(headers)

    // enable scraper blocking bypass (recommended)
    .asp(true)
    .country("us,ca,fr")

    // enable cache (recommended when developing)
    .cache(true)
    .cache_ttl(3600)          // expire cache in 1h (default 24h)
    .debug(true)              // enable debug info in dashboard

    // enable javascript rendering
    .render_js(true)
    .wait_for_selector(".review")
    .rendering_wait(5000)     // 5 seconds
    .js("return document.title")
    .auto_scroll(true)

    .build()?;

let result = client.scrape(&cfg).await?;
```

 

   

 

 For the full list of options, see the [API specification](https://scrapfly.io/docs/scrape-api/getting-started#spec) — the builder method names match the API parameter names one-to-one.

## Handling Result

 `ScrapeResult` contains the response, scrape metadata and (when `render_js` is enabled) browser data as `serde_json::Value`:

 ```
let result = client.scrape(&cfg).await?;

// response body (HTML) and status code
let _html: &str     = &result.result.content;
let _status: u16    = result.result.status_code;

// response headers
let _headers = &result.result.response_headers;

// log url to view this scrape in your Scrapfly dashboard
let _log_url = &result.result.log_url;

// if render_js was on, browser_data is populated (serde_json::Value)
let _js_result = &result.result.browser_data["javascript_evaluation_result"];
// collected iframe content (separate field on ResultData)
let _iframes = &result.result.iframes;
```
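
 Because `browser_data` is a plain `serde_json::Value`, the usual `serde_json` accessors apply. A minimal sketch, assuming the `.js(...)` snippet returned a string (for example `return document.title`):

 ```
// `javascript_evaluation_result` holds whatever the `.js(...)` snippet returned;
// here we assume it evaluated to a string.
if let Some(title) = result.result.browser_data["javascript_evaluation_result"].as_str() {
    println!("evaluated title: {}", title);
}
```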

 

   

 

## Concurrent Scraping

 `Client::concurrent_scrape` takes a `Vec<ScrapeConfig>` plus a concurrency limit and returns a `Stream` of results. Consume it with `StreamExt::next` or any stream combinator:

 ```
use futures_util::StreamExt;
use scrapfly_sdk::{Client, ScrapeConfig};

let configs: Vec<ScrapeConfig> = (1..=5)
    .map(|i| {
        ScrapeConfig::builder(format!("https://httpbin.dev/anything?i={}", i))
            .build()
            .expect("build")
    })
    .collect();

let mut stream = client.concurrent_scrape(configs, 3); // 3 in-flight at a time
while let Some(result) = stream.next().await {
    match result {
        Ok(r)  => println!("ok: {}", r.result.status_code),
        Err(e) => eprintln!("err: {}", e),
    }
}
```
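
 Any other `StreamExt` combinator works too. For example, a small sketch that drives the whole stream and keeps only the successful results (assuming a `configs` vector built as above):

 ```
use futures_util::StreamExt;

// Drive the stream to completion, discarding failed scrapes.
let results: Vec<_> = client
    .concurrent_scrape(configs, 3)
    .filter_map(|r| async move { r.ok() })
    .collect()
    .await;

println!("scraped {} pages successfully", results.len());
```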

 

   

 

## Getting Account Details

 Use `client.account()` to fetch subscription info and remaining quota:

 ```
let account = client.account().await?;
println!("{:#?}", account);
```

 

   

 

## Examples

### Custom Headers

 Pass request headers via `.headers(...)` on the builder. When `asp=true`, Scrapfly may add extra headers automatically to bypass anti-bot protection.

 ```
use std::collections::BTreeMap;

let mut headers = BTreeMap::new();
headers.insert("X-My-Header".to_string(), "foo".to_string());

let cfg = ScrapeConfig::builder("https://httpbin.dev/headers")
    .headers(headers)
    .build()?;

let res = client.scrape(&cfg).await?;
println!("{}", res.result.content);
```

 

   

 

### POST Form

 To POST form data, set the method, the body, and the `content-type` header:

 ```
use std::collections::BTreeMap;
use scrapfly_sdk::HttpMethod;

let mut headers = BTreeMap::new();
headers.insert(
    "content-type".to_string(),
    "application/x-www-form-urlencoded".to_string(),
);

let cfg = ScrapeConfig::builder("https://httpbin.dev/post")
    .method(HttpMethod::Post)
    .headers(headers)
    .body("foo=bar&baz=qux")
    .build()?;

let res = client.scrape(&cfg).await?;
println!("{}", res.result.content);
```

 

   

 

### POST JSON

 To POST JSON, serialize your payload with `serde_json` and set the `content-type` header:

 ```
use std::collections::BTreeMap;
use scrapfly_sdk::HttpMethod;
use serde_json::json;

let mut headers = BTreeMap::new();
headers.insert(
    "content-type".to_string(),
    "application/json".to_string(),
);

let body = json!({"foo": "bar"}).to_string();

let cfg = ScrapeConfig::builder("https://httpbin.dev/post")
    .method(HttpMethod::Post)
    .headers(headers)
    .body(body)
    .build()?;

let res = client.scrape(&cfg).await?;
println!("{}", res.result.content);
```

 

   

 

### JavaScript Rendering

 To render pages with a headless browser using [JavaScript Rendering](https://scrapfly.io/docs/scrape-api/javascript-rendering#spec), enable `render_js`:

 ```
let cfg = ScrapeConfig::builder("https://web-scraping.dev/product/1")
    .render_js(true)
    .wait_for_selector(".review")  // wait for element to appear
    .rendering_wait(5000)           // or wait a fixed amount of time
    .build()?;

let res = client.scrape(&cfg).await?;
println!("{}", res.result.content);
```

 

   

 

### JavaScript Scenario

 To run a [JavaScript Scenario](https://scrapfly.io/docs/scrape-api/javascript-scenario), pass a serialized scenario via `.js_scenario(...)`. The Rust SDK accepts any `serde_json::Value`, so you can build the scenario with the `json!` macro:

 ```
use serde_json::json;

let scenario = json!([
    { "wait_for_selector": { "selector": ".review", "timeout": 5000 } },
    { "click": { "selector": "#load-more-reviews" } },
    { "wait_for_navigation": { "timeout": 10000 } },
    { "execute": { "script": "return document.title" } }
]);

let cfg = ScrapeConfig::builder("https://web-scraping.dev/product/1")
    .render_js(true)
    .debug(true)
    .js_scenario(scenario)
    .build()?;

let res = client.scrape(&cfg).await?;
// browser_data is a serde_json::Value — index into it for scenario results
println!("{:#?}", res.result.browser_data["js_scenario"]);
```

 

   

 

### Scraping Binary Data

 Binary responses (images, PDFs, ...) are returned as **base64-encoded** strings in `result.content`. Decode them with the `base64` crate:

 ```
use base64::{engine::general_purpose, Engine as _};
use std::fs;

let cfg = ScrapeConfig::builder(
    "https://web-scraping.dev/assets/products/orange-chocolate-box-small-1.webp",
)
.build()?;

let res = client.scrape(&cfg).await?;
let bytes = general_purpose::STANDARD.decode(&res.result.content)?;
fs::write("image.webp", bytes)?;
```

 

   

 

### Full Documentation

 For the full, always-up-to-date Rust API reference (every method, every type, every enum variant) see the [docs.rs page](https://docs.rs/scrapfly-sdk).

## Screenshot API

 Use `client.screenshot()` with a `ScreenshotConfig`. The result contains the raw image bytes in `result.image`, plus a `save()` helper that writes the image to disk with the right extension:

 ```
use scrapfly_sdk::{Client, ScreenshotConfig};

let cfg = ScreenshotConfig::builder("https://web-scraping.dev/product/1")
    .build()?;

let result = client.screenshot(&cfg).await?;
let path = result.save("product", None)?;
println!("saved: {} ({} bytes)", path.display(), result.image.len());
```
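
 Because `result.image` holds the raw bytes, you are not limited to `save()`; anything that accepts a byte slice works. A minimal sketch writing the bytes manually (the filename and extension are illustrative; match them to the format you requested):

 ```
use std::fs;

// Equivalent to `save()`, but with full control over the output path.
fs::write("product-screenshot.jpg", &result.image)?;
```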

 

   

 

## Extraction API

 Use `client.extract()` with an `ExtractionConfig`. You can use a predefined AI model, a saved template, or a free-form LLM prompt:

 ```
use scrapfly_sdk::{Client, ExtractionConfig, ExtractionModel};

// Predefined AI model
let html = b"...<h1>Orange Chocolate Box</h1><span>$9.99</span>...".to_vec();
let cfg = ExtractionConfig::builder(html, "text/html")
    .extraction_model(ExtractionModel::Product)
    .build()?;

let result = client.extract(&cfg).await?;
println!("{}", serde_json::to_string_pretty(&result.data)?);

// Free-form LLM prompt
let html2 = b"...<p>GPU: RTX 5090, 24 GB VRAM...</p>...".to_vec();
let cfg2 = ExtractionConfig::builder(html2, "text/html")
    .extraction_prompt("Extract GPU name, clock speed and VRAM as JSON")
    .build()?;

let result2 = client.extract(&cfg2).await?;
println!("{}", serde_json::to_string_pretty(&result2.data)?);
```

 

   

 

## Crawler API

 Use the high-level `Crawl` wrapper for a simple start/wait/read flow. It owns the crawl UUID and exposes `start`, `wait`, `status`, `urls`, `read`, `warc` and `har` helpers. [web-scraping.dev](https://web-scraping.dev) is used as the target in the examples below — it is a public sandbox that welcomes automated crawls.

 ```
use std::time::Duration;
use scrapfly_sdk::{Client, Crawl, CrawlerConfig, WaitOptions};

let config = CrawlerConfig::builder("https://web-scraping.dev/products")
    .page_limit(10)
    .max_depth(2)
    .asp(true)
    .build()?;

let mut crawl = Crawl::new(&client, config);
crawl.start().await?;
println!("uuid = {}", crawl.uuid());

crawl
    .wait(WaitOptions {
        poll_interval: Duration::from_secs(3),
        max_wait: Some(Duration::from_secs(300)),
        ..Default::default()
    })
    .await?;

let status = crawl.status(false).await?;
println!("visited {} urls", status.state.urls_visited);
```

 

   

 

### List Crawled URLs

 Paginate through the list of URLs the crawler visited, skipped, or failed on. Increment `page` until the response is empty:

 ```
// Default status filter is "visited"
let visited = crawl.urls(Some("visited"), 1, 100).await?;
for entry in &visited.urls {
    println!("{}", entry.url);
}

// Failed URLs include the reason on each entry
let failed = crawl.urls(Some("failed"), 1, 100).await?;
for entry in &failed.urls {
    println!("{} -> {:?}", entry.url, entry.reason);
}
```
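
 To walk every page rather than just the first one, keep incrementing the page number until an empty page comes back. A minimal sketch:

 ```
// Collect every visited URL, 100 per page, until the API returns an empty page.
let mut page = 1;
let mut all_urls = Vec::new();
loop {
    let batch = crawl.urls(Some("visited"), page, 100).await?;
    if batch.urls.is_empty() {
        break;
    }
    all_urls.extend(batch.urls.into_iter().map(|entry| entry.url));
    page += 1;
}
println!("collected {} visited urls", all_urls.len());
```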

 

   

 

### Read a Single Page's Content

 `crawl.read(url, format)` fetches the rendered content of one page in plain mode (no JSON envelope). Use `crawl.read_string(...)` for the string-only shortcut, or `crawl.read_batch(...)` to fetch up to 100 URLs in a single round-trip:

 ```
use scrapfly_sdk::CrawlerContentFormat;

// Single URL — returns raw bytes + metadata
let content = crawl
    .read("https://web-scraping.dev/products", CrawlerContentFormat::Markdown)
    .await?;
if let Some(c) = content {
    println!("{}", &c.content[..200.min(c.content.len())]);
}

// Batch read (≤100 URLs per call)
let batch = crawl
    .read_batch(
        &[
            "https://web-scraping.dev/products".to_string(),
            "https://web-scraping.dev/product/1".to_string(),
        ],
        &[CrawlerContentFormat::Markdown],
    )
    .await?;
for (url, formats) in &batch {
    for (format, body) in formats {
        println!("{} [{}] -> {} chars", url, format, body.len());
    }
}
```

 

   

 

### Download WARC and HAR Artifacts

 WARC archives every HTTP exchange (request + response + body) on the wire. HAR captures network timings, headers and response bodies in a JSON-friendly format. Both are returned as raw bytes you can save, stream, or feed into your own parser:

 ```
// WARC: raw bytes
let warc = crawl.warc().await?;
println!("WARC: {} bytes", warc.data.len());
std::fs::write("crawl.warc.gz", &warc.data)?;

// HAR: same shape, JSON-friendly
let har = crawl.har().await?;
std::fs::write("crawl.har", &har.data)?;
```
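
 Since HAR is plain JSON, you can also inspect it in memory instead of writing it to disk. A small sketch counting the captured requests, assuming the standard HAR layout with a top-level `log.entries` array:

 ```
use serde_json::Value;

// Standard HAR files follow the `{ "log": { "entries": [...] } }` layout.
let har_json: Value = serde_json::from_slice(&har.data)?;
if let Some(entries) = har_json["log"]["entries"].as_array() {
    println!("HAR captured {} HTTP exchanges", entries.len());
}
```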

 

   

 

### Cancel a Running Crawl

 Stop a crawler before it reaches its natural end. The status will transition to `CANCELLED` with `state.stop_reason = "user_cancelled"`:

 ```
// ... crawl.start().await? ...

// Later, from a signal handler, a timer, or another task:
crawl.cancel().await?;

// AllowCancelled lets `wait` return Ok on a cancellation we triggered ourselves,
// instead of bubbling ScrapflyError::CrawlerCancelled.
crawl
    .wait(WaitOptions {
        allow_cancelled: true,
        ..Default::default()
    })
    .await?;

let status = crawl.status(true).await?;
println!("status: {:?}", status.status);
if let Some(reason) = &status.state.stop_reason {
    println!("stop_reason: {}", reason);
}
```

 

   

 

## Cloud Browser API

 The [Cloud Browser API](https://scrapfly.io/docs/cloud-browser-api/getting-started) gives you a live WebSocket connection to a real, pre-configured browser running on Scrapfly infrastructure — ideal for driving [`chromiumoxide`](https://github.com/mattsse/chromiumoxide) or any other [Chrome DevTools Protocol](https://chromedevtools.github.io/devtools-protocol/) client from Rust. The Rust SDK exposes two entry points:

- `client.cloud_browser_url(&BrowserConfig { .. })` — builds a `wss://` URL you can pass directly to a CDP client
- `client.cloud_browser_unblock(&UnblockConfig { .. }).await?` — runs the anti-bot bypass flow first and returns an `UnblockResult` with the `ws_url` of the primed session
 
 Unlike the other configs in this SDK, `BrowserConfig` and `UnblockConfig` are plain structs with public fields — use struct-literal syntax with `..Default::default()` rather than a fluent builder.

### Build a Cloud Browser WebSocket URL

 Use `cloud_browser_url(...)` when you want to drive a fresh browser session yourself, with full control over the navigation:

 ```
use scrapfly_sdk::{BrowserConfig, Client};

let client = Client::builder().api_key("{{ YOUR_API_KEY }}").build()?;

let ws_url = client.cloud_browser_url(&BrowserConfig {
    country: Some("us".into()),
    os: Some("linux".into()),
    block_images: true,
    block_fonts: true,
    cache: true,
    ..Default::default()
});

// Hand `ws_url` off to your CDP client of choice, e.g. chromiumoxide:
//
//   use chromiumoxide::Browser;
//   let (browser, mut handler) = Browser::connect(&ws_url).await?;
//   tokio::spawn(async move { while handler.next().await.is_some() {} });
//   let page = browser.new_page("https://web-scraping.dev/product/1").await?;
//   let title = page.get_title().await?;
println!("ws_url = {}", ws_url);
```

 

   

 

### Unblock a Target and Get a Primed Session

 `cloud_browser_unblock(&UnblockConfig { .. })` calls `POST /unblock`, runs Scrapfly's ASP flow against the target URL, and returns an `UnblockResult` containing the `ws_url` of a browser session that is already past the anti-bot challenge. You then connect to `ws_url` with your CDP client and start driving the primed page:

 ```
use scrapfly_sdk::{Client, UnblockConfig};

let client = Client::builder().api_key("{{ YOUR_API_KEY }}").build()?;

let result = client
    .cloud_browser_unblock(&UnblockConfig {
        url: "https://web-scraping.dev/product/1".into(),
        country: Some("us".into()),
        timeout: Some(30_000),          // navigation timeout (ms)
        browser_timeout: Some(120_000), // session timeout (ms)
    })
    .await?;

println!("ws_url     = {}", result.ws_url);
println!("session_id = {}", result.session_id);
println!("run_id     = {}", result.run_id);

// `result.ws_url` is a primed CDP endpoint — connect with chromiumoxide (or
// any other CDP client) and start scraping the page that already got past
// the anti-bot challenge:
//
//   use chromiumoxide::Browser;
//   let (browser, mut handler) = Browser::connect(&result.ws_url).await?;
//   tokio::spawn(async move { while handler.next().await.is_some() {} });
//   let page = browser.pages().await?.into_iter().next().unwrap();
//   let html = page.content().await?;

// When you're done, stop the session to release the browser back to the pool.
client.cloud_browser_session_stop(&result.session_id).await?;
```

 

   

 

 Always call `client.cloud_browser_session_stop(&session_id).await?` when you are done — otherwise the session stays alive until the server-side `browser_timeout` expires and keeps billing against your plan.
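
 One way to make that cleanup hard to forget is to keep the CDP work in its own fallible step and stop the session afterwards regardless of the outcome. A sketch of that pattern; the `drive_browser` helper is hypothetical and stands in for your own CDP logic:

 ```
let result = client
    .cloud_browser_unblock(&UnblockConfig {
        url: "https://web-scraping.dev/product/1".into(),
        country: Some("us".into()),
        timeout: Some(30_000),
        browser_timeout: Some(120_000),
    })
    .await?;

// Run the actual CDP work without letting an early `?` skip the cleanup.
// `drive_browser` is a placeholder for whatever you do with the primed session.
let work: Result<(), Box<dyn std::error::Error>> = drive_browser(&result.ws_url).await;

// Always release the browser back to the pool, success or failure.
client.cloud_browser_session_stop(&result.session_id).await?;

// Only now propagate the outcome of the scraping work.
work?;
```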

## Error Handling

 Every fallible call returns `Result<T, ScrapflyError>`. The `ScrapflyError` enum has sentinel variants so you can pattern-match on the exact failure category instead of parsing error messages:

- `Transport` — network / TCP / TLS failure (`reqwest::Error`)
- `Json` — failed to serialize or deserialize API payload
- `BadApiKey` — `401 Unauthorized` from Scrapfly
- `ApiClient(ApiError)` / `ApiServer(ApiError)` — `4xx` / `5xx` from Scrapfly itself
- `UpstreamClient(ApiError)` / `UpstreamServer(ApiError)` — the target website responded with `4xx` / `5xx`
- `TooManyRequests` / `QuotaLimitReached` — rate-limited or out of credits
- `AspBypassFailed` / `ProxyFailed` / `ScrapeFailed` — scrape-level failures
- `CrawlerFailed` / `CrawlerCancelled` / `CrawlerTimeout` — crawler-specific
 
 The `ApiError` struct exposes `code`, `message`, `http_status`, `documentation_url`, `hint` and `retry_after_ms` — enough to drive retry logic or surface a precise message to the user:

 ```
use scrapfly_sdk::{Client, ScrapeConfig, ScrapflyError};

let cfg = ScrapeConfig::builder("https://httpbin.dev/status/403").build()?;

match client.scrape(&cfg).await {
    Ok(result) => println!("ok: {}", result.result.status_code),
    Err(ScrapflyError::UpstreamClient(e)) => {
        // Target website responded 4xx
        eprintln!("upstream 4xx: {} ({})", e.http_status, e.code);
    }
    Err(ScrapflyError::TooManyRequests(e)) => {
        // Honor retry_after_ms from the error body
        eprintln!("rate limited, retry after {} ms", e.retry_after_ms);
    }
    Err(ScrapflyError::QuotaLimitReached(_)) => {
        eprintln!("out of scrape credits — upgrade your plan");
    }
    Err(e) => eprintln!("other error: {}", e),
}
```
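
 Those fields are enough to build a simple retry loop. A sketch that retries rate-limited scrapes a few times, sleeping for the interval suggested by the API (assuming `retry_after_ms` is a plain millisecond count, as the name suggests):

 ```
use std::time::Duration;
use scrapfly_sdk::ScrapflyError;

let mut attempts = 0;
let outcome = loop {
    match client.scrape(&cfg).await {
        // Rate limited: honor the server-suggested delay, then try again.
        Err(ScrapflyError::TooManyRequests(e)) if attempts < 3 => {
            attempts += 1;
            tokio::time::sleep(Duration::from_millis(e.retry_after_ms)).await;
        }
        // Success or any other error ends the loop.
        other => break other,
    }
};

match outcome {
    Ok(r) => println!("ok: {}", r.result.status_code),
    Err(e) => eprintln!("giving up: {}", e),
}
```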

 

   

 

## Resources

- [crates.io — scrapfly-sdk](https://crates.io/crates/scrapfly-sdk)
- [docs.rs — API reference](https://docs.rs/scrapfly-sdk)
- [GitHub — source & examples](https://github.com/scrapfly/rust-scrapfly)
- [Web Scraping API reference](https://scrapfly.io/docs/scrape-api/getting-started)
- [Screenshot API reference](https://scrapfly.io/docs/screenshot-api/getting-started)
- [Extraction API reference](https://scrapfly.io/docs/extraction-api/getting-started)
- [Crawler API reference](https://scrapfly.io/docs/crawler-api/getting-started)
- [Cloud Browser API reference](https://scrapfly.io/docs/cloud-browser-api/getting-started)