 # LlamaIndex

Power up your LLM with web scraping

##  LlamaIndex + Scrapfly. Web data, no code. 

- **Official integration.** Native LlamaIndex app maintained by Scrapfly, always current with the latest API features.
- **Every Scrapfly product.** Web Scraping, Extraction, Screenshot, and MCP: one connection, all capabilities.
- **No code required.** Drop Scrapfly actions into your LlamaIndex workflow; connect to 7,000+ other apps.
 
 [See on LlamaIndex](https://scrapfly.io/docs/integration/llamaindex) [Integration docs](https://scrapfly.io/docs/integration/llamaindex) 

 1,000 free Scrapfly credits. No credit card required. 

 

// AT A GLANCE

**OAuth** one-time auth

**4** actions

**0** code required

---

## LlamaIndex + Every Scrapfly Product

One integration. Four product surfaces. Same engine powering 5B+ monthly scrapes, exposed as LlamaIndex actions.

 

 ### Every Scrapfly API. Inside LlamaIndex.

One authentication, four product surfaces. Scrape any page, extract structured data, capture screenshots, or drive an agent, all from your LlamaIndex workflow. No SDK install, no secrets rotation plumbing.

**4** Scrapfly APIs exposed

**`asp=True`** one flag bypasses 8 anti-bot vendors

**1,000** free credits to evaluate

**0** charges on failed requests

 

 



 

 

### Web Scraping API: Any Page, Any Vendor

Drop a *Scrape* action into your LlamaIndex workflow and pass a URL. Scrapfly handles the request: stealth Chromium via [Scrapium](https://scrapfly.io/scrapium), byte-perfect Chrome TLS via [Curlium](https://scrapfly.io/curlium), and Cloudflare, Akamai, and DataDome bypass via `asp=True`.

- JS rendering
- Cookie + session
- 8 anti-bot vendors
- Geo proxy rotation

[ASP docs](https://scrapfly.io/docs/scrape-api/anti-scraping-protection)

[Bypass coverage](https://scrapfly.io/bypass)
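As a rough sketch of what a *Scrape* step could look like in Python, the snippet below assumes the `ScrapflyReader` class from the `llama-index-readers-web` package. The import path, the `load_data` arguments, and the `render_js` and `country` option names are assumptions drawn from the public LlamaIndex and Scrapfly docs, not from this page; the helper names `build_scrape_config` and `load_documents` are hypothetical.

```python
def build_scrape_config(country: str = "us", render_js: bool = True) -> dict:
    """Options forwarded to Scrapfly for every URL (option names are assumptions)."""
    return {
        "asp": True,             # anti-scraping protection bypass
        "render_js": render_js,  # render the page in a headless browser
        "country": country,      # proxy geolocation
    }


def load_documents(urls: list[str], api_key: str):
    """Scrape each URL and return LlamaIndex Document objects."""
    # Imported lazily so build_scrape_config() can be used without the package installed.
    from llama_index.readers.web import ScrapflyReader  # assumed import path

    reader = ScrapflyReader(api_key=api_key, ignore_scrape_failures=True)
    return reader.load_data(
        urls=urls,
        scrape_format="markdown",
        scrape_config=build_scrape_config(),
    )
```

Calling `load_documents(["https://httpbin.dev/html"], api_key)` with a real key should return one document per URL, ready to pass to an index or query engine.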

 

 



 

 ### Why Hand-Rolled Scrapers Fail

Teams that cobble together scraping inside LlamaIndex using HTTP actions + headless browsers hit a wall within weeks.

| Approach | Result |
|---|---|
| **HTTP action** | blocked at TLS |
| **Headless browser add-on** | fingerprint leaks |
| **Custom code step** | breaks on every vendor update |
| **Puppeteer service** | no anti-bot bypass |
| **Scrapfly** | tracked daily, 94–98% |

 



 

 

 ### Screenshot API

Capture any page as a full-page, element, or viewport shot. Output as PNG, JPEG, or WebP. Ads and pop-ups are suppressed automatically.

 [Screenshot API](https://scrapfly.io/products/screenshot-api) 
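To illustrate the capture and format options, here is a minimal sketch that builds a Screenshot API request URL using only the standard library. The endpoint `https://api.scrapfly.io/screenshot` and the `format` and `capture` parameter names and values are assumptions to check against the Screenshot API docs; `build_screenshot_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

SCREENSHOT_ENDPOINT = "https://api.scrapfly.io/screenshot"  # assumed endpoint
ALLOWED_FORMATS = {"png", "jpg", "webp"}  # assumed values for PNG, JPEG, WebP


def build_screenshot_url(api_key: str, target_url: str,
                         image_format: str = "png",
                         capture: str = "fullpage") -> str:
    """Return a GET URL requesting a screenshot of target_url."""
    if image_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {image_format}")
    query = urlencode({
        "key": api_key,
        "url": target_url,
        "format": image_format,
        "capture": capture,  # assumed: "fullpage", "viewport", or a CSS selector
    })
    return f"{SCREENSHOT_ENDPOINT}?{query}"
```

Fetching the resulting URL with any HTTP client should return the image bytes, which a later workflow step can store or compare.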

 



 

### Extraction API: Structured Data via Prompt or Schema

Turn HTML into JSON inside your LlamaIndex workflow. Use an LLM prompt or schema validation and get the same deterministic output envelope every time.

- LLM prompt extract
- Schema validation
- Auto extract

[Extraction API](https://scrapfly.io/products/extraction-api)
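A prompt-based extraction call might be prepared as in the sketch below, which builds the HTTP request without sending it. The endpoint `https://api.scrapfly.io/extraction` and the `content_type` and `extraction_prompt` parameters are assumptions to verify against the Extraction API docs; `build_extraction_request` is a hypothetical helper.

```python
from urllib.parse import urlencode
from urllib.request import Request

EXTRACTION_ENDPOINT = "https://api.scrapfly.io/extraction"  # assumed endpoint


def build_extraction_request(api_key: str, html: str, prompt: str) -> Request:
    """Prepare a POST request that sends HTML plus an LLM prompt for extraction."""
    query = urlencode({
        "key": api_key,
        "content_type": "text/html",  # assumed parameter: type of the posted document
        "extraction_prompt": prompt,  # assumed parameter: instruction for the LLM
    })
    return Request(
        url=f"{EXTRACTION_ENDPOINT}?{query}",
        data=html.encode("utf-8"),    # the document to extract from is the request body
        headers={"Content-Type": "text/html"},
        method="POST",
    )
```

Passing the prepared request to `urllib.request.urlopen` would send it; the response is expected to be the JSON envelope described above.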

 

 



 

 ### MCP Server

Natural-language control via Claude, ChatGPT, or LlamaIndex AI steps. No workflow configuration needed for the model.

 [MCP docs](https://scrapfly.io/docs/mcp/getting-started) 

 



 

 

 ### Workflow Recipes

Common LlamaIndex + Scrapfly patterns you can copy today.

- **Daily scrape → AI extract → Slack:** timed trigger, Scrape action, Extraction with prompt, Slack channel post.
- **Google Sheet row → Scrape → append row:** row-added trigger, Scrape, Auto-extract, write back to the sheet.
- **Monitor competitor page → Screenshot → email diff:** hourly trigger, Screenshot, image diff, trigger email on change.
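For the monitoring recipe, the change-detection step can be sketched with a plain content hash. This is a simplified stand-in for a real image diff, under the assumption that any byte-level difference counts as a change; `fingerprint` and `has_changed` are hypothetical helper names.

```python
import hashlib
from typing import Optional


def fingerprint(image_bytes: bytes) -> str:
    """Return a SHA-256 hex digest of the screenshot bytes."""
    return hashlib.sha256(image_bytes).hexdigest()


def has_changed(previous_digest: Optional[str], image_bytes: bytes) -> tuple[bool, str]:
    """Compare a new screenshot against the stored digest.

    Returns (changed, current_digest). The first run, with no stored
    digest, is never reported as a change.
    """
    current = fingerprint(image_bytes)
    changed = previous_digest is not None and previous_digest != current
    return changed, current
```

Store the returned digest after each hourly run and send the email only when `changed` is true. An exact hash flags even a one-pixel difference, so a perceptual comparison may suit pages with rotating banners or timestamps better.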

 

 

 



 

 ### When to Pick LlamaIndex vs SDK

LlamaIndex wins on orchestration breadth. The SDK wins on tight loops + custom logic.

| Scenario | Pick |
|---|---|
| **Under 100k req/mo** | LlamaIndex |
| **Needs other SaaS triggers** | LlamaIndex |
| **Tight scraping loop** | SDK |
| **Complex branching logic** | SDK |
| **CI / cron pipelines** | SDK |

 [Python SDK](https://scrapfly.io/docs/sdk/python) 



 

 

 

---

## Scrape from LlamaIndex in Five Minutes

Three steps.

 

### Step 1: Create a Free Scrapfly Account

[Try it for free](https://scrapfly.io/register) and get your API key from your Scrapfly dashboard.

### Step 2: Install Python Packages

Run `pip install llama-index scrapfly-sdk` to set up your workspace. See the [usage docs](https://scrapfly.io/docs/integration/llamaindex#usage) for details.

### Step 3: See Some Usage Examples

Our docs contain [ready-made examples](https://scrapfly.io/docs/integration/llamaindex#examples) and tips on how to optimize your LlamaIndex programs.

 

 

---

## Same API, Direct or via LlamaIndex

Under the hood, LlamaIndex actions call the same Scrapfly endpoint. Prefer code? Pick a language; every example targets the same `asp=True` path.

 

Set `asp=True` and Scrapfly handles Cloudflare, Akamai, DataDome, and five other vendors. See the full [bypass catalog](https://scrapfly.io/bypass).

```python
from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse
client = ScrapflyClient(key="API KEY")

api_response: ScrapeApiResponse = client.scrape(
    ScrapeConfig(
        url='https://httpbin.dev/html',
        # bypass anti-scraping protection
        asp=True
    )
)
print(api_response.result)
```

```typescript
import { 
    ScrapflyClient, ScrapeConfig 
} from 'jsr:@scrapfly/scrapfly-sdk';

const client = new ScrapflyClient({ key: "API KEY" });
let api_result = await client.scrape(
    new ScrapeConfig({
        url: 'https://httpbin.dev/html',
        // bypass anti-scraping protection
        asp: true,
    })
);
console.log(api_result.result);
```

```go
package main

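import (
	"fmt"
	"log"

	"github.com/scrapfly/go-scrapfly"
)

func main() {
	// Stop early if the client cannot be created (for example, a bad key).
	client, err := scrapfly.New("API KEY")
	if err != nil {
		log.Fatal(err)
	}
	result, err := client.Scrape(&scrapfly.ScrapeConfig{
		URL: "https://httpbin.dev/html",
		// bypass anti-scraping protection
		ASP: true,
	})
	// Check the error before touching result to avoid a nil dereference.
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(result.Result.Content)
}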
```

```rust
use scrapfly_sdk::{Client, ScrapeConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::builder().api_key("API KEY").build()?;

    let cfg = ScrapeConfig::builder("https://httpbin.dev/html")
        // bypass anti-scraping protection
        .asp(true)
        .build()?;

    let result = client.scrape(&cfg).await?;
    println!("{}", result.result.content);
    Ok(())
}
```

 

 

 [ Python SDK docs → ](https://scrapfly.io/docs/sdk/python) [ TypeScript SDK docs → ](https://scrapfly.io/docs/sdk/typescript) [ Go SDK docs → ](https://scrapfly.io/docs/sdk/golang) [ Rust SDK docs → ](https://scrapfly.io/docs/sdk/rust) 
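Since every SDK and the LlamaIndex actions end up at the same endpoint, the underlying call can be sketched as a plain HTTP request. The snippet below only builds the URL; the endpoint `https://api.scrapfly.io/scrape` and the lowercase `true`/`false` flag encoding are assumptions to confirm in the API docs, and `build_scrape_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

SCRAPE_ENDPOINT = "https://api.scrapfly.io/scrape"  # assumed endpoint


def build_scrape_url(api_key: str, target_url: str, asp: bool = True) -> str:
    """Return the GET URL equivalent to a scrape call with the asp flag set."""
    params = {
        "key": api_key,
        "url": target_url,
        "asp": "true" if asp else "false",  # assumed flag encoding
    }
    return f"{SCRAPE_ENDPOINT}?{urlencode(params)}"
```

This form is handy for quick checks with `curl` or a browser before wiring the same target into a workflow.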

 

 

 

---

## Frequently Asked Questions

 

  ### Is the LlamaIndex integration free?

 Yes. The integration itself is free; you pay only for Scrapfly credits consumed. The free plan includes 1,000 credits with no credit card required, which is enough to evaluate the integration against your exact targets. Failed requests are not charged.

 

  

 

  ---

### Scrapfly plugs into every major automation platform.

Same four APIs, same `asp=True` bypass, same fingerprint coherence. Switch the orchestrator, keep the engine.

 [Zapier](https://scrapfly.io/integration/zapier) 

 [Make](https://scrapfly.io/integration/make) 

 [n8n](https://scrapfly.io/integration/n8n) 

 [LangChain](https://scrapfly.io/integration/langchain) 

 [CrewAI](https://scrapfly.io/integration/crewai) 

 

 

 [Get Free API Key](https://scrapfly.io/register) [Prefer SDK? Python / TS / Go](https://scrapfly.io/docs/sdk/python)