MCP FAQ

Find quick answers to common questions about the Scrapfly MCP Server.

Looking for Scrape API documentation? If you're using Scrapfly's REST API directly (not through MCP), see the Scrape API FAQ instead.

What is MCP and why should I use it?

The Model Context Protocol (MCP) is an open standard by Anthropic that enables AI models to interact with external tools and live data sources. Instead of relying solely on training data, AI can make real-time requests and receive structured responses.

Scrapfly's MCP server gives your AI direct access to web data through our production-grade scraping infrastructure, bypassing anti-bot protections, rendering JavaScript, and managing proxies automatically.

How is MCP different from using Scrapfly's API directly?

MCP provides a standardized, AI-native interface. Key differences:

  • Automated protocol handling – Tool discovery, validation, and error handling built-in
  • AI reasoning – Your AI determines which tool to use without custom integration code
  • Natural language – Simply ask your AI; it handles the technical details
  • Universal compatibility – Works identically across Claude Desktop, Cursor, Cline, and custom clients

Think of it as "AI-native" vs. "AI-compatible."

Which AI models and clients support MCP?

MCP is client-agnostic. Any AI model can use MCP tools if the client supports the protocol:

  • Claude Desktop – Native MCP support from Anthropic
  • Cursor – Full MCP integration
  • Cline (formerly Claude Dev) – VS Code extension with MCP
  • Custom clients – Build your own using the MCP SDK
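
For a custom client, a minimal sketch using the official TypeScript MCP SDK might look like the following. Whether Scrapfly's endpoint accepts direct streamable-HTTP connections (rather than the mcp-remote bridge used in the config examples in this FAQ) is an assumption; replace YOUR_KEY with your key:

```typescript
// Minimal MCP client sketch using the official TypeScript SDK.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

const client = new Client({ name: "scrapfly-demo", version: "1.0.0" });

// API key in the URL, matching the test command in the troubleshooting
// section of this FAQ (assumed to work for direct connections too).
const transport = new StreamableHTTPClientTransport(
  new URL("https://mcp.scrapfly.io/mcp?key=YOUR_KEY")
);
await client.connect(transport);

// Discover the tools the server exposes.
const { tools } = await client.listTools();
console.log(tools.map((t) => t.name));
```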

Is the MCP server open source?

Yes! The Scrapfly MCP server is fully open source on GitHub.

  • Self-host on your infrastructure
  • Contribute features and improvements
  • Fork for custom needs
  • Audit code for security and compliance

How do I get started?

  1. Get an API key – Sign up at scrapfly.io/register (1,000 free credits)
  2. Configure your MCP client – Add the Scrapfly MCP server to your config file (see the example below)
  3. Restart your client – Claude Desktop, Cursor, etc.
  4. Start scraping – Ask your AI to scrape any website!
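
For step 2, a minimal Claude Desktop entry might look like this sketch, which uses the mcp-remote bridge from the troubleshooting section below (replace YOUR_KEY with your actual key):

```json
{
  "mcpServers": {
    "scrapfly": {
      "command": "npx",
      "args": ["mcp-remote", "https://mcp.scrapfly.io/mcp?key=YOUR_KEY"]
    }
  }
}
```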

See the Getting Started Guide for detailed instructions.

Where is the MCP configuration file located?

Claude Desktop:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json

Cursor:

  • Settings → Features → MCP Servers
  • Or edit .cursor/mcp.json in your workspace

My MCP setup isn't working. What should I check?

Common troubleshooting steps:

  1. Verify API key – Check for typos, ensure key is active
  2. Validate JSON syntax – No trailing commas, proper quotes
  3. Restart client – Completely quit and relaunch
  4. Check logs – Look for error messages
  5. Test connection – Run npx mcp-remote https://mcp.scrapfly.io/mcp?key=YOUR_KEY
  6. Verify Node.js – Ensure npx/Node.js is installed

Can I use multiple Scrapfly projects?

Yes! Configure multiple MCP servers with different API keys:
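
For example, in Claude Desktop (a sketch; the server names and placeholder keys are illustrative):

```json
{
  "mcpServers": {
    "scrapfly-project-a": {
      "command": "npx",
      "args": ["mcp-remote", "https://mcp.scrapfly.io/mcp?key=PROJECT_A_KEY"]
    },
    "scrapfly-project-b": {
      "command": "npx",
      "args": ["mcp-remote", "https://mcp.scrapfly.io/mcp?key=PROJECT_B_KEY"]
    }
  }
}
```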

When should I use OAuth2 vs API key authentication?

Recommended method by scenario:

  • Personal laptop, local development – Query parameter (API key in URL)
  • Production deployment – OAuth2
  • Remote MCP server – OAuth2
  • Shared/team environment – OAuth2
  • Custom programmatic client – Authorization header

See the Authentication Guide for details.

How does the OAuth2 flow work?

  1. Your MCP client connects without an API key
  2. Server initiates OAuth2 flow and provides authorization URL
  3. You open the URL in your browser and log in to Scrapfly
  4. You grant permission for MCP access
  5. Server receives token and stores it securely
  6. Future requests use the stored token automatically

Tokens automatically refresh before expiration, and you can revoke access anytime from your dashboard.
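
In practice, triggering the flow can be as simple as configuring the server URL without a key, matching step 1 above (a sketch; mcp-remote handles the browser hand-off):

```json
{
  "mcpServers": {
    "scrapfly": {
      "command": "npx",
      "args": ["mcp-remote", "https://mcp.scrapfly.io/mcp"]
    }
  }
}
```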

How often should I rotate API keys?

With OAuth2: Token rotation is automatic; no manual intervention needed.

With API keys: We recommend rotating keys every 90 days as a security best practice. Generate new keys in your dashboard.

When should I use web_get_page vs web_scrape?

Use web_get_page when:

  • You just need the page content quickly
  • Simple scraping without complex interactions
  • Cost optimization is important (uses fewer credits)

Use web_scrape when:

  • You need browser automation (clicks, form fills, scrolling)
  • Authentication/login flows are required
  • Custom headers, cookies, or POST requests needed
  • Multiple screenshots per request
  • Complex LLM-powered data extraction

What is the pow parameter?

pow stands for "Proof of Work" and is a required parameter for scraping tools. It's obtained by calling the scraping_instruction_enhanced tool first:

  1. AI calls scraping_instruction_enhanced
  2. Tool returns instructions + POW value
  3. AI uses POW value in web_get_page or web_scrape calls

This ensures AI models follow best practices and understand rate limits before making requests.
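
A sketch of that sequence from a programmatic client (using the TypeScript SDK client from the earlier example; how the POW value is read out of the first result is an assumption, as is the url argument name):

```typescript
// Step 1: fetch the scraping instructions plus the POW value.
const instructions = await client.callTool({
  name: "scraping_instruction_enhanced",
  arguments: {},
});

// Step 2: pass the POW value on to a scraping tool. The result shape
// below is an assumption; use whatever the tool actually returns.
const pow = (instructions as { structuredContent?: { pow?: string } })
  .structuredContent?.pow;

const page = await client.callTool({
  name: "web_get_page",
  arguments: { url: "https://example.com", pow },
});
```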

What extraction models are available?

Pre-trained extraction models for common data types:

  • product / product_listing – E-commerce products
  • article – News articles and blog posts
  • review_list – Customer reviews
  • job_posting / job_listing – Job postings
  • real_estate_property / real_estate_property_listing – Property listings
  • hotel / hotel_listing – Hotel information
  • event – Event details
  • organization – Company/org information
  • social_media_post – Social media content
  • And more!
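
For instance, a web_scrape call using a pre-trained model might pass arguments like these (a sketch; the extraction_model parameter name mirrors Scrapfly's Extraction API and is an assumption here):

```json
{
  "url": "https://example.com/products/123",
  "extraction_model": "product"
}
```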

See the Tools documentation for the complete list.

Can I use custom extraction prompts?

Yes! Use the extraction_prompt parameter in web_scrape:
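
For example, a sketch of the tool arguments (the URL and prompt are illustrative):

```json
{
  "url": "https://example.com/products/123",
  "extraction_prompt": "Extract the product name, price, and availability as JSON"
}
```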

Scrapfly's LLM will process the page and extract data according to your custom prompt.

Are there rate limits?

Yes, Scrapfly enforces concurrency limits that vary by subscription plan. Concurrency is the number of scrape requests that can be in flight at the same time.

You can check your current concurrency limits and usage using the info_account MCP tool, or by viewing your plan details on the pricing page.

See the Scrape API FAQ for more details about concurrency and how it works.

What's included in the free tier?

New accounts get 1,000 free API credits to test all features. No credit card required.

A simple page fetch typically costs 1-3 credits. Advanced features (JavaScript rendering, residential proxies, etc.) cost more. See the pricing page for details.

How much does each request cost?

Approximate credit costs by feature:

  • Base scrape request – 1-3 credits
  • JavaScript rendering – +5 credits
  • Anti-scraping protection (ASP) – +10-30 credits (variable)
  • Residential proxies – +25 credits
  • Screenshot – +5 credits each
  • LLM extraction – +10-50 credits (variable)

Actual cost is returned in every response. Check the billing documentation for details.

How can I optimize costs?

  • Use web_get_page instead of web_scrape for simple requests
  • Disable render_js for static pages
  • Start with datacenter proxies – Only escalate to residential if blocked
  • Enable caching – Set cache: true for frequently accessed pages
  • Use extraction models – Pre-trained models are more efficient than custom prompts
  • Check scraping_instruction_enhanced – Get optimal configurations
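
Combined, those settings might look like this in a tool call (a sketch; the datacenter pool name is an assumption, mirroring the residential pool name used later in this FAQ):

```json
{
  "url": "https://example.com",
  "render_js": false,
  "cache": true,
  "proxy_pool": "public_datacenter_pool"
}
```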

How do I track my usage?

  1. MCP tool: Call info_account to get real-time usage stats
  2. Dashboard: View detailed usage in your Scrapfly dashboard
  3. Response headers: Every scraping response includes cost in headers

My scraping request failed. What should I do?

Check the error response for details; every failed request returns an error code and message describing the cause.

See the Scrape API Error Documentation for a complete list of error codes and solutions.

I'm getting blocked by websites. How can I avoid this?

  1. Enable ASP – Anti-scraping protection is on by default in web_get_page
  2. Use residential proxies – Set proxy_pool: "public_residential_pool"
  3. Rotate country – Try different proxy locations
  4. Adjust timing – Add rendering_wait to appear more human-like
  5. Check robots.txt – Respect website scraping policies
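
Put together, an escalation attempt might pass arguments like these (a sketch; asp and country as parameter names are assumptions, and the rendering_wait value assumes milliseconds):

```json
{
  "url": "https://example.com",
  "asp": true,
  "proxy_pool": "public_residential_pool",
  "country": "us",
  "rendering_wait": 3000
}
```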

JavaScript content isn't rendering. What's wrong?

Ensure render_js: true is set. Also check:

  • Add rendering_wait – Give page time to load
  • Use wait_for_selector – Wait for specific elements
  • Check network inspector – Some sites block headless browsers
  • Enable ASP – Helps bypass headless browser detection
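
For example (a sketch; the selector value and the rendering_wait unit of milliseconds are assumptions):

```json
{
  "url": "https://example.com/app",
  "render_js": true,
  "rendering_wait": 2000,
  "wait_for_selector": ".product-list"
}
```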

Requests are slow. How can I speed them up?

  • Disable unnecessary features – Turn off render_js if not needed
  • Use datacenter proxies – Faster than residential
  • Reduce rendering_wait – Don't wait longer than necessary
  • Use webhooks – For long-running requests, get results asynchronously
  • Request multiple pages in parallel – MCP supports concurrent tool calls

Is my scraped data stored by Scrapfly?

Scrape requests and responses are temporarily logged for debugging and monitoring (visible in your dashboard). Log retention periods vary by subscription plan; see the Monitoring documentation for details.

Privacy Protection: Scrapfly automatically filters sensitive data from logs. Passwords, tokens, API keys, emails, and other credentials matching common patterns (e.g., password, token, secret, auth) are automatically redacted and replaced with <SECRET> in stored logs.

You can disable logging entirely by adjusting settings in Project Settings.

How secure are API keys?

API keys are transmitted over HTTPS and stored securely. Best practices:

  • Never commit keys to version control
  • Use OAuth2 for production and shared environments
  • Rotate keys regularly (every 90 days)
  • Use project-specific keys to limit scope
  • Monitor usage for anomalies

Is Scrapfly GDPR compliant?

Yes. Scrapfly is GDPR compliant. We:

  • Process data according to our Privacy Policy
  • Support data deletion requests
  • Provide data export functionality
  • Use EU-based infrastructure for EU customers (available on request)

Can I self-host the MCP server?

Yes! The MCP server is fully open source. Self-hosting steps:

  1. Clone the repository: git clone https://github.com/scrapfly/scrapfly-mcp
  2. Install dependencies: npm install
  3. Configure your Scrapfly API key
  4. Run the server: npm start
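
As a compact command sequence (the SCRAPFLY_API_KEY environment variable name is an assumption; check the README for the actual configuration mechanism):

```bash
git clone https://github.com/scrapfly/scrapfly-mcp
cd scrapfly-mcp
npm install
export SCRAPFLY_API_KEY="your_key"  # variable name is an assumption; see the README
npm start
```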

See the GitHub README for detailed instructions.

Can I add custom tools to the MCP server?

Yes! Since the server is open source, you can:

  • Fork the repository
  • Add your custom tool implementations
  • Register new tools in the MCP server
  • Self-host with your custom tools

Contributions are welcome! Submit a pull request to share tools with the community.

How do webhooks work with MCP?

For long-running requests, use webhooks to get results asynchronously:

  1. Include webhook: "https://your-endpoint.com/callback" in your request
  2. Scrapfly returns immediately with request ID
  3. When scraping completes, Scrapfly POSTs results to your webhook
  4. Your AI can poll or wait for the webhook callback
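
A minimal receiver sketch in Node for step 3 (the payload shape Scrapfly POSTs is an assumption; adjust it to the documented webhook format):

```typescript
// Minimal webhook receiver sketch: accepts the POST callback from step 3.
import { createServer } from "node:http";

createServer((req, res) => {
  if (req.method === "POST" && req.url === "/callback") {
    let body = "";
    req.on("data", (chunk) => (body += chunk));
    req.on("end", () => {
      // Payload shape is an assumption; Scrapfly POSTs the scrape result here.
      const result = JSON.parse(body);
      console.log("Scrape completed:", result);
      res.writeHead(200).end("ok");
    });
  } else {
    res.writeHead(404).end();
  }
}).listen(3000);
```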

Can I use this with automation platforms (n8n, Make, Zapier)?

Not directly. MCP is designed for AI model interactions. For automation platforms, use Scrapfly's dedicated integrations for those tools instead; they provide native nodes/actions optimized for workflow automation.

Still Have Questions?

Get help from our team with any questions not covered in this FAQ.

Contact Support
