MCP Examples & Use Cases

Real-world examples of what you can build with the Scrapfly MCP Server, from simple data extraction to complex multi-step workflows. You don't write code; you just ask your AI in natural language, and it figures out which tools to use and how to chain them together.

Detailed Examples

Let's dive deeper into specific scenarios with full workflows.

Scenario

You want to find all remote Python developer jobs posted today on multiple job boards.

Prompt

"Search for remote Python developer jobs on LinkedIn, Indeed, and AngelList. Filter for positions posted in the last 24 hours. Create a summary table with company name, position, salary range, and application link."

What Happens Behind the Scenes

  1. AI calls scraping_instruction_enhanced to understand best practices
  2. AI uses web_get_page to scrape the LinkedIn jobs page
  3. AI uses web_scrape with extraction_model: "job_listing" for Indeed
  4. AI uses web_get_page for AngelList
  5. AI parses all results and filters by date
  6. AI creates a formatted table with all matching positions
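
To make step 3 concrete, here is a minimal sketch of the arguments such a tool call might carry. Only the tool name and the extraction_model parameter come from this page; the URL and the exact argument shape are illustrative assumptions:

```python
# Hypothetical arguments for the web_scrape call in step 3. Only the tool
# name and the extraction_model parameter appear on this page; the Indeed
# search URL and the exact argument shape are illustrative.
web_scrape_args = {
    "url": "https://www.indeed.com/jobs?q=python+developer&l=remote",  # placeholder
    "extraction_model": "job_listing",  # pre-trained schema named in step 3
}
```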

Scenario

Build an automated price tracking system that monitors products across multiple retailers, tracks historical trends, and identifies the best time to buy. Perfect for deal hunters, price comparison apps, or dynamic pricing strategies.

Prompt

"I want to buy Sony WH-1000XM5 headphones but I'm looking for the best deal. Check current prices on Amazon, Best Buy, Target, and Walmart. For each retailer, get the price, stock status, shipping options, and any current promotions or discounts. Tell me which retailer has the best overall value considering price, shipping, and availability. Also check if there are any bundle deals or extended warranties included."

What Happens Behind the Scenes

  1. AI calls scraping_instruction_enhanced to get optimal scraping parameters
  2. AI uses web_scrape with extraction_model: "product" for each retailer's product page
  3. AI extracts comprehensive data: price, original price, discount %, stock status, shipping cost, delivery time, and warranty info
  4. AI checks for promotional codes, bundle offers, and financing options
  5. AI calculates total cost (price + shipping + taxes) for accurate comparison
  6. AI compares value propositions: free shipping, faster delivery, return policies
  7. AI generates detailed comparison with recommendation based on best overall value
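
Step 5's landed-cost comparison is plain arithmetic. A minimal sketch, using the prices from the example response below and an assumed flat 8.5% sales tax:

```python
# Minimal sketch of step 5: compare landed cost (price + shipping + tax).
# Prices mirror the example response below; the flat 8.5% sales tax rate
# is an assumption for illustration.
TAX_RATE = 0.085

offers = {
    "Amazon":   {"price": 349.99, "shipping": 0.00},
    "Best Buy": {"price": 399.99, "shipping": 0.00},
    "Target":   {"price": 379.99, "shipping": 5.99},
    "Walmart":  {"price": 364.99, "shipping": 0.00},
}

def landed_cost(offer: dict) -> float:
    """Item price plus sales tax plus shipping."""
    return offer["price"] * (1 + TAX_RATE) + offer["shipping"]

# Print retailers from cheapest to most expensive total cost.
for name in sorted(offers, key=lambda n: landed_cost(offers[n])):
    print(f"{name:8s} ${landed_cost(offers[name]):.2f}")
```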

Example Response

Price Comparison for Sony WH-1000XM5 Headphones
| Retailer | Price | Shipping | Total | Stock | Delivery |
|----------|-------|----------|-------|-------|----------|
| Amazon | $349.99 (-13%) | Free | $349.99 | In Stock | 2-day Prime |
| Best Buy | $399.99 | Free | $399.99 | In Stock | Same-day pickup |
| Target | $379.99 | $5.99 | $385.98 | In Stock | 3-5 days |
| Walmart | $364.99 | Free | $364.99 | Low Stock | 4-7 days |

Best Overall Value: Amazon - $349.99

  • Lowest price: $50 off MSRP (13% discount)
  • Fast shipping: Free 2-day Prime delivery
  • Good stock: Available now, no delays
  • Easy returns: 30-day return policy
  • Total savings: $50 vs. Best Buy, $36 vs. Target

Additional Findings:

  • Best Buy: Includes free 2-year Geek Squad protection ($79 value) with purchase
  • Amazon: Currently has 15% off coupon available (clip coupon on product page)
  • Target: Extra 5% off with RedCard ($19 additional savings)
  • Walmart: Offers 24-month financing at 0% APR

Price History Insight:

Based on current pricing, this is the lowest price in the last 90 days. Historical average: $389. Great time to buy!

Pro Tip: Set up automated monitoring by scheduling this prompt to run daily. Use cache: true with cache_ttl: 3600 (1 hour) to track price changes efficiently. Combine with webhooks to get instant notifications when prices drop below your target.
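
As a rough sketch, the Pro Tip's cached daily check might pass arguments like these (cache and cache_ttl are the parameters named above; the product URL and overall argument shape are placeholders):

```python
# Hypothetical web_scrape arguments for the daily check described in the
# Pro Tip. cache and cache_ttl are the parameters it names; the product
# URL and the exact argument shape are placeholders.
daily_price_check = {
    "url": "https://www.amazon.com/dp/PLACEHOLDER",  # placeholder product page
    "extraction_model": "product",
    "cache": True,
    "cache_ttl": 3600,  # reuse cached results for up to 1 hour
}
```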

Scenario

Conduct comprehensive research by gathering, analyzing, and synthesizing information from multiple authoritative sources. Perfect for market research, academic literature reviews, trend analysis, or competitive intelligence.

Prompt

"I'm writing a whitepaper on quantum computing breakthroughs in 2024. Research the latest developments from MIT News, Nature.com, Quanta Magazine, and ArXiv from the past 3 months. For each breakthrough, extract: the discovery/advancement, lead researchers and institutions, publication date, practical applications, and any benchmarks or performance metrics mentioned. Identify common themes, compare different approaches (superconducting vs. photonic vs. topological qubits), and highlight which institutions are leading the field. Create a timeline of major announcements and summarize the most promising developments."

What Happens Behind the Scenes

  1. AI calls scraping_instruction_enhanced for best practices
  2. AI constructs search queries for each source (filtering by topic and date range)
  3. AI uses web_scrape with extraction_model: "article" for article listing pages
  4. AI visits individual articles and extracts: title, authors, publication date, abstract, key findings, methodologies, and citations
  5. AI filters articles by publication date (last 90 days) and relevance score
  6. AI identifies recurring themes, breakthrough categories, and research trends
  7. AI maps researchers to institutions and tracks collaboration networks
  8. AI extracts quantitative metrics: qubit counts, error rates, coherence times, gate fidelities
  9. AI synthesizes findings into structured report with timeline, thematic analysis, and institutional rankings
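
Step 5's 90-day cutoff is simple to express once publication dates are parsed. A minimal sketch in Python; the article record shape is an assumed example, not a documented schema:

```python
from datetime import datetime, timedelta

# Minimal sketch of step 5: keep only articles from the last 90 days.
# The record shape ("title", "published") is an assumed example of what
# the extraction step returns, not a documented schema.
articles = [
    {"title": "Example breakthrough A", "published": "2024-03-15"},
    {"title": "Example breakthrough B", "published": "2023-11-02"},
]

cutoff = datetime.now() - timedelta(days=90)
recent = [
    a for a in articles
    if datetime.strptime(a["published"], "%Y-%m-%d") >= cutoff
]
```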

Example Response

Quantum Computing Research Summary (Q1 2024)

Research Timeline:

March 15, 2024 - Major Breakthrough

IBM Achieves 1,000+ Qubit Processor

IBM Research announced "Condor," a 1,121-qubit superconducting processor with a 20% improvement in error rates. Lead: Dr. Jay Gambetta (IBM Quantum). Source: Nature Physics

March 8, 2024

Google Demonstrates Error Correction Milestone

Achieved logical error rate below physical error rate using surface codes on 72 qubits. Lead: Dr. Hartmut Neven (Google Quantum AI). Source: Nature

February 22, 2024

QuEra Demonstrates Neutral-Atom Advantage

280-qubit neutral-atom system solves optimization problems 100x faster than classical computers. Lead: Dr. Mikhail Lukin (Harvard). Source: Science

January 30, 2024

PsiQuantum Secures $620M for Photonic Approach

Funding round to build fault-tolerant photonic quantum computer by 2027. Room-temperature operation advantage. Source: MIT News

Key Technology Comparisons:

| Approach | Leading Institutions | Qubit Count | Key Advantage | Challenge |
|----------|----------------------|-------------|---------------|-----------|
| Superconducting | IBM, Google, Rigetti | 1,000+ | Scalability & speed | Requires cryogenic cooling |
| Neutral Atom | Harvard, QuEra, Pasqal | 280+ | Long coherence times | Gate fidelity improvements needed |
| Photonic | PsiQuantum, Xanadu | Theoretical | Room-temperature operation | Manufacturing complexity |
| Trapped Ion | IonQ, Honeywell | 32+ | Highest gate fidelity (99.9%) | Slower operations |

Leading Institutions by Impact:

  1. IBM Research - 12 publications, focus on superconducting scalability
  2. Google Quantum AI - 8 publications, error correction breakthroughs
  3. Harvard University - 7 publications, neutral-atom innovations
  4. MIT - 6 publications, algorithm development and materials science
  5. Stanford - 5 publications, quantum networking protocols

Emerging Themes:

  • Error Correction Focus: 65% of papers emphasize quantum error correction as critical path to fault tolerance
  • Hybrid Approaches: Growing interest in combining classical and quantum processors for practical advantage
  • Application-Driven: Shift from "quantum supremacy" to solving real problems (drug discovery, optimization, cryptography)
  • Collaborations: 40% of breakthroughs involved multi-institutional partnerships

Most Promising Development:

Google's error correction milestone represents a turning point - demonstrating that logical qubits can be more reliable than physical qubits. This validates the path to fault-tolerant quantum computing within the next 5-10 years. Combined with IBM's 1,000+ qubit processors, we're entering the era of "utility-scale" quantum computing for practical applications.

Pro Tip: For ongoing research monitoring, use format: "markdown" to get clean, AI-friendly content. Chain this with sentiment analysis or citation tracking by asking the AI to extract reference networks. You can also use screenshots to capture figures, charts, and diagrams from papers for visual analysis.

Scenario

Extract complex, structured data from dynamic websites using AI-powered parsing. Perfect for building datasets, enriching CRM data, or scraping sites with inconsistent layouts that would be difficult to parse with traditional selectors.

Prompt

"I need a comprehensive dataset of top Italian restaurants in San Francisco for a food delivery partnership. Go to Yelp and extract the top 20 Italian restaurants with ratings above 4 stars. For each restaurant, get: name, exact rating (out of 5), number of reviews, price range ($-$$$$), cuisine tags, full address with zip code, phone number, business hours, popular dishes mentioned in reviews, delivery/takeout availability, health score if visible, and the restaurant's website URL. Also note if they have outdoor seating or take reservations. Export as a structured JSON that I can import into our database."

What Happens Behind the Scenes

  1. AI calls scraping_instruction_enhanced to get optimal parameters
  2. AI uses web_scrape with render_js: true (Yelp uses dynamic content)
  3. AI provides detailed extraction_prompt specifying exact fields and data types
  4. Scrapfly's LLM analyzes the page structure and intelligently extracts data across varying HTML layouts
  5. AI handles edge cases: missing phone numbers, varied address formats, inconsistent pricing
  6. AI may visit individual restaurant pages for additional details (hours, menu, reviews)
  7. AI filters results for ratings ≥ 4.0 stars and sorts by relevance
  8. AI validates data quality: checks phone format, ensures addresses are complete, normalizes price ranges
  9. AI returns structured JSON with consistent schema, ready for database import

Extraction Tool Usage
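
A minimal sketch of how steps 2-3 might combine render_js with a detailed extraction_prompt. The parameter names follow this page's other examples; the search URL and exact argument shape are illustrative assumptions:

```python
# Hypothetical web_scrape arguments combining render_js (step 2) with a
# detailed extraction_prompt (step 3). Parameter names follow this page's
# examples; the search URL and argument shape are illustrative.
restaurant_extraction = {
    "url": "https://www.yelp.com/search?find_desc=Italian&find_loc=San+Francisco",  # placeholder
    "render_js": True,  # Yelp renders listings client-side
    "extraction_prompt": (
        "Extract the top 20 Italian restaurants rated above 4 stars. "
        "For each, return: name, rating (float out of 5), review_count "
        "(int), price_range ($-$$$$), address with zip code, phone, "
        "business_hours, popular_dishes, delivery_available (bool), "
        "outdoor_seating (bool), website_url. "
        "Return a JSON array with one object per restaurant."
    ),
}
```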

Example Response (Excerpt)

Top Italian Restaurants - San Francisco

Data Quality Summary:

  • ✅ 20 restaurants extracted
  • ✅ All ratings ≥ 4.0 stars (filtered as requested)
  • ✅ 100% have valid addresses with zip codes
  • ✅ 95% have phone numbers (one restaurant's phone number isn't listed on Yelp)
  • ✅ 85% have business hours (3 restaurants don't list their hours)
  • ✅ All data validated and normalized for database import

When to Use Extraction Prompt vs. Extraction Model

  • Use extraction_model: For standard schemas (products, articles, jobs) - faster and more cost-effective
  • Use extraction_prompt: For custom fields, complex nested data, or when you need specific filtering/validation logic
  • Combine both: Start with extraction_model for base data, then use extraction_prompt to enrich with custom fields

Pro Tip: LLM-powered extraction excels at handling inconsistent HTML structures, but it's more expensive than CSS selectors. Use it for complex, one-time scraping tasks or when sites frequently change their markup. For production scraping at scale, consider using extraction_model with pre-trained schemas or traditional selector-based extraction for consistent sites.

Scenario

Track competitor features, pricing changes, and marketing strategies across multiple SaaS platforms to inform your product roadmap.

Prompt

"Analyze my top 3 SaaS competitors: Ahrefs, SEMrush, and Moz. For each one, extract their pricing tiers, features included in each tier, and any special promotions on their pricing pages. Also check their blogs for recent product announcements from the last month. Create a comparison table showing feature gaps and pricing strategies."

What Happens Behind the Scenes

  1. AI calls scraping_instruction_enhanced to get optimal scraping parameters
  2. AI uses web_scrape with render_js: true to load each competitor's pricing page (dynamic content)
  3. AI uses extraction_prompt to extract: pricing tiers, features per tier, discounts, and trial terms
  4. AI visits each blog/news page and uses extraction_model: "article" to get recent announcements
  5. AI filters articles by publication date (last 30 days)
  6. AI analyzes and compares pricing structures, feature availability, and positioning
  7. AI generates a competitive analysis report with feature gaps, pricing insights, and strategic recommendations
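
Step 6's gap analysis boils down to set differences once each tier's features are extracted. A minimal sketch with illustrative placeholder feature sets:

```python
# Minimal sketch of step 6: derive feature gaps via set differences.
# All feature sets below are illustrative placeholders, not scraped data.
our_features = {"rank_tracking", "site_audit", "backlink_index"}
competitor_features = {
    "Ahrefs": {"rank_tracking", "site_audit", "backlink_index", "realtime_rank_tracking"},
    "SEMrush": {"rank_tracking", "site_audit", "content_optimization_ai", "realtime_rank_tracking"},
    "Moz": {"rank_tracking", "site_audit"},
}

for name, features in competitor_features.items():
    gaps = features - our_features   # features they have that we lack
    edge = our_features - features   # features we have that they lack
    print(f"{name}: gaps={sorted(gaps)} advantages={sorted(edge)}")
```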

Example Response

Competitive Analysis Summary

Pricing Comparison:

| Competitor | Entry Tier | Mid Tier | Enterprise |
|------------|------------|----------|------------|
| Ahrefs | $99/mo | $179/mo | $399/mo |
| SEMrush | $119/mo | $229/mo | $449/mo |
| Moz | $79/mo | $149/mo | $249/mo |

Key Feature Gaps:

  • Missing: Real-time rank tracking (Ahrefs & SEMrush have it)
  • Missing: Content optimization AI (SEMrush exclusive)
  • Advantage: More generous API rate limits vs. Moz

Recent Product Updates (Last 30 Days):

  • Ahrefs: Launched AI-powered content brief generator (March 15)
  • SEMrush: Added YouTube keyword research tool (March 20)
  • Moz: Updated domain authority algorithm (March 10)

Pro Tip: Schedule this prompt to run weekly to track competitor pricing changes and new feature releases automatically. Use cache: true with appropriate cache_ttl to reduce costs for frequently monitored pages.

Scenario

Research the real estate market in a specific area by aggregating property listings, analyzing price trends, and comparing neighborhoods.

Prompt

"I'm looking to invest in rental properties in Austin, Texas. Search Zillow, Redfin, and Realtor.com for 3-bedroom properties under $500k in the following zip codes: 78704, 78702, and 78751. For each listing, extract: address, price, square footage, bedrooms/bathrooms, estimated rental income, year built, and listing URL. Create a summary showing which neighborhood has the best rental yield and price per square foot."

What Happens Behind the Scenes

  1. AI calls scraping_instruction_enhanced to get best practices for real estate scraping
  2. AI constructs search URLs for each platform with filters (location, price, bedrooms)
  3. AI uses web_scrape with extraction_model: "real_estate_property_listing" for listing pages
  4. AI visits individual property detail pages using extraction_model: "real_estate_property" to get full details
  5. AI extracts key data: price, size, features, rental estimates, HOA fees, property tax
  6. AI calculates rental yield: (annual rental income / property price) × 100
  7. AI calculates price per square foot for each property
  8. AI groups by neighborhood (zip code) and generates comparative analysis with investment recommendations
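
Steps 6 and 7 are straightforward arithmetic. A minimal sketch using the first listing from the example response below:

```python
# Minimal sketch of steps 6-7, using the first listing from the example
# response below.
price = 399_000        # listing price ($)
monthly_rent = 2_800   # estimated monthly rent ($)
sqft = 1_350           # living area

rental_yield = monthly_rent * 12 / price * 100  # 33,600 / 399,000 -> 8.4%
price_per_sqft = price / sqft                   # -> $295.56 (listed as $295)

print(f"Yield: {rental_yield:.1f}% | ${price_per_sqft:.2f}/sqft")
```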

Example Response

Austin Real Estate Investment Analysis

Neighborhood Comparison:

| Zip Code | Avg Price | Avg $/sqft | Est. Rental Yield | Properties Found |
|----------|-----------|------------|-------------------|------------------|
| 78702 (East Austin) | $425,000 | $298 | 6.8% | 12 |
| 78704 (South Austin) | $485,000 | $342 | 5.2% | 8 |
| 78751 (Hyde Park) | $495,000 | $365 | 4.9% | 5 |

Top Investment Opportunities:

1. 1402 E 6th St, Austin TX 78702 - Best Value

  • Price: $399,000 | Size: 1,350 sqft | $/sqft: $295
  • Est. Monthly Rent: $2,800 | Rental Yield: 8.4%
  • Year Built: 2018 | HOA: None
  • View on Zillow →

2. 3312 Govalle Ave, Austin TX 78702

  • Price: $415,000 | Size: 1,420 sqft | $/sqft: $292
  • Est. Monthly Rent: $2,650 | Rental Yield: 7.7%
  • Year Built: 2020 | HOA: $50/mo
  • View on Redfin →

Investment Recommendation:

Best Area: 78702 (East Austin)

  • Highest rental yield: 6.8% average (vs. 5.2% in 78704)
  • Best value: $298/sqft (18% cheaper than 78751's $365/sqft)
  • Strong rental demand: Near downtown, UT campus, and tech offices
  • Market trend: Appreciating 8.2% YoY based on recent sales data

Pro Tip: Use the screenshots parameter to capture property photos for visual comparison. You can also chain this with review scraping using extraction_model: "review_list" to research neighborhood safety and amenities on platforms like Nextdoor or Google Maps.

Advanced Workflows

Complex scenarios that chain multiple tools and steps.

Comprehensive product analysis across multiple e-commerce sites with sentiment analysis.

  1. Gather data - Scrape product listings from 5 e-commerce sites
  2. Extract features - Use extraction_model: "product_listing"
  3. Analyze pricing - Identify pricing patterns and outliers
  4. Check reviews - Scrape reviews using extraction_model: "review_list"
  5. Sentiment analysis - Analyze review sentiment
  6. Generate report - Create comprehensive market analysis
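
Step 5 can be anything from a dedicated model to the AI scoring reviews directly. A deliberately toy keyword-based sketch, purely illustrative and not a substitute for real sentiment analysis:

```python
import string

# Toy sketch of step 5: a keyword-based sentiment score per review.
# Real workflows would use the AI itself or a proper sentiment model;
# the keyword lists and reviews here are illustrative only.
POSITIVE = {"great", "excellent", "love", "perfect"}
NEGATIVE = {"broken", "poor", "terrible", "refund"}

def toy_sentiment(review: str) -> int:
    """Positive-minus-negative keyword count for one review."""
    words = {w.strip(string.punctuation) for w in review.lower().split()}
    return len(words & POSITIVE) - len(words & NEGATIVE)

reviews = ["Great sound, love the fit", "Arrived broken, want a refund"]
average = sum(toy_sentiment(r) for r in reviews) / len(reviews)  # -> 0.0
```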

Build and enrich contact lists automatically with data validation and scoring.

  1. Search directories - Find companies matching criteria
  2. Extract contact info - Get company details, founders, emails
  3. Enrich data - Look up founders on LinkedIn
  4. Validate - Check company websites for relevance
  5. Score leads - Rank by fit and priority
  6. Export - Format as CSV for CRM import

Track website changes over time with automated alerts and version archiving.

  1. Initial capture - Take screenshot and save HTML
  2. Schedule checks - Scrape page periodically
  3. Compare - Detect changes in content or layout
  4. Alert - Notify when changes detected
  5. Archive - Store historical versions
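
Step 3's comparison can be as simple as hashing each capture and comparing fingerprints between runs. A minimal sketch; retrieving the page itself (e.g. via web_get_page) is assumed to happen elsewhere:

```python
import hashlib

# Minimal sketch of steps 3-4: fingerprint each capture and compare it
# against the stored fingerprint from the previous run. Retrieving the
# page (e.g. via web_get_page) is assumed to happen elsewhere.

def content_hash(html: str) -> str:
    """Stable fingerprint of the captured page content."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()

def has_changed(previous_hash: str, latest_html: str) -> bool:
    """True when the page content differs from the archived version."""
    return content_hash(latest_html) != previous_hash
```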

Industry-Specific Use Cases

Real-world applications across different industries.

E-commerce & Retail

  • Price monitoring and competitive intelligence
  • Product availability tracking
  • Review aggregation and sentiment analysis
  • Trend identification and market research

Finance & Investment

  • Financial news aggregation
  • Stock data collection from multiple sources
  • Economic indicator tracking
  • Real estate listing analysis

Recruitment & HR

  • Job posting aggregation
  • Candidate research (LinkedIn, GitHub, portfolios)
  • Salary benchmarking
  • Company culture research

Media & Research

  • News monitoring and aggregation
  • Academic paper tracking
  • Social media sentiment analysis
  • Event and conference tracking

Real Estate

  • Property listing aggregation
  • Price trend analysis
  • Neighborhood research
  • Rental market analysis

Travel & Hospitality

  • Hotel price comparison
  • Flight deal monitoring
  • Review aggregation for destinations
  • Event and attraction research

Tips & Best Practices

Optimize your scraping workflows with these proven strategies.

Best Practices

  • Call scraping_instruction_enhanced first - Get the latest POW parameter
  • Be specific in prompts - "Get product prices" beats "check website"
  • Use extraction models - Pre-trained models are faster
  • Handle errors gracefully - Retry with different parameters

Cost Optimization

  • Use web_get_page for simple pages
  • Disable render_js for static content
  • Use datacenter proxies by default
  • Cache frequently accessed pages

Performance

  • Request multiple pages in parallel (see the sketch below)
  • Use format: "markdown" for AI-friendly content
  • Set appropriate rendering_wait
  • Use format_options: ["only_content"]
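
The parallel-request tip can be sketched with a thread pool. fetch_page() below is a hypothetical stand-in for a single-page tool call, not part of any documented API:

```python
from concurrent.futures import ThreadPoolExecutor

# Minimal sketch of the parallel-request tip. fetch_page() is a
# hypothetical stand-in for one single-page tool call (e.g. web_get_page);
# it is not part of any documented Scrapfly API.

def fetch_page(url: str) -> str:
    return f"<html>placeholder content for {url}</html>"  # stand-in result

urls = [
    "https://example.com/page-1",
    "https://example.com/page-2",
    "https://example.com/page-3",
]

with ThreadPoolExecutor(max_workers=5) as pool:
    pages = list(pool.map(fetch_page, urls))  # pages fetched concurrently
```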

Ready to Build?

Start with a simple prompt like "Get me the top posts from Hacker News" and watch your AI use Scrapfly MCP tools to make it happen!
