n8n


Powerful workflow automation platform. Integrate Scrapfly web scraping into your n8n workflows for automated data collection, monitoring, and processing pipelines.


Prerequisites

Before getting started, make sure you have the following:

  • A Scrapfly account and API key (sign up for free)
  • A running n8n instance (n8n Cloud or self-hosted)

Setup Instructions

Integrate Scrapfly into your n8n workflows using HTTP Request nodes. This enables powerful automated web scraping pipelines.

  1. Create New n8n Workflow

    Start a new workflow in n8n:

    1. Log in to your n8n instance
    2. Click "New Workflow"
    3. Add a trigger node (Schedule, Webhook, Manual, etc.)
    Tip: Common Trigger Nodes
    • Schedule Trigger: Run scraping on a schedule (hourly, daily, etc.)
    • Webhook: Trigger scraping via HTTP request
    • Manual Trigger: Run scraping on demand
    • Email Trigger: Scrape URLs from incoming emails
  2. Add HTTP Request Node for Scrapfly

    Configure an HTTP Request node to call the Scrapfly API:

    1. Click the "+" button to add a new node
    2. Search for and select "HTTP Request"
    3. Configure the node with these settings:

    HTTP Request Configuration:

    • Method: GET
    • URL: https://api.scrapfly.io/scrape
    • Query Parameters:
      • key: __API_KEY__
      • url: {{ $json.url }} (or a hardcoded URL)
      • format: markdown
    • Response Format: JSON
    Sign up for free to get your API key.
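    For reference, the node's request is equivalent to this JavaScript fetch sketch (the endpoint and parameters are the ones listed above; example.com stands in for your target URL):

```javascript
// Sketch of the request the HTTP Request node sends.
// __API_KEY__ is the placeholder from the configuration above.
const params = new URLSearchParams({
  key: '__API_KEY__',
  url: 'https://example.com', // the page to scrape
  format: 'markdown',
});

const response = await fetch(`https://api.scrapfly.io/scrape?${params}`);
const data = await response.json();
console.log(data.result.content); // the scraped page as markdown
```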
    Tip: Using Credentials

    Instead of hardcoding the API key, create an n8n credential:

    • Go to Credentials → Create New
    • Choose "Header Auth"
    • Add parameter: X-API-Key with your Scrapfly API key
    • Select this credential in the HTTP Request node
  3. Process Scrapfly Response

    Extract and process the scraped content using n8n's data transformation nodes:

    Extract Content with Set Node:

    1. Add a "Set" node after the HTTP Request
    2. Configure it to extract the content:
    • Field Name: content
    • Value: {{ $json.result.content }}

    Alternative: Use Code Node for Complex Processing:

    Pro Tip: Use the Code node to clean, parse, or transform scraped data before saving it!
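    For example, here is a minimal Code node sketch ("Run Once for All Items" mode) that trims the scraped markdown and stamps each item; the config.url field is an assumption about where Scrapfly echoes the target URL, so inspect your node's output to confirm:

```javascript
// Code node sketch: clean the Scrapfly response before saving it.
const output = [];

for (const item of $input.all()) {
  const content = item.json.result?.content ?? '';

  output.push({
    json: {
      url: item.json.config?.url,          // assumed: Scrapfly echoes the target URL here
      content: content.trim(),             // strip stray whitespace from the markdown
      scrapedAt: new Date().toISOString(), // timestamp, matching the columns mapped below
    },
  });
}

return output;
```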
  4. Add Output Action

    Send the scraped data to your desired destination:

    Popular Output Nodes:

    • Google Sheets: Save data to a spreadsheet
    • Airtable: Store in a database
    • Slack/Discord: Send alerts or notifications
    • Email: Email reports
    • Webhook: Send to your custom API
    • File: Save as JSON, CSV, or other formats
    Example: Save to Google Sheets

    Add a Google Sheets node with these settings:

    • Operation: Append Row
    • Spreadsheet: Select your sheet
    • Columns: Map url, content, scrapedAt
  5. Test and Activate Workflow

    Test your workflow and activate it for production use:

    1. Click "Execute Workflow" to test manually
    2. Review the output from each node
    3. Once working correctly, toggle the workflow to "Active"
    4. Monitor executions in the "Executions" tab
    Tip: Error Handling

    Add error handling to your workflow:

    • Enable "Continue On Fail" on HTTP Request node
    • Add an IF node to check for errors: {{ $json.error }}
    • Route errors to a notification or logging node
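    As a sketch, a Code node placed after the HTTP Request node (with "Continue On Fail" enabled) can split failures out for the notification branch; n8n attaches an error field to failed items, but confirm the exact shape in your execution data:

```javascript
// Code node sketch: collect failed items for a notification node.
// With "Continue On Fail" enabled, failed items carry an `error` field.
const failed = $input.all().filter((item) => item.json.error);

return failed.map((item) => ({
  json: {
    error: item.json.error, // n8n's error details for the failed request
    failedAt: new Date().toISOString(),
  },
}));
```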

Example Workflow Templates

Daily News Aggregator
Schedule daily at 9am: scrape top stories from multiple news sites and email a digest
Price Monitor
Monitor competitor pricing hourly and send a Slack alert on price changes
Competitive Intelligence
Scrape competitor sites on a webhook trigger, parse the data, and save it to Airtable
Content Aggregation
On a schedule: scrape blog RSS feeds, extract articles, summarize with AI, post to CMS

Troubleshooting

Problem: HTTP Request to Scrapfly returns errors

Solution:

  • Verify API key is correct in query parameters
  • Check the url parameter includes the scheme (http:// or https://)
  • Review error message in node execution data
  • Test API call in browser or Postman first

Problem: Cannot access scraped content from response

Solution:

  • Use correct JSON path: {{ $json.result.content }}
  • Check "Response Format" is set to "JSON"
  • Inspect node output data to see response structure
  • Add a Set node to extract specific fields
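If the path is unclear, a quick Code node can surface the actual structure; this sketch fails loudly when the documented result.content path is missing:

```javascript
// Code node sketch: fail loudly if the expected content path is missing.
const resp = $input.first().json;

if (!resp.result || typeof resp.result.content !== 'string') {
  // Surface the top-level keys so the real structure shows up in execution data.
  throw new Error(`Unexpected response shape, top-level keys: ${Object.keys(resp).join(', ')}`);
}

return [{ json: { content: resp.result.content } }];
```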

Problem: Workflow times out during scraping

Solution:

  • Increase timeout in HTTP Request node settings
  • Disable JavaScript rendering if not needed (faster scraping)
  • Split large jobs into smaller batches
  • Check n8n instance timeout settings

Problem: Hitting Scrapfly API rate limits

Solution:

  • Add "Wait" nodes between scraping operations
  • Use Scrapfly cache parameter for repeat requests
  • Reduce scraping frequency in schedule trigger
  • Upgrade Scrapfly plan for higher limits
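As a sketch, the cache option mentioned above is just another query parameter alongside key, url, and format; the exact flag value here is an assumption, so confirm it in Scrapfly's documentation:

```javascript
// Sketch: enable Scrapfly's cache so repeat requests for the same URL
// are served from cache instead of consuming fresh scrapes.
const params = new URLSearchParams({
  key: '__API_KEY__',
  url: 'https://example.com',
  format: 'markdown',
  cache: 'true', // assumed flag value; confirm in Scrapfly's documentation
});

const data = await (await fetch(`https://api.scrapfly.io/scrape?${params}`)).json();
```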

Problem: n8n credentials not applying API key correctly

Solution:

  • Use query parameter method instead of header auth
  • Verify credential is selected in HTTP Request node
  • Check credential type matches authentication method
  • Test with hardcoded API key first to isolate issue

Problem: Cannot activate workflow or schedule trigger does not fire

Solution:

  • Ensure all required fields are filled in trigger node
  • Check n8n instance has active executions enabled
  • Review execution logs for activation errors
  • Verify schedule trigger timezone settings

Alternative Automation Platforms

While n8n offers powerful open-source automation, you might also want to explore these no-code automation alternatives for Scrapfly integration:

Make

Visual automation platform with advanced workflow capabilities. Best for complex multi-step scenarios without self-hosting.

  • Fully-managed cloud platform
  • Visual drag-and-drop scenario builder
  • Enterprise support options
Zapier

Simpler trigger-action automation connecting 5000+ apps. Best for straightforward workflows with minimal setup.

  • Largest app ecosystem (5000+ integrations)
  • Easier learning curve for beginners
  • Instant setup, no hosting needed
