Scrapfly CrewAI Integration

Scrapfly is available on CrewAI, a popular framework for developing applications powered by large language models (LLMs).

For CrewAI, Scrapfly is available as a CrewAI Tool object that uses the Scrapfly Web Scrape API to retrieve web page data for use within the CrewAI ecosystem.

Usage

To start, get your Scrapfly API key from your dashboard. Then install the Scrapfly Python SDK, CrewAI, and CrewAI Tools:

Then, the ScrapflyScrapeWebsiteTool is available for scraping any web page:
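A minimal sketch of calling the tool directly is shown here. It assumes the `ScrapflyScrapeWebsiteTool(api_key=...)` constructor and the `url` and `scrape_format` arguments of `run()` as found in recent `crewai-tools` releases, so check them against your installed version. The `SCRAPFLY_API_KEY` environment variable, the `tool_arguments` helper, and the target URL are hypothetical names chosen for illustration.

```python
import os


def tool_arguments(url: str, scrape_format: str = "markdown") -> dict:
    """Build the keyword arguments handed to the tool's run() method."""
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"expected an absolute http(s) URL, got {url!r}")
    return {"url": url, "scrape_format": scrape_format}


def scrape(url: str) -> str:
    # Imported lazily so this module still loads when crewai-tools is absent.
    from crewai_tools import ScrapflyScrapeWebsiteTool

    # SCRAPFLY_API_KEY is an assumed variable name, not one the tool reads itself.
    tool = ScrapflyScrapeWebsiteTool(api_key=os.environ["SCRAPFLY_API_KEY"])
    return tool.run(**tool_arguments(url))


if __name__ == "__main__":
    # Example target page; replace with the page you want to scrape.
    print(scrape("https://web-scraping.dev/products"))
```

Keeping the argument building separate from the network call makes it easy to check the inputs without spending API credits.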

For more advanced use, the integration supports all Scrapfly Web Scrape API options, matching the Python SDK signature. These options can also be provided as a natural language parameter:
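As a rough illustration of passing options programmatically, the sketch below collects a few Scrapfly settings into a dictionary and forwards them to the tool. It assumes `run()` accepts a `scrape_config` dictionary whose keys mirror the Python SDK's scrape options (`asp`, `render_js`, `country`, and so on). `build_scrape_config` and `scrape_with_options` are hypothetical helpers, not part of either library.

```python
from typing import Any, Optional


def build_scrape_config(
    asp: Optional[bool] = None,
    render_js: Optional[bool] = None,
    country: Optional[str] = None,
    **extra: Any,
) -> dict:
    """Collect Scrapfly options into a dict, skipping any left unset (None)."""
    options = {"asp": asp, "render_js": render_js, "country": country, **extra}
    return {name: value for name, value in options.items() if value is not None}


def scrape_with_options(tool: Any, url: str, scrape_config: dict) -> Any:
    # Assumption: the tool forwards scrape_config to the Scrapfly SDK unchanged.
    return tool.run(url=url, scrape_format="markdown", scrape_config=scrape_config)
```

For example, `build_scrape_config(asp=True, render_js=True, country="us")` yields a dictionary with only those three keys, which can then be passed to `scrape_with_options` together with an initialized tool.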

Example Use

CrewAI is a framework that works by defining Tasks, Agents, and Crews. A Crew is a group of Agents working together toward a common goal. Each Agent is assigned one or more Tasks, and the Crew orchestrates their collaboration to perform the given tasks. Let's define the required objects. We'll define a scrape task, an agent to perform it, and finally group them in a crew for execution:
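One possible shape for such a workflow is sketched here. It uses the `Agent`, `Task`, and `Crew` classes from `crewai` and assumes the `ScrapflyScrapeWebsiteTool(api_key=...)` constructor from `crewai-tools`. The role, goal, backstory, target URL, `SCRAPFLY_API_KEY` variable, and the `task_description` helper are illustrative assumptions rather than required values.

```python
import os


def task_description(url: str, scrape_options: dict) -> str:
    """Describe the job in natural language, including the scrape options."""
    options = ", ".join(f"{name}={value}" for name, value in scrape_options.items())
    return (
        f"Scrape {url} and summarize the products listed on the page. "
        f"Use these Scrapfly scrape config options: {options}."
    )


def build_crew(url: str):
    # Imported lazily so the helper above works without crewai installed.
    from crewai import Agent, Crew, Task
    from crewai_tools import ScrapflyScrapeWebsiteTool

    # SCRAPFLY_API_KEY is an assumed environment variable name.
    scrape_tool = ScrapflyScrapeWebsiteTool(api_key=os.environ["SCRAPFLY_API_KEY"])

    analyst = Agent(
        role="Research Analyst",
        goal="Extract and summarize product information from web pages",
        backstory="An analyst experienced in reading product listings.",
        tools=[scrape_tool],
        llm="gpt-4",  # needs credentials for the LLM provider, e.g. OPENAI_API_KEY
    )
    summarize = Task(
        description=task_description(url, {"asp": True, "render_js": True}),
        expected_output="A short summary of the products found on the page.",
        agent=analyst,
    )
    return Crew(agents=[analyst], tasks=[summarize])


if __name__ == "__main__":
    # kickoff() runs the task and returns the agent's final answer.
    print(build_crew("https://web-scraping.dev/products").kickoff())
```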

In the above example, we define a complete CrewAI workflow by combining an agent, a task, and a tool. Let's break down the execution flow:

  • The ScrapflyScrapeWebsiteTool is initialized with an API key to handle web scraping.
  • An Agent is created with the role of a research analyst, powered by both an LLM (gpt-4) and the scraping tool defined earlier.
  • A Task is defined to extract and summarize product information from a target web page with the required scrape configurations from Scrapfly.
  • Finally, a Crew is defined by grouping the agent and the task together. When crew.kickoff() is called, the agent executes the task using its tools and LLM, and the final summarized result is returned.

Errors

CrewAI displays Scrapfly API errors in the standard Scrapfly API error message format. For more, see the Scrapfly Web Scrape API error documentation.
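When the tool is called directly from Python rather than through an agent, it can help to capture the error message instead of letting the script stop. The sketch below assumes a failed scrape surfaces as a Python exception whose text carries the Scrapfly error message; `run_scrape_safely` is a hypothetical helper, and it catches a broad `Exception` because no specific exception classes are assumed here.

```python
from typing import Callable, Optional, Tuple


def run_scrape_safely(
    run: Callable[..., str], url: str
) -> Tuple[Optional[str], Optional[str]]:
    """Return (content, None) on success or (None, error message) on failure."""
    try:
        return run(url=url), None
    except Exception as exc:  # broad on purpose: exact error classes are not assumed
        return None, str(exc)
```

With an initialized tool, this could be used as `content, error = run_scrape_safely(tool.run, "https://web-scraping.dev/products")`, then logging `error` when it is not `None`.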

Pricing

No additional costs.

Summary