
How to Create an AI Browser Agent for Free

by Hisham Mar 24, 2026 8 min read

AI agents that control web browsers through natural language are here, and you can build one today without spending a dollar. Browser-Use and Stagehand are the two most popular open-source frameworks for AI browser automation, with roughly 100k GitHub stars between them. Both run locally, work with Google's free Gemini API tier, and give you full access to the source code. Starting free means no vendor lock-in, and you learn how these tools actually work before committing to a paid service.

In this guide, we'll compare both frameworks head-to-head, then walk through a full quickstart for each with working code examples. Let's get started!

Stagehand vs Browser-Use: Pick Your Path

Before writing any code, pick the framework that fits your stack and your workflow. Both build on Playwright, so you get the same browser engine underneath. The difference is how they use AI.

| Feature | Stagehand | Browser-Use |
| --- | --- | --- |
| Language | TypeScript, Python, Go, Ruby | Python |
| Built on | Playwright | Playwright |
| Approach | Hybrid (AI + code) | Pure agent |
| Core API | act(), extract(), observe() | Agent.run() |
| Free LLM | Gemini | Gemini, Ollama |
| GitHub Stars | ~21k | ~78k |
| Best for | Precise control | Full autonomy |

When to Choose Browser-Use

You're a Python developer who wants full agent autonomy. You give the agent a task in plain English, and it figures out the clicks, scrolling, and extraction on its own. Browser-Use ships with built-in model wrappers for OpenAI, Anthropic, and Google Gemini, so setup is a single import.

More importantly, Browser-Use supports Ollama for fully local inference. Gemini's free tier costs nothing, but it is rate-limited and sends your prompts to Google. Ollama needs no API key, has no rate limits, and keeps all data on your machine. If privacy or offline use matters, Browser-Use with Ollama is the way to go.

When to Choose Stagehand

You want a hybrid approach that mixes AI with traditional code. Stagehand gives you three AI-powered primitives: act() for clicking and typing, extract() for pulling structured data, and observe() for understanding page state. You combine these with standard Playwright commands and keep full control over the execution flow.

This makes scripts easier to debug and maintain. When something breaks, you know exactly which step failed because you wrote the flow yourself.

Browserbase maintains official Stagehand SDKs in TypeScript, Python, Go, and Ruby. TypeScript is the primary SDK with the most features, but the Python SDK mirrors the same API using Pydantic models for schema validation.

Quickstart: Browser-Use (Python)

Browser-Use takes a "pure agent" approach. You describe a task in natural language, and the agent handles navigation, clicking, scrolling, and data extraction on its own. Let's build one.

Prerequisites

  • Python 3.11+
  • A Google account for a free Gemini API key
  • Playwright browsers installed locally

Get your free Gemini API key at Google AI Studio. Click "Get API key" at the bottom of the left sidebar, then click "Create API key." No credit card is needed. The free tier gives you 10 requests per minute with Gemini 2.5 Flash, which is plenty for step-by-step agent development.
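
To make the 10 requests per minute figure concrete: 60 / 10 means one request every 6 seconds at most. The sketch below is one possible way to pace calls you control so they stay under that limit. `RequestPacer` is a hypothetical helper written for this article, not part of Browser-Use or the Gemini SDK, and the clock and sleep functions are injectable only to make it easy to test.

```python
import asyncio
import time


class RequestPacer:
    """Spaces calls at least 60 / per_minute seconds apart."""

    def __init__(self, per_minute: int, clock=time.monotonic, sleep=asyncio.sleep):
        self.min_interval = 60.0 / per_minute  # 10 per minute -> 6.0 seconds
        self._clock = clock
        self._sleep = sleep
        self._next_slot = 0.0  # earliest time the next call may start

    async def wait(self) -> float:
        """Sleep until the next free slot and return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_slot - now)
        if delay:
            await self._sleep(delay)
        # Reserve the following slot relative to when this call actually starts.
        self._next_slot = max(now, self._next_slot) + self.min_interval
        return delay
```

Calling `await pacer.wait()` before each model-backed step keeps a loop under the limit. Where that call belongs depends on how your agent code is structured.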

Building Your First Agent

Installation

shell
pip install browser-use
playwright install chromium

Agent Code

python
import os
import asyncio

from browser_use import Agent, ChatGoogle

# Placeholder value: replace it, or export GOOGLE_API_KEY in your shell and delete this line
os.environ["GOOGLE_API_KEY"] = "Your Google API Key"

async def main():
    llm = ChatGoogle(model="gemini-flash-latest")

    agent = Agent(
        task=(
            "Go to https://web-scraping.dev/products and extract "
            "the name and price of every product on this page only"
        ),
        llm=llm,
    )

    result = await agent.run()
    print(result.final_result())

if __name__ == "__main__":
    asyncio.run(main())

Browser-Use ships with a built-in ChatGoogle wrapper, so you don't need any extra LangChain packages. The Agent class takes a natural language task and an LLM, launches a browser, and extracts data without a single CSS selector.
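
The quickstart sets the key inline with a placeholder string, which is fine for a first run but easy to commit by accident. A minimal alternative, assuming you export GOOGLE_API_KEY in your shell, is to read it at startup and fail early with a clear message. `require_env` is a hypothetical helper name, not a Browser-Use function.

```python
import os


def require_env(name: str) -> str:
    """Return an environment variable's value, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Export it in your shell before running the agent."
        )
    return value


# Usage at the top of main(), before creating the LLM wrapper:
# require_env("GOOGLE_API_KEY")
```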

Browser-Use also works with Ollama for fully local inference. Swap ChatGoogle for a local model and you won't need any API key at all. Check the Browser-Use docs for supported model setup.
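
As a rough sketch of that swap, the code below first checks that a local Ollama server is reachable, then builds the agent. It assumes Ollama is listening on its default port 11434 and that your installed browser-use version exports `ChatOllama` at the top level. Treat both as assumptions and confirm them against the Browser-Use docs. The function names are made up for this example.

```python
import json
import urllib.error
import urllib.request


def list_ollama_models(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> list[str]:
    """Return names of locally pulled models, or [] if the server is unreachable."""
    try:
        # /api/tags is Ollama's endpoint for listing local models.
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return []
    return [m.get("name", "") for m in payload.get("models", [])]


def build_local_agent(task: str, model: str):
    """Create a Browser-Use agent backed by a local Ollama model."""
    # Imported lazily so the preflight check above works without browser-use installed.
    # Assumption: this browser-use version exports ChatOllama at the top level.
    from browser_use import Agent, ChatOllama

    return Agent(task=task, llm=ChatOllama(model=model))
```

If `list_ollama_models()` comes back empty, start Ollama and pull a model before building the agent. Otherwise pass one of the returned names as `model`.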

If you need typed output instead of plain text, Browser-Use supports Pydantic models for structured extraction. The official documentation covers this in the output models section.
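
To show what typed output buys you without depending on any library, here is a standard-library sketch that turns a JSON string into typed records and rejects malformed items. It is not Browser-Use's Pydantic integration, which the official docs cover. The `Product` type and the sample JSON shape are assumptions made for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    name: str
    price: str


def parse_products(raw: str) -> list[Product]:
    """Turn a JSON array of {"name", "price"} objects into Product records."""
    items = json.loads(raw)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of products")
    products = []
    for i, item in enumerate(items):
        try:
            products.append(Product(name=str(item["name"]), price=str(item["price"])))
        except (KeyError, TypeError) as exc:
            # A missing key or a non-object item means the output broke the contract.
            raise ValueError(f"product #{i} is missing a field: {exc}") from exc
    return products
```

With plain text you find out about a missing price when downstream code breaks. With a typed parse step, the failure happens at the boundary and names the offending item.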

Quickstart: Stagehand (TypeScript)

Stagehand takes a different path. Instead of handing everything to an autonomous agent, it gives you three AI-powered methods that you weave into regular Playwright code. You control the flow. AI handles the parts where writing selectors would be fragile or tedious.

Prerequisites

You'll use the same free Gemini API key from Google AI Studio. If you already got one for the Browser-Use example above, it will work here too.

Building Your First Agent

Installation

Scaffold a new project with the Stagehand CLI:

shell
npx create-browser-app my-agent
cd my-agent && npm install

Or add Stagehand to an existing project:

shell
npm i @browserbasehq/stagehand zod dotenv

Agent Code

typescript
// Placeholder value: replace it, or set this variable in your shell or .env file instead
process.env.GOOGLE_GENERATIVE_AI_API_KEY = "Your Google API Key";

import { Stagehand } from "@browserbasehq/stagehand";
import { z } from "zod";

async function main() {
  const stagehand = new Stagehand({
    env: "LOCAL",
    model: "google/gemini-2.5-flash",
  });

  await stagehand.init();
  const page = stagehand.context.pages()[0];

  await page.goto("https://web-scraping.dev/products");
  await stagehand.act("Click on the first product link");

  const product = await stagehand.extract(
    "Extract the product details from this page",
    z.object({
      name: z.string().describe("the product name"),
      price: z.string().describe("the product price"),
      description: z.string().describe("the product description"),
    }),
  );

  console.log("Product:", product);
  await stagehand.close();
}

main();

You write the navigation flow with page.goto() and control the order of operations. AI steps in only where you call stagehand.act() or stagehand.extract(). If the page data doesn't match your Zod schema, you get a validation error right away.

The observe() method helps when you're exploring an unfamiliar page. Call await stagehand.observe("What links are available?") and Stagehand returns structured observations about the current page state. It's useful for building scripts that need to adapt to changing page layouts.

Stagehand also supports a pure agent mode for fully autonomous tasks, similar to Browser-Use. See the Stagehand documentation for the agent API and advanced patterns.

When Free Isn't Enough

Running agents on your local machine works well for development and small projects. But local setups hit real limits once you try to scale up or target protected sites.

Out of the box, each script drives a single browser, and running several in parallel on one machine takes extra setup. Anti-bot systems also flag home and server IPs quickly.
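
On the parallelism side, the extra setup is largely a matter of capping how many browsers run at once so the machine does not run out of memory. The sketch below shows one common pattern using an asyncio semaphore. `run_bounded` and its `worker` argument are hypothetical. In practice, `worker` would be your own coroutine that creates an agent for one task and returns its result.

```python
import asyncio
from typing import Awaitable, Callable


async def run_bounded(
    tasks: list[str],
    worker: Callable[[str], Awaitable[str]],
    max_parallel: int = 3,
) -> list[str]:
    """Run `worker` over every task, never more than `max_parallel` at once."""
    gate = asyncio.Semaphore(max_parallel)

    async def guarded(task: str) -> str:
        # Only `max_parallel` coroutines can hold the semaphore at a time.
        async with gate:
            return await worker(task)

    # gather() returns results in the same order as the input tasks.
    return await asyncio.gather(*(guarded(t) for t in tasks))
```

This only addresses local resource limits. It does nothing about IP reputation or fingerprinting.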

Modern detection systems check browser fingerprints like canvas rendering, WebGL hashes, font lists, and TLS signatures. They spot automated browsers almost right away. On top of that, managing sessions, cookies, and browser state across long-running tasks gets messy fast.

When you're ready for production workloads, need to get past anti-bot systems, or want to run multiple agents in parallel, cloud browsers are the practical next step.


Scrapfly's Cloud Browser works with both Browser-Use and Stagehand. Your agent code stays the same. You just swap the local browser connection for a cloud endpoint. Built-in residential proxies and managed fingerprinting let your agents access sites that would block a local browser right away.

For step-by-step setup, check the Stagehand + Scrapfly integration guide and the Browser-Use + Scrapfly integration guide. The Stagehand version looks like this:

typescript
import { Stagehand } from "@browserbasehq/stagehand";

async function main() {
  // initialize Stagehand with external CDP browser connection
  const stagehand = new Stagehand({
    env: "LOCAL",  // "LOCAL" here means we aren't using Browserbase; we connect to Scrapfly's cloud browser over CDP
    localBrowserLaunchOptions: {
      // connect to Scrapfly's cloud browser via CDP
      cdpUrl: "wss://browser.scrapfly.io?key=YOUR_SCRAPFLY_API_KEY"
    }
  });

  // start the session and connect to the browser
  await stagehand.init();

  // create an agent with specific model configuration
  const agent = stagehand.agent({
    model: {
      // choose the LLM to use and pass that provider's API key (not the Scrapfly key)
      modelName: "openai/computer-use-preview",
      apiKey: "YOUR_OPENAI_API_KEY"
    },
    // add a system prompt to the agent
    systemPrompt: "You are an AI browser automation agent. Follow instructions precisely."
  });

  // define the task to be performed by the agent
  const task = `
    Go to https://web-scraping.dev/products
    Extract the product names and prices
    Return the data as JSON
  `;

  // execute the agent workflow
  const result = await agent.execute(task);

  console.log("Agent workflow result:", result);

  // close the connection
  await stagehand.close();
}

main();
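
For Browser-Use, the same swap might look like the sketch below. It reuses the endpoint shape from the Stagehand example and assumes your browser-use version exposes a `Browser` class that accepts a `cdp_url` argument and an `Agent` that accepts `browser`. Both are assumptions to check against the Browser-Use + Scrapfly integration guide. The helper names are hypothetical.

```python
from urllib.parse import urlencode


def scrapfly_cdp_url(api_key: str, host: str = "browser.scrapfly.io") -> str:
    """Build the WebSocket CDP endpoint in the same shape as the example above."""
    # urlencode escapes any characters in the key that are unsafe in a query string.
    return f"wss://{host}?{urlencode({'key': api_key})}"


def build_remote_agent(task: str, llm, api_key: str):
    """Create a Browser-Use agent that drives a remote browser over CDP."""
    # Imported lazily so the URL helper works without browser-use installed.
    # Assumption: this browser-use version exposes Browser(cdp_url=...).
    from browser_use import Agent, Browser

    browser = Browser(cdp_url=scrapfly_cdp_url(api_key))
    return Agent(task=task, llm=llm, browser=browser)
```

The task string and LLM wrapper stay exactly as in the quickstart. Only the browser connection changes.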

FAQ

Is Browser-Use really free?

Yes. Browser-Use is open-source and works with Google Gemini's free tier or fully local models through Ollama.

Can I use Stagehand with Python?

Yes. Browserbase maintains an official Python SDK that mirrors the TypeScript API using Pydantic models.

Do I need a paid API key for either framework?

No. Both work with Google Gemini's free tier, which gives you 10 requests per minute at no cost.

Summary

You've built two free AI browser agents from scratch. Browser-Use gives you a Python-first, fully autonomous agent that figures out navigation and extraction from a single task description. Stagehand gives you fine-grained control with AI-powered primitives you can mix into Playwright scripts.

Both frameworks are actively maintained and ready for real work. Pick Browser-Use if you want pure autonomy in Python. Go with Stagehand if you want hybrid control in TypeScript. Either way, you're running a real AI browser agent without spending a cent.

Try the examples above with your free Gemini key, then explore Scrapfly's Cloud Browser when you're ready to scale past your local machine.
