Browser Use is an AI-powered browser automation framework that enables AI agents to control browsers using natural language commands.
Connect it to Scrapfly Cloud Browser for scalable AI-driven automation with built-in proxies and fingerprinting.
Beta Feature: Cloud Browser is currently in beta.
What is Browser Use?
Browser Use is an open-source Python framework that combines Large Language Models (LLMs) with browser automation.
Instead of writing explicit automation code, you give the AI agent natural language instructions like "find product prices" or "fill out this form",
and the agent figures out how to accomplish the task using browser interactions.
AI-Powered
Uses LLMs to understand tasks and interact with websites intelligently. No hardcoded selectors needed.
Natural Language
Describe tasks in plain English. The AI agent translates your intent into browser actions.
Self-Adapting
Handles dynamic page structures and adapts to changes without updating code.
Installation
Install Browser Use using pip (requires Python 3.11+):
pip install browser-use playwright
Browser Use uses Playwright under the hood for browser control.
When connecting to Cloud Browser, Playwright connects via CDP (Chrome DevTools Protocol) to remote browser instances.
Quick Start
Connect Browser Use to Cloud Browser and run AI-powered tasks:
import asyncio
from langchain_openai import ChatOpenAI
from browser_use import Agent, BrowserConfig

API_KEY = ''
CLOUD_BROWSER_ENDPOINT = ''  # your Cloud Browser endpoint host
BROWSER_WS = f'wss://{CLOUD_BROWSER_ENDPOINT}?api_key={API_KEY}&proxy_pool=datacenter&os=linux'

async def run_agent():
    # Configure the remote browser connection
    browser_config = BrowserConfig(
        headless=True,
        disable_security=True,
        cdp_url=BROWSER_WS
    )

    # Create an AI agent with a natural language task
    agent = Agent(
        task="Go to https://web-scraping.dev and find the product prices",
        llm=ChatOpenAI(model="gpt-4"),
        browser_config=browser_config
    )

    # Run the agent
    result = await agent.run()
    print("Agent result:", result)

asyncio.run(run_agent())
Data Extraction with AI
Use Browser Use to extract structured data from websites using natural language instructions:
import asyncio
from langchain_openai import ChatOpenAI
from browser_use import Agent, BrowserConfig

API_KEY = ''
CLOUD_BROWSER_ENDPOINT = ''  # your Cloud Browser endpoint host
BROWSER_WS = f'wss://{CLOUD_BROWSER_ENDPOINT}?api_key={API_KEY}&proxy_pool=datacenter'

async def extract_data():
    browser_config = BrowserConfig(
        headless=True,
        disable_security=True,
        cdp_url=BROWSER_WS
    )

    agent = Agent(
        task=(
            "Go to https://web-scraping.dev/products and extract all product information. "
            "For each product, get the title, price, and description. "
            "Return the results as a structured list."
        ),
        llm=ChatOpenAI(model="gpt-4"),
        browser_config=browser_config
    )

    result = await agent.run()

    # The agent returns structured results
    print("Extracted products:", result)
    return result

asyncio.run(extract_data())
Task Automation
Automate complex multi-step workflows with natural language:
import asyncio
from langchain_openai import ChatOpenAI
from browser_use import Agent, BrowserConfig

API_KEY = ''
CLOUD_BROWSER_ENDPOINT = ''  # your Cloud Browser endpoint host
BROWSER_WS = f'wss://{CLOUD_BROWSER_ENDPOINT}?api_key={API_KEY}&proxy_pool=residential'

async def automate_task():
    browser_config = BrowserConfig(
        headless=True,
        disable_security=True,
        cdp_url=BROWSER_WS
    )

    agent = Agent(
        task=(
            "1. Go to https://web-scraping.dev/search\n"
            "2. Search for 'web scraping tools'\n"
            "3. Click on the first result\n"
            "4. Extract the main heading and first paragraph\n"
            "5. Take a screenshot of the page"
        ),
        llm=ChatOpenAI(model="gpt-4"),
        browser_config=browser_config
    )

    result = await agent.run()
    print("Task completed:", result)
    return result

asyncio.run(automate_task())
Session Persistence
Maintain browser state across AI agent runs using the session parameter:
import asyncio
from langchain_openai import ChatOpenAI
from browser_use import Agent, BrowserConfig

API_KEY = ''
CLOUD_BROWSER_ENDPOINT = ''  # your Cloud Browser endpoint host
SESSION_ID = 'my-ai-session'

async def login_session():
    """First agent run: log in to a website"""
    browser_ws = f'wss://{CLOUD_BROWSER_ENDPOINT}?api_key={API_KEY}&session={SESSION_ID}'
    browser_config = BrowserConfig(
        headless=True,
        disable_security=True,
        cdp_url=browser_ws
    )
    agent = Agent(
        task="Go to https://web-scraping.dev/login and login with username 'demo' and password 'demo123'",
        llm=ChatOpenAI(model="gpt-4"),
        browser_config=browser_config
    )
    await agent.run()
    print("Login completed - session preserved")

async def use_logged_in_session():
    """Second agent run: reuse the logged-in session"""
    browser_ws = f'wss://{CLOUD_BROWSER_ENDPOINT}?api_key={API_KEY}&session={SESSION_ID}'
    browser_config = BrowserConfig(
        headless=True,
        disable_security=True,
        cdp_url=browser_ws
    )
    agent = Agent(
        task="Go to the user dashboard and extract account information",
        llm=ChatOpenAI(model="gpt-4"),
        browser_config=browser_config
    )
    result = await agent.run()
    print("Dashboard data:", result)

# Run both tasks sequentially - the same session ID preserves cookies and state
asyncio.run(login_session())
asyncio.run(use_logged_in_session())
Proxy Options
Proxy Pool    Use Case                                          Cost
datacenter    General AI automation, high speed, lower cost     1 credit/30s + 2 credits/MB
residential   Protected sites, geo-targeting, anti-bot bypass   1 credit/30s + 10 credits/MB
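The proxy pool and other options are passed as query parameters on the WebSocket URL, as the examples above show. A minimal sketch of assembling that URL with the standard library; the endpoint host here is a placeholder, and the helper name is illustrative rather than part of any SDK:

from urllib.parse import urlencode

def build_cloud_browser_ws(endpoint: str, api_key: str, **params: str) -> str:
    """Build a Cloud Browser WebSocket URL.

    Extra keyword arguments become query parameters such as
    proxy_pool, os, or session.
    """
    query = urlencode({"api_key": api_key, **params})
    return f"wss://{endpoint}?{query}"

# Example: pick the residential pool for protected sites
ws_url = build_cloud_browser_ws(
    "your-cloud-browser-endpoint",  # placeholder host
    "YOUR_API_KEY",
    proxy_pool="residential",
)
print(ws_url)

Using urlencode also takes care of escaping, which matters once session IDs or other values contain characters that are not URL-safe.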
Best Practices
Be specific with tasks - Clear, detailed instructions help the AI agent succeed
Use structured outputs - Ask the agent to return data in specific formats (JSON, lists, etc.)
Handle failures gracefully - Wrap agent runs in try/except and provide fallback logic
Monitor costs - AI agents may take longer than traditional automation. Always close browser sessions when done.
Use sessions wisely - Reuse sessions for multi-step workflows to maintain state
Choose the right LLM - More capable models (GPT-4) perform better but cost more. Test with different models.
Test tasks iteratively - Start with simple tasks and gradually increase complexity
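For the "handle failures gracefully" point, one option is a small retry wrapper around the agent run. The helper below is a generic asyncio sketch, not part of the Browser Use API; the function name and retry policy are illustrative:

import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")

async def run_with_retries(
    make_task: Callable[[], Awaitable[T]],
    attempts: int = 3,
    delay: float = 1.0,
) -> T:
    """Run an async task factory, retrying on failure with a fixed delay."""
    last_error: Exception = RuntimeError("no attempts made")
    for attempt in range(1, attempts + 1):
        try:
            return await make_task()
        except Exception as exc:  # agent runs can fail for many reasons
            last_error = exc
            if attempt < attempts:
                await asyncio.sleep(delay)
    raise last_error

# Hypothetical usage inside an async function:
#   result = await run_with_retries(lambda: agent.run(), attempts=2)

Passing a factory (rather than an awaitable) matters here: a coroutine object can only be awaited once, so each retry needs a fresh one.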
WebSocket URL Format
Cloud Browser WebSocket URLs support the following parameters: