Browser-Use is a popular open-source Python library that enables AI agents to autonomously control web browsers using natural language instructions. Built on top of Playwright and integrated with LangChain, it allows LLMs like GPT-4o and Claude to navigate websites, fill forms, extract data, and perform complex multi-step web tasks. With over 50,000 GitHub stars, it has become the leading framework for agent-driven browser automation.
Browser-Use follows a modular, agent-based architecture with three core components:
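Agent: runs the reasoning loop that turns a natural-language task into browser actions.
Browser session: wraps the Playwright-controlled browser and is configured through BrowserProfile for headless mode, viewport size, and user-agent.
LLM: a chat model (ChatOpenAI, ChatAnthropic) for decision-making. The LLM interprets DOM content, screenshots, and page state to determine the next action.

The agent loop works as follows: observe the page state (DOM + optional screenshot) → send to LLM → receive action → execute via Playwright → repeat until the task is complete.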
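The observe → decide → execute cycle described above can be sketched without any browser at all. The following is a minimal illustration, not Browser-Use's actual implementation: `Action`, `FakeBrowser`, `run_loop`, and `scripted_policy` are hypothetical names introduced here, and the scripted policy stands in for the LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """One step chosen by the model (hypothetical shape, for illustration only)."""
    name: str            # e.g. "goto", "click", "done"
    argument: str = ""


@dataclass
class FakeBrowser:
    """Stand-in for the Playwright-backed session; records executed actions."""
    url: str = "about:blank"
    log: List[str] = field(default_factory=list)

    def observe(self) -> str:
        # A real agent would serialize the DOM and optionally attach a screenshot.
        return f"url={self.url}"

    def execute(self, action: Action) -> None:
        self.log.append(f"{action.name}:{action.argument}")
        if action.name == "goto":
            self.url = action.argument


def run_loop(
    task: str,
    browser: FakeBrowser,
    decide: Callable[[str, str, List[Action]], Action],
    max_steps: int = 10,
) -> List[Action]:
    """Observe, ask the policy for an action, execute it, repeat until 'done'."""
    history: List[Action] = []
    for _ in range(max_steps):
        state = browser.observe()
        action = decide(task, state, history)
        history.append(action)
        if action.name == "done":
            break
        browser.execute(action)
    return history


def scripted_policy(task: str, state: str, history: List[Action]) -> Action:
    """Toy replacement for the LLM: navigate to the target site, then finish."""
    if "news.ycombinator.com" not in state:
        return Action("goto", "https://news.ycombinator.com")
    return Action("done", "reached target page")


if __name__ == "__main__":
    steps = run_loop("open Hacker News", FakeBrowser(), scripted_policy)
    print([step.name for step in steps])
```

The `max_steps` cap is a design choice of this sketch: it bounds the loop when the policy never emits a terminating action.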
Browser-Use relies on Playwright as its browser automation engine. Rather than requiring developers to write Playwright scripts, the library abstracts browser control behind the Agent interface.
For cloud deployments, Browser-Use connects to Browserless or similar services via WebSocket CDP URLs, avoiding the need for local browser installations.
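To make the remote-connection idea concrete, here is a rough sketch of attaching to a hosted browser over a WebSocket CDP URL using Playwright's `connect_over_cdp` directly, rather than through Browser-Use. The helper names `cdp_endpoint` and `open_remote_page`, the example host, and the `token` query parameter are assumptions for illustration; check your provider's documentation for its actual URL format.

```python
import asyncio
from typing import Optional
from urllib.parse import urlencode, urlsplit, urlunsplit


def cdp_endpoint(base_url: str, token: Optional[str] = None) -> str:
    """Validate a ws:// or wss:// CDP URL and optionally append a token.

    The 'token' query parameter is an assumed authentication convention.
    """
    parts = urlsplit(base_url)
    if parts.scheme not in ("ws", "wss"):
        raise ValueError(f"expected a ws:// or wss:// URL, got {base_url!r}")
    query = parts.query
    if token:
        extra = urlencode({"token": token})
        query = f"{query}&{extra}" if query else extra
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


async def open_remote_page(endpoint: str, url: str) -> str:
    """Attach to a remote Chromium over CDP and return the page title."""
    # Imported lazily so the URL helper works without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.connect_over_cdp(endpoint)
        page = await browser.new_page()
        await page.goto(url)
        title = await page.title()
        await browser.close()
        return title


if __name__ == "__main__":
    # Hypothetical host and token; replace with real values before running.
    endpoint = cdp_endpoint("wss://browser.example.com/chromium", token="YOUR_TOKEN")
    print(asyncio.run(open_remote_page(endpoint, "https://example.com")))
```

No local browser binary is launched in this flow; the process only needs network access to the remote endpoint.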
Browser-Use is designed as a LangChain-native tool: it takes langchain_openai.ChatOpenAI or langchain_anthropic.ChatAnthropic as the reasoning engine.

    import asyncio

    from dotenv import load_dotenv
    from langchain_openai import ChatOpenAI
    from browser_use import Agent, BrowserSession, BrowserProfile

    # Load API keys (e.g. OPENAI_API_KEY) from a local .env file
    load_dotenv()

    async def main():
        # Configure browser session
        session = BrowserSession(
            browser_profile=BrowserProfile(headless=True)
        )

        # Create agent with GPT-4o
        agent = Agent(
            task="Go to Hacker News, find the top post, and return its title and URL.",
            llm=ChatOpenAI(model="gpt-4o"),
            browser=session,
        )

        # Run the agent
        result = await agent.run()
        print(f"Result: {result}")

    asyncio.run(main())
        ┌─────────────┐
        │  User Task  │
        │  (natural   │
        │  language)  │
        └──────┬──────┘
               │
        ┌──────▼──────┐
        │    Agent    │
        │ (reasoning  │
        │    loop)    │
        └──┬───────┬──┘
           │       │
  ┌────────▼┐   ┌──▼────────┐
  │   LLM   │   │  Browser  │
  │ (GPT-4o │   │  Session  │
  │ Claude) │   │(Playwright│
  └─────────┘   │   CDP)    │
                └─────┬─────┘
                      │
               ┌──────▼──────┐
               │   Browser   │
               │ (Chromium/  │
               │  Firefox)   │
               └─────────────┘