Browser Automation with browser-use CLI
The browser-use command provides fast, persistent browser automation. It maintains browser sessions across commands, enabling complex multi-step workflows.
Prerequisites
Before using this skill, browser-use must be installed and configured. Run diagnostics to verify:
For more information, see https://github.com/browser-use/browser-use/blob/main/browser_use/skill_cli/README.md
Core Workflow
- Navigate:
browser-use open <url> - Opens URL (starts browser if needed)
- Inspect:
browser-use state - Returns clickable elements with indices
- Interact: Use indices from state to interact (
browser-use click 5, browser-use input 3 "text")
- Verify:
browser-use state or browser-use screenshot to confirm actions
- Repeat: Browser stays open between commands
Browser Modes
bash
1browser-use --browser chromium open <url> # Default: headless Chromium
2browser-use --browser chromium --headed open <url> # Visible Chromium window
3browser-use --browser real open <url> # Real Chrome (no profile = fresh)
4browser-use --browser real --profile "Default" open <url> # Real Chrome with your login sessions
5browser-use --browser remote open <url> # Cloud browser
- chromium: Fast, isolated, headless by default
- real: Uses a real Chrome binary. Without
--profile, uses a persistent but empty CLI profile at ~/.config/browseruse/profiles/cli/. With --profile "ProfileName", copies your actual Chrome profile (cookies, logins, extensions)
- remote: Cloud-hosted browser with proxy support
Essential Commands
bash
1# Navigation
2browser-use open <url> # Navigate to URL
3browser-use back # Go back
4browser-use scroll down # Scroll down (--amount N for pixels)
5
6# Page State (always run state first to get element indices)
7browser-use state # Get URL, title, clickable elements
8browser-use screenshot # Take screenshot (base64)
9browser-use screenshot path.png # Save screenshot to file
10
11# Interactions (use indices from state)
12browser-use click <index> # Click element
13browser-use type "text" # Type into focused element
14browser-use input <index> "text" # Click element, then type
15browser-use keys "Enter" # Send keyboard keys
16browser-use select <index> "option" # Select dropdown option
17
18# Data Extraction
19browser-use eval "document.title" # Execute JavaScript
20browser-use get text <index> # Get element text
21browser-use get html --selector "h1" # Get scoped HTML
22
23# Wait
24browser-use wait selector "h1" # Wait for element
25browser-use wait text "Success" # Wait for text
26
27# Session
28browser-use sessions # List active sessions
29browser-use close # Close current session
30browser-use close --all # Close all sessions
31
32# AI Agent
33browser-use -b remote run "task" # Run agent in cloud (async by default)
34browser-use task status <id> # Check cloud task progress
Commands
Navigation & Tabs
bash
1browser-use open <url> # Navigate to URL
2browser-use back # Go back in history
3browser-use scroll down # Scroll down
4browser-use scroll up # Scroll up
5browser-use scroll down --amount 1000 # Scroll by specific pixels (default: 500)
6browser-use switch <tab> # Switch to tab by index
7browser-use close-tab # Close current tab
8browser-use close-tab <tab> # Close specific tab
Page State
bash
1browser-use state # Get URL, title, and clickable elements
2browser-use screenshot # Take screenshot (outputs base64)
3browser-use screenshot path.png # Save screenshot to file
4browser-use screenshot --full path.png # Full page screenshot
State output format:
The state command returns structured text with element indices in [index]<element> format:
url: https://example.com
title: Example Page
Interactive elements:
*[123]<button>Submit</button>
[124]<a href="/link">Click here</a>
[125]<input type="text" placeholder="Enter text" />
[123] is the element index used with click, input, etc.
* prefix indicates the element is visible/focused
- Indentation (tabs) indicates DOM nesting
- Element attributes like
aria-label, role, href are included
Interactions
bash
1browser-use click <index> # Click element
2browser-use type "text" # Type text into focused element
3browser-use input <index> "text" # Click element, then type text
4browser-use keys "Enter" # Send keyboard keys
5browser-use keys "Control+a" # Send key combination
6browser-use select <index> "option" # Select dropdown option
7browser-use hover <index> # Hover over element (triggers CSS :hover)
8browser-use dblclick <index> # Double-click element
9browser-use rightclick <index> # Right-click element (context menu)
Use indices from browser-use state.
JavaScript & Data
bash
1browser-use eval "document.title" # Execute JavaScript, return result
2browser-use get title # Get page title
3browser-use get html # Get full page HTML
4browser-use get html --selector "h1" # Get HTML of specific element
5browser-use get text <index> # Get text content of element
6browser-use get value <index> # Get value of input/textarea
7browser-use get attributes <index> # Get all attributes of element
8browser-use get bbox <index> # Get bounding box (x, y, width, height)
Cookies
bash
1browser-use cookies get # Get all cookies
2browser-use cookies get --url <url> # Get cookies for specific URL
3browser-use cookies set <name> <value> # Set a cookie
4browser-use cookies set name val --domain .example.com --secure --http-only
5browser-use cookies set name val --same-site Strict # SameSite: Strict, Lax, or None
6browser-use cookies set name val --expires 1735689600 # Expiration timestamp
7browser-use cookies clear # Clear all cookies
8browser-use cookies clear --url <url> # Clear cookies for specific URL
9browser-use cookies export <file> # Export all cookies to JSON file
10browser-use cookies export <file> --url <url> # Export cookies for specific URL
11browser-use cookies import <file> # Import cookies from JSON file
Wait Conditions
bash
1browser-use wait selector "h1" # Wait for element to be visible
2browser-use wait selector ".loading" --state hidden # Wait for element to disappear
3browser-use wait selector "#btn" --state attached # Wait for element in DOM
4browser-use wait text "Success" # Wait for text to appear
5browser-use wait selector "h1" --timeout 5000 # Custom timeout in ms
Python Execution
bash
1browser-use python "x = 42" # Set variable
2browser-use python "print(x)" # Access variable (outputs: 42)
3browser-use python "print(browser.url)" # Access browser object
4browser-use python --vars # Show defined variables
5browser-use python --reset # Clear Python namespace
6browser-use python --file script.py # Execute Python file
The Python session maintains state across commands. The browser object provides:
browser.url, browser.title, browser.html — page info
browser.goto(url), browser.back() — navigation
browser.click(index), browser.type(text), browser.input(index, text), browser.keys(keys) — interactions
browser.screenshot(path), browser.scroll(direction, amount) — visual
browser.wait(seconds), browser.extract(query) — utilities
Agent Tasks
Remote Mode Options
When using --browser remote, additional options are available:
bash
1# Specify LLM model
2browser-use -b remote run "task" --llm gpt-4o
3browser-use -b remote run "task" --llm claude-sonnet-4-20250514
4
5# Proxy configuration (default: us)
6browser-use -b remote run "task" --proxy-country uk
7
8# Session reuse
9browser-use -b remote run "task 1" --keep-alive # Keep session alive after task
10browser-use -b remote run "task 2" --session-id abc-123 # Reuse existing session
11
12# Execution modes
13browser-use -b remote run "task" --flash # Fast execution mode
14browser-use -b remote run "task" --wait # Wait for completion (default: async)
15
16# Advanced options
17browser-use -b remote run "task" --thinking # Extended reasoning mode
18browser-use -b remote run "task" --no-vision # Disable vision (enabled by default)
19
20# Using a cloud profile (create session first, then run with --session-id)
21browser-use session create --profile <cloud-profile-id> --keep-alive
22# → returns session_id
23browser-use -b remote run "task" --session-id <session-id>
24
25# Task configuration
26browser-use -b remote run "task" --start-url https://example.com # Start from specific URL
27browser-use -b remote run "task" --allowed-domain example.com # Restrict navigation (repeatable)
28browser-use -b remote run "task" --metadata key=value # Task metadata (repeatable)
29browser-use -b remote run "task" --skill-id skill-123 # Enable skills (repeatable)
30browser-use -b remote run "task" --secret key=value # Secret metadata (repeatable)
31
32# Structured output and evaluation
33browser-use -b remote run "task" --structured-output '{"type":"object"}' # JSON schema for output
34browser-use -b remote run "task" --judge # Enable judge mode
35browser-use -b remote run "task" --judge-ground-truth "expected answer"
Task Management
bash
1browser-use task list # List recent tasks
2browser-use task list --limit 20 # Show more tasks
3browser-use task list --status finished # Filter by status (finished, stopped)
4browser-use task list --session <id> # Filter by session ID
5browser-use task list --json # JSON output
6
7browser-use task status <task-id> # Get task status (latest step only)
8browser-use task status <task-id> -c # All steps with reasoning
9browser-use task status <task-id> -v # All steps with URLs + actions
10browser-use task status <task-id> --last 5 # Last N steps only
11browser-use task status <task-id> --step 3 # Specific step number
12browser-use task status <task-id> --reverse # Newest first
13
14browser-use task stop <task-id> # Stop a running task
15browser-use task logs <task-id> # Get task execution logs
Cloud Session Management
bash
1browser-use session list # List cloud sessions
2browser-use session list --limit 20 # Show more sessions
3browser-use session list --status active # Filter by status
4browser-use session list --json # JSON output
5
6browser-use session get <session-id> # Get session details + live URL
7browser-use session get <session-id> --json
8
9browser-use session stop <session-id> # Stop a session
10browser-use session stop --all # Stop all active sessions
11
12browser-use session create # Create with defaults
13browser-use session create --profile <id> # With cloud profile
14browser-use session create --proxy-country uk # With geographic proxy
15browser-use session create --start-url https://example.com
16browser-use session create --screen-size 1920x1080
17browser-use session create --keep-alive
18browser-use session create --persist-memory
19
20browser-use session share <session-id> # Create public share URL
21browser-use session share <session-id> --delete # Delete public share
Tunnels
bash
1browser-use tunnel <port> # Start tunnel (returns URL)
2browser-use tunnel <port> # Idempotent - returns existing URL
3browser-use tunnel list # Show active tunnels
4browser-use tunnel stop <port> # Stop tunnel
5browser-use tunnel stop --all # Stop all tunnels
Session Management
bash
1browser-use sessions # List active sessions
2browser-use close # Close current session
3browser-use close --all # Close all sessions
Profile Management
Local Chrome Profiles (--browser real)
bash
1browser-use -b real profile list # List local Chrome profiles
2browser-use -b real profile cookies "Default" # Show cookie domains in profile
Cloud Profiles (--browser remote)
bash
1browser-use -b remote profile list # List cloud profiles
2browser-use -b remote profile list --page 2 --page-size 50
3browser-use -b remote profile get <id> # Get profile details
4browser-use -b remote profile create # Create new cloud profile
5browser-use -b remote profile create --name "My Profile"
6browser-use -b remote profile update <id> --name "New"
7browser-use -b remote profile delete <id>
Syncing
bash
1browser-use profile sync --from "Default" --domain github.com # Domain-specific
2browser-use profile sync --from "Default" # Full profile
3browser-use profile sync --from "Default" --name "Custom Name" # With custom name
Server Control
bash
1browser-use server logs # View server logs
Common Workflows
Exposing Local Dev Servers
Use when you have a local dev server and need a cloud browser to reach it.
Core workflow: Start dev server → create tunnel → browse the tunnel URL remotely.
bash
1# 1. Start your dev server
2npm run dev & # localhost:3000
3
4# 2. Expose it via Cloudflare tunnel
5browser-use tunnel 3000
6# → url: https://abc.trycloudflare.com
7
8# 3. Now the cloud browser can reach your local server
9browser-use --browser remote open https://abc.trycloudflare.com
10browser-use state
11browser-use screenshot
Note: Tunnels are independent of browser sessions. They persist across browser-use close and can be managed separately. Cloudflared must be installed — run browser-use doctor to check.
Authenticated Browsing with Profiles
Use when a task requires browsing a site the user is already logged into (e.g. Gmail, GitHub, internal tools).
Core workflow: Check existing profiles → ask user which profile and browser mode → browse with that profile. Only sync cookies if no suitable profile exists.
IMPORTANT: Cookie-First for OAuth Sites
For sites that use Google OAuth or similar (Notion, Slack, etc.), always prefer cookie import over automated login:
- Frequent automated OAuth attempts trigger rate limiting and account security alerts
- Google may block the account after too many automated login attempts
- Saved cookies typically last 30 days and avoid all login issues
bash
1# Preferred: Use saved cookies
2IN_DOCKER=true browser-use --session mysite cookies import ~/.mysite/cookies.json
3IN_DOCKER=true browser-use --session mysite open "https://mysite.com"
4
5# After manual login, save cookies for future use
6IN_DOCKER=true browser-use --session mysite cookies export ~/.mysite/cookies.json
Before browsing an authenticated site, the agent MUST:
- Ask the user whether to use real (local Chrome) or remote (cloud) browser
- List available profiles for that mode
- Ask which profile to use
- If no profile has the right cookies, offer to sync (see below)
Step 1: Check existing profiles
bash
1# Option A: Local Chrome profiles (--browser real)
2browser-use -b real profile list
3# → Default: Person 1 (user@gmail.com)
4# → Profile 1: Work (work@company.com)
5
6# Option B: Cloud profiles (--browser remote)
7browser-use -b remote profile list
8# → abc-123: "Chrome - Default (github.com)"
9# → def-456: "Work profile"
Step 2: Browse with the chosen profile
bash
1# Real browser — uses local Chrome with existing login sessions
2browser-use --browser real --profile "Default" open https://github.com
3
4# Cloud browser — uses cloud profile with synced cookies
5browser-use --browser remote --profile abc-123 open https://github.com
The user is already authenticated — no login needed.
Note: Cloud profile cookies can expire over time. If authentication fails, re-sync cookies from the local Chrome profile.
Step 3: Syncing cookies (only if needed)
If the user wants to use a cloud browser but no cloud profile has the right cookies, sync them from a local Chrome profile.
Before syncing, the agent MUST:
- Ask which local Chrome profile to use
- Ask which domain(s) to sync — do NOT default to syncing the full profile
- Confirm before proceeding
Check what cookies a local profile has:
bash
1browser-use -b real profile cookies "Default"
2# → youtube.com: 23
3# → google.com: 18
4# → github.com: 2
Domain-specific sync (recommended):
bash
1browser-use profile sync --from "Default" --domain github.com
2# Creates new cloud profile: "Chrome - Default (github.com)"
3# Only syncs github.com cookies
Full profile sync (use with caution):
bash
1browser-use profile sync --from "Default"
2# Syncs ALL cookies — includes sensitive data, tracking cookies, every session token
Only use when the user explicitly needs their entire browser state.
Fine-grained control (advanced):
bash
1# Export cookies to file, manually edit, then import
2browser-use --browser real --profile "Default" cookies export /tmp/cookies.json
3browser-use --browser remote --profile <id> cookies import /tmp/cookies.json
Use the synced profile:
bash
1browser-use --browser remote --profile <id> open https://github.com
Running Subagents
Use cloud sessions to run autonomous browser agents in parallel.
Core workflow: Launch task(s) with run → poll with task status → collect results → clean up sessions.
- Session = Agent: Each cloud session is a browser agent with its own state
- Task = Work: Jobs given to an agent; an agent can run multiple tasks sequentially
- Session lifecycle: Once stopped, a session cannot be revived — start a new one
Launching Tasks
bash
1# Single task (async by default — returns immediately)
2browser-use -b remote run "Search for AI news and summarize top 3 articles"
3# → task_id: task-abc, session_id: sess-123
4
5# Parallel tasks — each gets its own session
6browser-use -b remote run "Research competitor A pricing"
7# → task_id: task-1, session_id: sess-a
8browser-use -b remote run "Research competitor B pricing"
9# → task_id: task-2, session_id: sess-b
10browser-use -b remote run "Research competitor C pricing"
11# → task_id: task-3, session_id: sess-c
12
13# Sequential tasks in same session (reuses cookies, login state, etc.)
14browser-use -b remote run "Log into example.com" --keep-alive
15# → task_id: task-1, session_id: sess-123
16browser-use task status task-1 # Wait for completion
17browser-use -b remote run "Export settings" --session-id sess-123
18# → task_id: task-2, session_id: sess-123 (same session)
Managing & Stopping
bash
1browser-use task list --status finished # See completed tasks
2browser-use task stop task-abc # Stop a task (session may continue if --keep-alive)
3browser-use session stop sess-123 # Stop an entire session (terminates its tasks)
4browser-use session stop --all # Stop all sessions
Monitoring
Task status is designed for token efficiency. Default output is minimal — only expand when needed:
| Mode | Flag | Tokens | Use When |
|---|
| Default | (none) | Low | Polling progress |
| Compact | -c | Medium | Need full reasoning |
| Verbose | -v | High | Debugging actions |
bash
1# For long tasks (50+ steps)
2browser-use task status <id> -c --last 5 # Last 5 steps only
3browser-use task status <id> -v --step 10 # Inspect specific step
Live view: browser-use session get <session-id> returns a live URL to watch the agent.
Detect stuck tasks: If cost/duration in task status stops increasing, the task is stuck — stop it and start a new agent.
Logs: browser-use task logs <task-id> — only available after task completes.
Global Options
| Option | Description |
|---|
--session NAME | Use named session (default: "default") |
--browser MODE | Browser mode: chromium, real, remote |
--headed | Show browser window (chromium mode) |
--profile NAME | Browser profile (local name or cloud ID). Works with open, session create, etc. — does NOT work with run (use --session-id instead) |
--json | Output as JSON |
--mcp | Run as MCP server via stdin/stdout |
Session behavior: All commands without --session use the same "default" session. The browser stays open and is reused across commands. Use --session NAME to run multiple browsers in parallel.
Tips
- Always run
browser-use state first to see available elements and their indices
- Use
--headed for debugging to see what the browser is doing
- Sessions persist — the browser stays open between commands
- Use
--json for programmatic parsing
- Python variables persist across
browser-use python commands within a session
- CLI aliases:
bu, browser, and browseruse all work identically to browser-use
- Cookie-first authentication — For sites with OAuth (Google, etc.), prefer importing saved cookies over automated login to avoid rate limiting
- Save state for debugging — Run
browser-use state > /tmp/debug_state.txt to save state for regex testing
Troubleshooting
Run diagnostics first:
CRITICAL: WSL/Docker/Root Environment
If running as root or in WSL/Docker, you MUST set IN_DOCKER=true:
bash
1IN_DOCKER=true browser-use open "https://example.com"
Without this, the browser will timeout after 30 seconds because Chrome/Chromium requires --no-sandbox when running as root. The IN_DOCKER=true environment variable tells browser-use to add this flag automatically.
Symptoms of missing IN_DOCKER=true:
- Browser startup hangs for 30 seconds then fails
- Error: "Browser failed to start within timeout"
- Works fine as non-root user but fails as root
Browser won't start?
bash
1browser-use close --all # Close all sessions
2IN_DOCKER=true browser-use --headed open <url> # Try with IN_DOCKER flag
3browser-use --headed open <url> # Try with visible window (if not root)
4
5**Element not found?**
6```bash
7browser-use state # Check current elements
8browser-use scroll down # Element might be below fold
9browser-use state # Check again
Session issues?
bash
1browser-use sessions # Check active sessions
2browser-use close --all # Clean slate
3browser-use open <url> # Fresh start
Session reuse fails after task stop:
If you stop a task and try to reuse its session, the new task may get stuck at "created" status. Create a new session instead:
bash
1browser-use session create --profile <profile-id> --keep-alive
2browser-use -b remote run "new task" --session-id <new-session-id>
Task stuck at "started": Check cost with task status — if not increasing, the task is stuck. View live URL with session get, then stop and start a new agent.
Sessions persist after tasks complete: Tasks finishing doesn't auto-stop sessions. Run browser-use session stop --all to clean up.
Cleanup
Always close the browser when done:
bash
1browser-use close # Close browser session
2browser-use session stop --all # Stop cloud sessions (if any)
3browser-use tunnel stop --all # Stop tunnels (if any)