⚠ The primary deployment is now the Hetzner cloud host. A single always-on BrowserOS container runs on a Hetzner VPS, reachable from Claude Desktop over autossh tunnels. See cloud/SETUP.md and docs/hetzner-reference.md to set it up or operate it.

The local 3-container Mac setup documented below is legacy, kept for reference and one-off local use. It no longer auto-starts at login (the make-stable.sh / launchd autostart machinery was removed); bring it up manually with docker compose up -d when you actually want it.
Use a real browser remotely, from your local Claude Desktop, over MCP. Self-hosted: N independent BrowserOS instances in Docker, each exposing its built-in MCP server, all wired into Claude Desktop as separate connectors. You log into your accounts once via a browser-based live-view, profile state persists, and Claude drives a real Chromium with your sessions intact.
This repo replicates the spirit of Browser Use Cloud, but the agent is Anthropic's MCP-driven model running through BrowserOS — no separate per-token billing, no rented infra. You own the bytes.
┌─ your laptop ─────────┐ ┌─ host running Docker ────────────────────┐
│ Claude Desktop │ │ cbm-browseros-1 │
│ mcpServers: │ ─ stdio ─ npx mcp-remote ──► │ Xvfb + noVNC + Chromium-fork │
│ browseros-1: ──┐ │ http://host:9201 │ /data/1/profile ← cookies, history… │
│ browseros-2: ──┼──┼───────────► :9202 ────────► │ cbm-browseros-2 │
│ browseros-3: ──┘ │ :9203 ────────► │ cbm-browseros-3 │
│ │ │ restart: unless-stopped │
│ Web browser │ ─ live-view (noVNC) ──────► │ noVNC ports :6081, :6082, :6083 │
└───────────────────────┘ └──────────────────────────────────────────┘
| Slot | MCP port | noVNC | CDP | Profile dir |
|---|---|---|---|---|
| 1 | 9201 | http://localhost:6081/ | 9111 | ./data/1 |
| 2 | 9202 | http://localhost:6082/ | 9112 | ./data/2 |
| 3 | 9203 | http://localhost:6083/ | 9113 | ./data/3 |
Each slot is fully isolated: separate profile, separate logins, separate tabs. At the start of a Claude Desktop chat you say "for this conversation use browseros-2" and that chat is bound to slot 2.
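The table follows a simple numbering scheme, which a small helper can make explicit. This is a minimal sketch, not code from this repo: the Slot class and slot() function are hypothetical names, and the offsets (9200, 6080, 9110) are read off the three rows above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    number: int
    mcp_port: int
    novnc_port: int
    cdp_port: int
    profile_dir: str


def slot(n: int) -> Slot:
    """Derive the host-side ports and profile dir for slot n (1-based)."""
    if n < 1:
        raise ValueError("slots are numbered from 1")
    return Slot(
        number=n,
        mcp_port=9200 + n,    # 9201, 9202, 9203, ...
        novnc_port=6080 + n,  # 6081, 6082, 6083, ...
        cdp_port=9110 + n,    # 9111, 9112, 9113, ...
        profile_dir=f"./data/{n}",
    )


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(slot(n))
```

The same arithmetic gives the ports for any extra slot you add later.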
# Docker Desktop — https://docs.docker.com/desktop/install/mac-install/
# Node.js (for npx mcp-remote)
brew install node
# GitHub Desktop / git, plus Claude Desktop installed
git clone https://github.com/r-sayar/cloud-browser-mcp.git ~/cloud-browser-mcp
cd ~/cloud-browser-mcp
# Get it from https://github.com/browseros-ai/BrowserOS/releases (Linux .AppImage)
# ~280 MB; the repo's .gitignore excludes it so it stays out of git.
cp ~/Downloads/BrowserOS.AppImage .
docker compose up -d --build # ~3 min on first run (apt + AppImage extract)

Verify all three slots are healthy:
for p in 9201 9202 9203; do echo -n "$p: "; curl -s http://localhost:$p/health; echo; done
# → 9201: {"status":"ok","cdpConnected":true}
# → 9202: {"status":"ok","cdpConnected":true}
# → 9203: {"status":"ok","cdpConnected":true}

Edit ~/Library/Application Support/Claude/claude_desktop_config.json and add the mcpServers block (top level, alongside any preferences):
{
"mcpServers": {
"browseros-1": { "command": "npx", "args": ["-y", "mcp-remote", "http://localhost:9201/mcp"] },
"browseros-2": { "command": "npx", "args": ["-y", "mcp-remote", "http://localhost:9202/mcp"] },
"browseros-3": { "command": "npx", "args": ["-y", "mcp-remote", "http://localhost:9203/mcp"] }
}
}

Why mcp-remote? Claude Desktop's "Connectors" UI requires HTTPS, but mcp-remote runs as a stdio MCP subprocess that forwards to any HTTP URL, bypassing the HTTPS check while keeping the wire protocol identical.
Quit Claude Desktop fully with ⌘Q (closing the window is not enough) and reopen it. The three connectors will show up in the tool picker.
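If you change the number of slots often, the connector block can be generated instead of hand-edited. The sketch below is not part of this repo: build_mcp_servers and merge_into_config are hypothetical names, and the "preferences" value is only a stand-in for whatever your config file already contains. It assumes slot N serves MCP on port 9200 + N, as in the slot table.

```python
import json


def build_mcp_servers(n_slots: int, host: str = "localhost") -> dict:
    """One mcp-remote connector per slot; slot N is assumed to listen on 9200 + N."""
    return {
        f"browseros-{n}": {
            "command": "npx",
            "args": ["-y", "mcp-remote", f"http://{host}:{9200 + n}/mcp"],
        }
        for n in range(1, n_slots + 1)
    }


def merge_into_config(config: dict, servers: dict) -> dict:
    """Return a copy of config with servers added under mcpServers.

    Other top-level keys and any existing mcpServers entries are kept.
    """
    merged = dict(config)
    merged["mcpServers"] = {**config.get("mcpServers", {}), **servers}
    return merged


if __name__ == "__main__":
    existing = {"preferences": {"example": True}}  # stand-in for your current file
    print(json.dumps(merge_into_config(existing, build_mcp_servers(3)), indent=2))
```

Printing the result and pasting it by hand keeps the script from overwriting a config file it did not fully understand.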
Open the noVNC URL for whichever slot you want to set up — for example http://localhost:6081/ for slot 1 — in your laptop browser. You'll see the cloud Chromium desktop. Click into the page; keys + mouse work.
Inside that cloud browser:
- Sign into Google (or whichever sites you want the agent to act on).
- Sign into BrowserOS itself if it prompts (its own account for the Klavis 40+ external-service integrations).
Profile state is written to ./data/<slot>/profile/ on your host. It survives container restarts, repo backups, and so on. Each slot has its own profile: slot 1's logins are invisible to slot 2.
In any Claude Desktop chat, say:
"For this conversation, use browseros-2. Open my Gmail and tell me how many unread messages I have."
Claude will pick the slot-2 toolset, navigate gmail.com, take a snapshot, count. Open another chat in parallel; tell it "use browseros-3". Fully isolated, no interference.
The local stack no longer starts at login. The previous launchd autostart
(make-stable.sh, launchd-startup.sh, and the
com.cloud-agents.browseros.plist LaunchAgent) has been removed — the always-on
deployment is now the Hetzner host, not your laptop.
Bring the local stack up by hand only when you need it:
docker compose up -d

The Hetzner host's lifecycle, tunnels, and MCP wiring are documented in docs/hetzner-reference.md.
Edit docker-compose.yml. Each slot is one service block; add a fourth:
  browseros-4:
    <<: *browseros-defaults
    container_name: cbm-browseros-4
    depends_on: [browseros-1]
    ports: ["9204:9200", "9114:9011", "6084:6080"]
    volumes: ["./data/4:/data"]

Then add a fourth mcpServers entry in Claude Desktop config and ⌘Q + reopen.
Resource budget: ~600 MB RAM per idle browser, can spike to 2 GB. Three is comfortable on 8 GB; four+ wants 12 GB+.
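The sizing advice follows from multiplying the per-browser figures. A rough sketch of that arithmetic, using only the approximate numbers quoted above (the function name ram_budget_mb is hypothetical, and real usage depends on the pages you load):

```python
IDLE_MB_PER_SLOT = 600   # "~600 MB RAM per idle browser"
PEAK_MB_PER_SLOT = 2000  # "can spike to 2 GB"


def ram_budget_mb(n_slots: int) -> tuple[int, int]:
    """Return (idle, worst-case) RAM in MB if every slot spikes at once."""
    return n_slots * IDLE_MB_PER_SLOT, n_slots * PEAK_MB_PER_SLOT


if __name__ == "__main__":
    for n in (3, 4):
        idle, peak = ram_budget_mb(n)
        print(f"{n} slots: ~{idle / 1000:.1f} GB idle, up to ~{peak / 1000:.1f} GB at peak")
```

With three slots the worst case is about 6 GB, which leaves headroom on an 8 GB machine. With four it reaches about 8 GB before the OS and Docker take their share, hence the 12 GB+ suggestion.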
The single-machine local setup above works for personal use. If you want it on a real server, the cheapest paths in order:
- Self-host on any spare hardware (Pi 4+, NUC, Mac mini, old laptop) + Tailscale for remote access. Free; ~$1/mo electricity.
- Hetzner CPX31 (€16.49/mo, 4 vCPU AMD, 8 GB) + Tailscale. End-to-end walkthrough in cloud/SETUP.md, including a paste-ready cloud-init.yaml that bootstraps Docker + Tailscale + ufw, and a cloud/deploy.sh that rsyncs the repo + (optionally) your logged-in profiles, then runs docker compose up -d --build on the remote.
- Public HTTPS via Caddy in front of the stack, with bearer-token auth. Sketched in BROWSEROS.md. Only do this if you need collaborators on a different network than yours.
After option 1 or 2, point Claude Desktop's MCP URLs at the tailnet host:
http://<tailscale-ip>:9201/mcp etc.
Vanilla mcp__browseros-N__* exposes 60+ low-level browser primitives
(take_snapshot, click, fill). For sites you use a lot, you want a
high-level surface instead — gmail_compose(to, subject, body) rather
than "snapshot, find compose button, click, wait, snapshot, fill To, ...".
Fifteen reference site MCPs ship in this repo. Each is ~100-300 lines,
shares a tiny BrowserOS-HTTP client (mcp_lib/), and exposes a small
high-level tool surface:
| Site MCP | Tools | Notes |
|---|---|---|
| gmail_mcp/ | 7 | URL-driven compose; compose+send ~1.9s |
| claude_ai_mcp/ | 7 | Drive another claude.ai session |
| outlook_mcp/ | 7 | Office365 / generic Outlook Web |
| canvas_mcp/ | 6 | UC Davis Instructure LMS, read-only |
| fu_berlin_mcp/ | 6 | FU Berlin ZEDAT-Webmail (SquirrelMail), different from Outlook |
| amazon_mcp/ | 6 | Search, view, cart, orders, wishlist. NO place_order (financial) |
| youtube_mcp/ | 5 | Search, watch, transcript, subs, watchlater |
| notion_mcp/ | 5 | Search, recent, open, create, append |
| linkedin_mcp/ | 4 | Search people, view profile, list/send messages |
| luma_mcp/ | 3 | List upcoming, view event, RSVP |
| wikipedia_mcp/ | 2 | Public, no auth |
| pubmed_mcp/ | 2 | Public, NCBI |
| calendly_mcp/ | 2 | List event types + scheduled meetings |
| skyscanner_mcp/ | 2 | URL-driven flight search. NO book_flight (financial) |
| zoom_mcp/ | 2 | List upcoming meetings + recordings |
Each tool is a cached deterministic recipe that calls BrowserOS once with a hardcoded JS or URL payload — no runtime LLM reasoning, no per-call selector discovery, no large snapshots in the agent context.
Every tool also returns a next_actions array hinting at legal follow-ups,
so the agent doesn't have to re-derive the affordance graph after each call.
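To make the "hardcoded URL payload plus next_actions" idea concrete, here is a minimal sketch of what a URL-driven compose recipe could look like. It is not the code in gmail_mcp/: the function names, the result shape, and the tool names listed in next_actions are all hypothetical. The view=cm query string is Gmail's commonly used compose-link pattern; that this repo builds its URL the same way is an assumption.

```python
from urllib.parse import quote, urlencode


def compose_url(to: str, subject: str, body: str) -> str:
    """Build a Gmail compose link with the fields pre-filled and percent-encoded."""
    params = {"view": "cm", "fs": "1", "to": to, "su": subject, "body": body}
    return "https://mail.google.com/mail/?" + urlencode(params, quote_via=quote)


def compose_result(to: str, subject: str, body: str) -> dict:
    """Shape of a tool response: a small payload plus hints for legal follow-ups."""
    return {
        "ok": True,
        "url": compose_url(to, subject, body),
        # Hypothetical follow-up tool names; the real servers define their own.
        "next_actions": ["gmail_send", "gmail_discard_draft"],
    }


if __name__ == "__main__":
    print(compose_result("someone@example.com", "Hi there", "Lunch & coffee?"))
```

Because the recipe is a pure function of its arguments, the agent never needs a page snapshot to find the compose form.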
One shared venv at the repo root drives all four servers:
python3 -m venv .venv
.venv/bin/pip install -r requirements.txt

Then add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"gmail": { "command": "/abs/path/cloud-browser-mcp/.venv/bin/python", "args": ["/abs/path/cloud-browser-mcp/gmail_mcp/server.py"], "env": {"BROWSEROS_URL": "http://localhost:9201/mcp"} },
"claude-ai": { "command": "/abs/path/cloud-browser-mcp/.venv/bin/python", "args": ["/abs/path/cloud-browser-mcp/claude_ai_mcp/server.py"], "env": {"BROWSEROS_URL": "http://localhost:9201/mcp"} },
"outlook": { "command": "/abs/path/cloud-browser-mcp/.venv/bin/python", "args": ["/abs/path/cloud-browser-mcp/outlook_mcp/server.py"], "env": {"BROWSEROS_URL": "http://localhost:9201/mcp"} },
"canvas": { "command": "/abs/path/cloud-browser-mcp/.venv/bin/python", "args": ["/abs/path/cloud-browser-mcp/canvas_mcp/server.py"], "env": {"BROWSEROS_URL": "http://localhost:9201/mcp"} }
}
}

Sign into each site once via http://localhost:6081/, ⌘Q + reopen Claude Desktop, and the tools appear. Try "Using the gmail MCP, find the most recent email from my professor and summarize it."
| Path | Wall-clock | Tool calls |
|---|---|---|
| Vanilla mcp__browseros-1__* (snapshot+click) | ~110 s | 6 (incl. ~8K-token snapshot) |
| gmail_compose v1 (Compose-button recipe) | ~27 s | 1 |
| gmail_compose current (URL-driven + Meta+Enter) | ~1.9 s | 1 |
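For a sense of scale, the ratios between those rows can be worked out directly. The timings are the approximate wall-clock figures from the table, so the speedups are rough too; the speedup helper is just illustrative.

```python
# Approximate wall-clock seconds, copied from the table above.
TIMINGS_S = {
    "vanilla snapshot+click": 110.0,
    "gmail_compose v1": 27.0,
    "gmail_compose current": 1.9,
}


def speedup(baseline_s: float, current_s: float) -> float:
    """How many times faster current is than baseline."""
    return baseline_s / current_s


if __name__ == "__main__":
    current = TIMINGS_S["gmail_compose current"]
    for name, seconds in TIMINGS_S.items():
        print(f"{name}: ~{speedup(seconds, current):.0f}x the current recipe's time")
```

That works out to roughly 58x against the vanilla path and about 14x against the v1 recipe, while tool calls drop from six to one.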
Read docs/PROTOCOL.md — the 7-step recipe for taking a
site from "open in cloud browser" to "working high-level MCP" in about 90
minutes. The shared client in mcp_lib/ makes new servers ~150 lines.
dashboard.html is a self-contained dark-mode page that embeds all three slots' noVNC views side by side, with a health indicator per slot. Two ways to run it:
# Plain static (no container control — just embeds noVNC):
python3 -m http.server 5173
# With container start/stop control (▶ / ⏻ buttons per tile):
python3 scripts/dashboard_server.py --port 5173

Then open http://localhost:5173/.
The control-plane server binds to 127.0.0.1 only and shells out to
docker compose for start/stop. Status polls every 5 s; tooltips show the
underlying state (running / stopped / missing / docker-daemon error).
(VS Code / Cursor users: hit F5 — .claude/launch.json runs the
control-plane server.)
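A per-slot health indicator could be derived from the /health response shown in the verification step earlier in this README. The sketch below is an assumption about how one might classify that payload, not a description of what dashboard.html actually does; slot_health and the three labels it returns are hypothetical.

```python
import json


def slot_health(body: str) -> str:
    """Classify a /health response body as 'ok', 'degraded', or 'down'."""
    try:
        payload = json.loads(body)
    except (json.JSONDecodeError, TypeError):
        return "down"  # empty, truncated, or non-JSON response
    if not isinstance(payload, dict) or payload.get("status") != "ok":
        return "down"
    # The server is up, but the browser may not be attached over CDP yet.
    return "ok" if payload.get("cdpConnected") is True else "degraded"


if __name__ == "__main__":
    print(slot_health('{"status":"ok","cdpConnected":true}'))
```

Treating "server up but CDP not connected" as its own state helps tell a booting browser apart from a stopped container.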
scripts/import_cookies.py reads cookies from your laptop's Chrome and pushes
them into a slot's browser via CDP. Useful for sites that don't fingerprint-bind
cookies (banks/Google/Apple are blacklisted by default).
pip install -r scripts/requirements.txt
docker cp scripts/import_cookies.py cbm-browseros-1:/tmp/
docker compose exec browseros-1 pip install browser-cookie3 websockets requests --quiet
docker compose exec browseros-1 python3 /tmp/import_cookies.py twitter.com github.com

CDP from the host hits a Chromium Host-header check that we'd need an HTTP-aware proxy to fix; running the script from inside the container sidesteps it. The MCP path is unaffected.
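The default blacklist mentioned above amounts to a domain filter applied before any cookies are pushed. A minimal sketch of such a filter follows. It is not taken from scripts/import_cookies.py: the blocklist entries and the function names are hypothetical, chosen only to illustrate suffix matching on domains.

```python
# Hypothetical entries; the real script ships its own default list.
DEFAULT_BLOCKLIST = ("google.com", "apple.com")


def is_blocked(domain: str, blocklist=DEFAULT_BLOCKLIST) -> bool:
    """True if domain equals a blocked domain or is a subdomain of one."""
    host = domain.lower().lstrip(".")  # cookie domains often start with a dot
    return any(host == b or host.endswith("." + b) for b in blocklist)


def importable(domains, blocklist=DEFAULT_BLOCKLIST) -> list:
    """Keep only the domains whose cookies are allowed to be imported."""
    return [d for d in domains if not is_blocked(d, blocklist)]


if __name__ == "__main__":
    print(importable(["twitter.com", "github.com", "accounts.google.com"]))
```

Matching on a leading-dot suffix rather than a plain substring keeps unrelated names such as notgoogle.com from being caught.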
docker compose down
./scripts/backup.sh # → profile-<timestamp>.tgz
./scripts/restore.sh profile-…tgz # restores to ./data
docker compose up -d

Two categories of hard problems this project is actively working through.
- Task completion — agents reliably finishing any web job end-to-end. Baseline working.
- Failure → playbook — when an agent can't complete a task, ask for help once, record those steps, never ask again. Just started.
- Token efficiency — completing tasks with minimal context so they're cheap enough to run routinely. Ongoing.
- Speed threshold — being fast enough that you'd use the agent instead of just doing it yourself. No benchmark yet — what latency actually crosses that threshold probably needs measurement before this is a real goal.
- Goal persistence — agents that keep running until a goal is achieved, surviving errors, restarts, and ambiguity.
- Steerability — injecting new instructions or corrections into a running agent without restarting it.
- Monitorability — knowing what a running agent is doing, has done, and is about to do, at a glance.
Issues and PRs welcome. The wrapper code is small and intentionally unopinionated — most improvements should land cleanly. If you're adding a new feature, please update both the README and at least one of the existing smoke tests so reviewers can confirm the shape of the change quickly.
.
├── Dockerfile # Debian + Xvfb + noVNC + extracted AppImage
├── entrypoint.sh # Xvfb → x11vnc → noVNC → socat → BrowserOS
├── docker-compose.yml # 3 services with YAML anchor for shared defaults
├── BrowserOS.AppImage # gitignored — drop in from BrowserOS releases
├── data/{1,2,3}/profile/ # Persistent BrowserOS profiles (cookies etc.)
├── dashboard.html # Self-contained 3-up noVNC dashboard
├── gmail_mcp/ # Reference site-as-MCP (7 Gmail tools)
│ ├── server.py
│ ├── requirements.txt
│ └── .venv/ # gitignored — created by `python -m venv .venv`
├── docs/
│ └── PROTOCOL.md # 7-step recipe for wrapping any site as MCP
├── cloud/ # ★ primary deployment (Hetzner)
│ ├── SETUP.md # Hetzner CPX31 + Tailscale walkthrough
│ ├── cloud-init.yaml # Server bootstrap (Docker + Tailscale + ufw)
│ └── deploy.sh # Rsync repo to remote + compose up
├── scripts/
│ ├── import_cookies.py # cookie fast-path (run inside container)
│ ├── backup.sh / restore.sh # tar/untar of ./data
│ └── requirements.txt
├── BROWSEROS.md # multi-tenant cloud-deploy design
├── SECURITY.md # threat model + 4 deployment patterns
├── LICENSE # MIT (wrapper code only; BrowserOS has its own)
├── legacy/steel/ # earlier Steel + Chromium prototype, kept for reference
└── README.md
The MCP and noVNC services in this repo ship with no authentication. That
is fine for localhost on your laptop — and dangerous the moment any port is
reachable from outside your machine.
Read SECURITY.md before you put this on anything other than
your own laptop. Short version: lock to 127.0.0.1 only, use Tailscale for
remote access, or put Caddy + bearer-token auth in front for a public URL.
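"Lock to 127.0.0.1 only" comes down to how each Compose port mapping is written: the short form HOST:CONTAINER publishes on every interface, while IP:HOST:CONTAINER restricts it to that address. The helper below is an illustrative sketch for auditing or rewriting such strings. The function names are hypothetical, and it handles only the IPv4 short syntax.

```python
def binds_loopback_only(port_spec: str) -> bool:
    """True for mappings like '127.0.0.1:9201:9200' (optionally with /tcp or /udp)."""
    parts = port_spec.split("/", 1)[0].split(":")
    return len(parts) == 3 and parts[0] == "127.0.0.1"


def lock_to_loopback(port_spec: str) -> str:
    """Prefix a 'HOST:CONTAINER' mapping with 127.0.0.1; leave locked ones alone."""
    if binds_loopback_only(port_spec):
        return port_spec
    if len(port_spec.split(":")) != 2:
        raise ValueError(f"unsupported port mapping: {port_spec!r}")
    return "127.0.0.1:" + port_spec


if __name__ == "__main__":
    for spec in ["9201:9200", "9111:9011", "6081:6080"]:
        print(spec, "->", lock_to_loopback(spec))
```

SECURITY.md remains the authoritative guide; this only shows the shape of the change in docker-compose.yml.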
- BrowserOS Max plan. Some BrowserOS features (the LLM-powered ones it ships with) need a BrowserOS account; the MCP works without one.
- CDP from host is broken (Chromium Host-header check via socat). Workaround: docker compose exec browseros-N. Doesn't affect MCP.
- `--no-sandbox` in container. Required because Chromium's namespace sandbox needs userns capabilities Docker rarely exposes. Standard practice for containerized Chromium; don't expose the container to untrusted input.
- Resource use. Three idle browsers ~1.8 GB; spike to 5–6 GB under load. shm_size: 2gb per service is sized for that.
The wrapper code in this repo is MIT. The bundled BrowserOS binary is governed by BrowserOS's own license (AGPLv3 + Ungoogled Chromium BSD).