cloud-browser-mcp

⚠ The primary deployment is now the Hetzner cloud host. A single always-on BrowserOS container runs on a Hetzner VPS, reachable from Claude Desktop over autossh tunnels. See cloud/SETUP.md and docs/hetzner-reference.md to set it up or operate it.

The local 3-container Mac setup documented below is legacy — kept for reference and one-off local use. It no longer auto-starts at login (the make-stable.sh / launchd autostart machinery was removed); bring it up manually with docker compose up -d when you actually want it.

Use a real browser remotely, from your local Claude Desktop, over MCP. Self-hosted: N independent BrowserOS instances in Docker, each exposing its built-in MCP server, all wired into Claude Desktop as separate connectors. You log into your accounts once via a browser-based live-view, profile state persists, and Claude drives a real Chromium with your sessions intact.

This repo replicates the spirit of Browser Use Cloud, but the agent is Anthropic's MCP-driven model running through BrowserOS — no separate per-token billing, no rented infra. You own the bytes.

Architecture

┌─ your laptop ─────────┐                              ┌─ host running Docker ────────────────────┐
│  Claude Desktop       │                              │  cbm-browseros-1                         │
│   mcpServers:         │ ─ stdio ─ npx mcp-remote ──► │   Xvfb + noVNC + Chromium-fork           │
│     browseros-1: ──┐  │           http://host:9201   │   /data/1/profile  ← cookies, history…   │
│     browseros-2: ──┼──┼───────────► :9202 ─────────► │  cbm-browseros-2                         │
│     browseros-3: ──┘  │             :9203 ─────────► │  cbm-browseros-3                         │
│                       │                              │   restart: unless-stopped                │
│  Web browser          │ ─ live-view (noVNC) ───────► │   noVNC ports :6081, :6082, :6083        │
└───────────────────────┘                              └──────────────────────────────────────────┘

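| Slot | MCP port | noVNC URL | CDP port | Profile dir |
|------|----------|-----------|----------|-------------|
| 1 | 9201 | http://localhost:6081/ | 9111 | `./data/1` |
| 2 | 9202 | http://localhost:6082/ | 9112 | `./data/2` |
| 3 | 9203 | http://localhost:6083/ | 9113 | `./data/3` |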

Each slot is fully isolated: separate profile, separate logins, separate tabs. At the start of a Claude Desktop chat you say "for this conversation use browseros-2" and that chat is bound to slot 2.
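The port table follows a regular pattern: slot N uses MCP port 9200+N, noVNC port 6080+N, and CDP port 9110+N. As a minimal sketch of that pattern (the `SlotEndpoints` type and `slot_endpoints` helper are hypothetical names for illustration, not part of this repo):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotEndpoints:
    slot: int
    mcp_url: str
    novnc_url: str
    cdp_port: int
    profile_dir: str


def slot_endpoints(slot: int, host: str = "localhost") -> SlotEndpoints:
    """Derive a slot's endpoints from the numbering pattern in the port table."""
    if slot < 1:
        raise ValueError("slots are numbered from 1")
    return SlotEndpoints(
        slot=slot,
        mcp_url=f"http://{host}:{9200 + slot}/mcp",
        novnc_url=f"http://{host}:{6080 + slot}/",
        cdp_port=9110 + slot,
        profile_dir=f"./data/{slot}",
    )


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(slot_endpoints(n))
```

The same arithmetic applies when a fourth or fifth slot is added, provided the compose file keeps the port scheme shown in the table.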

Full setup (Mac, ~15 minutes)

1. Prerequisites

# Docker Desktop — https://docs.docker.com/desktop/install/mac-install/
# Node.js (for npx mcp-remote)
brew install node
# GitHub Desktop / git, plus Claude Desktop installed

2. Clone the repo

git clone https://github.com/r-sayar/cloud-browser-mcp.git ~/cloud-browser-mcp
cd ~/cloud-browser-mcp

3. Drop in the BrowserOS Linux AppImage

# Get it from https://github.com/browseros-ai/BrowserOS/releases (Linux .AppImage)
# ~280 MB; the repo's .gitignore excludes it so it stays out of git.
cp ~/Downloads/BrowserOS.AppImage .

4. Bring up the stack

docker compose up -d --build       # ~3 min on first run (apt + AppImage extract)

Verify all three slots are healthy:

for p in 9201 9202 9203; do echo -n "$p: "; curl -s http://localhost:$p/health; echo; done
# → 9201: {"status":"ok","cdpConnected":true}
# → 9202: {"status":"ok","cdpConnected":true}
# → 9203: {"status":"ok","cdpConnected":true}
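If you would rather script the check, the same probe can be sketched in Python using only the standard library. It assumes the `/health` response has the shape shown above (`status` and `cdpConnected`); the function names are illustrative.

```python
import http.client
import json
import urllib.request


def is_healthy(body: str) -> bool:
    """True when a /health body reports status "ok" and a connected CDP link."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    return payload.get("status") == "ok" and payload.get("cdpConnected") is True


def check_slot(port: int, host: str = "localhost", timeout: float = 3.0) -> bool:
    """Fetch http://host:port/health; connection errors and timeouts count as unhealthy."""
    url = f"http://{host}:{port}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_healthy(resp.read().decode("utf-8", errors="replace"))
    except (OSError, http.client.HTTPException):
        return False


if __name__ == "__main__":
    for port in (9201, 9202, 9203):
        print(f"{port}: {'ok' if check_slot(port) else 'DOWN'}")
```

Treating `cdpConnected: false` as unhealthy is a design choice in this sketch: the MCP endpoint may answer while the browser behind it is not yet attached.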

5. Wire the connectors into Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json — add the mcpServers block (top level, alongside any preferences):

{
  "mcpServers": {
    "browseros-1": { "command": "npx", "args": ["-y", "mcp-remote", "http://localhost:9201/mcp"] },
    "browseros-2": { "command": "npx", "args": ["-y", "mcp-remote", "http://localhost:9202/mcp"] },
    "browseros-3": { "command": "npx", "args": ["-y", "mcp-remote", "http://localhost:9203/mcp"] }
  }
}

Why mcp-remote? Claude Desktop's "Connectors" UI requires HTTPS, but mcp-remote runs as a stdio MCP subprocess that forwards to any HTTP URL — bypassing the HTTPS check while keeping the wire protocol identical.

Quit Claude Desktop fully with ⌘Q (closing the window is not enough) and reopen it. The three connectors will show up in the tool picker.
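The config edit from this step can also be generated rather than typed by hand. The following is a sketch, not a tool shipped in this repo: `browseros_servers` and `merge_into_config` are hypothetical helpers, and the script prints the merged JSON instead of writing it so that an existing config is never overwritten by accident.

```python
import json
from pathlib import Path


def browseros_servers(slots: int, host: str = "localhost") -> dict:
    """Build one mcp-remote connector entry per slot (MCP port = 9200 + slot)."""
    return {
        f"browseros-{n}": {
            "command": "npx",
            "args": ["-y", "mcp-remote", f"http://{host}:{9200 + n}/mcp"],
        }
        for n in range(1, slots + 1)
    }


def merge_into_config(config: dict, servers: dict) -> dict:
    """Return a copy of config with servers added under mcpServers; other keys are kept."""
    merged = dict(config)
    merged["mcpServers"] = {**config.get("mcpServers", {}), **servers}
    return merged


if __name__ == "__main__":
    path = Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"
    current = json.loads(path.read_text()) if path.exists() else {}
    print(json.dumps(merge_into_config(current, browseros_servers(3)), indent=2))
```

Review the printed output, then paste it into the config file yourself; existing entries such as preferences or other MCP servers are carried over unchanged.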

6. Offline auth (per slot, one-time)

Open the noVNC URL for whichever slot you want to set up — for example http://localhost:6081/ for slot 1 — in your laptop browser. You'll see the cloud Chromium desktop. Click into the page; keys + mouse work.

Inside that cloud browser:

Sign into Google (or whichever sites you want the agent to act on).
Sign into BrowserOS itself if it prompts (its own account for the Klavis 40+ external-service integrations).

Profile state is written to ./data/<slot>/profile/ on your host, so it survives container restarts, repo backups, and the like. Each slot has its own profile: slot 1's logins are invisible to slot 2.

7. Use it

In any Claude Desktop chat, say:

"For this conversation, use browseros-2. Open my Gmail and tell me how many unread messages I have."

Claude will pick the slot-2 toolset, navigate to gmail.com, take a snapshot, and count the unread messages. To run a second task in parallel, open another chat and tell it "use browseros-3". The two chats are fully isolated and do not interfere with each other.

Auto-start (removed)

The local stack no longer starts at login. The previous launchd autostart (make-stable.sh, launchd-startup.sh, and the com.cloud-agents.browseros.plist LaunchAgent) has been removed — the always-on deployment is now the Hetzner host, not your laptop.

Bring the local stack up by hand only when you need it:

docker compose up -d

The Hetzner host's lifecycle, tunnels, and MCP wiring are documented in docs/hetzner-reference.md.

Adding more browsers

Edit docker-compose.yml. Each slot is one service block; add a fourth:

  browseros-4:
    <<: *browseros-defaults
    container_name: cbm-browseros-4
    depends_on: [browseros-1]
    ports: ["9204:9200", "9114:9011", "6084:6080"]
    volumes: ["./data/4:/data"]

Then add a fourth mcpServers entry in Claude Desktop config and ⌘Q + reopen.
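For more than one extra slot, the service block can be rendered from the slot number. This sketch simply reproduces the `browseros-4` example above with the numbers substituted; `compose_service` is a hypothetical helper, and it assumes every additional slot follows the same port scheme and depends on `browseros-1`.

```python
def compose_service(slot: int) -> str:
    """Render a docker-compose service block for a slot, mirroring the browseros-4 example."""
    if slot < 2:
        # browseros-1 is the service the others depend on, so it is not generated here.
        raise ValueError("generate blocks for slot 2 and up")
    lines = [
        f"  browseros-{slot}:",
        "    <<: *browseros-defaults",
        f"    container_name: cbm-browseros-{slot}",
        "    depends_on: [browseros-1]",
        f'    ports: ["{9200 + slot}:9200", "{9110 + slot}:9011", "{6080 + slot}:6080"]',
        f'    volumes: ["./data/{slot}:/data"]',
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(compose_service(4))
    print(compose_service(5))
```

Paste the output under `services:` in docker-compose.yml, keeping the two-space indentation.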

Resource budget: ~600 MB RAM per idle browser, can spike to 2 GB. Three is comfortable on 8 GB; four+ wants 12 GB+.
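The budget is simple multiplication, which makes it easy to check a planned slot count against the host's RAM. A small sketch using the two per-browser figures quoted above (rough estimates, and the worst case assumes every browser spikes at once):

```python
IDLE_GB_PER_BROWSER = 0.6  # "~600 MB RAM per idle browser"
PEAK_GB_PER_BROWSER = 2.0  # "can spike to 2 GB"


def ram_budget_gb(browsers: int) -> tuple[float, float]:
    """Return (idle, worst-case peak) RAM in GB for the given number of browsers."""
    return browsers * IDLE_GB_PER_BROWSER, browsers * PEAK_GB_PER_BROWSER


if __name__ == "__main__":
    for n in (3, 4, 5):
        idle, peak = ram_budget_gb(n)
        print(f"{n} browsers: ~{idle:.1f} GB idle, up to ~{peak:.1f} GB under load")
```

Leave headroom for the host OS and Docker itself on top of these numbers.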

Running on a remote host (optional)

The single-machine local setup above works for personal use. If you want it on a real server, the cheapest paths in order:

1. Self-host on any spare hardware (Pi 4+, NUC, Mac mini, old laptop) + Tailscale for remote access. Free; ~$1/mo electricity.
2. Hetzner CPX31 (€16.49/mo, 4 vCPU AMD, 8 GB) + Tailscale. End-to-end walkthrough in cloud/SETUP.md, including a paste-ready cloud-init.yaml that bootstraps Docker + Tailscale + ufw, and a cloud/deploy.sh that rsyncs the repo + (optionally) your logged-in profiles, then runs docker compose up -d --build on the remote.
3. Public HTTPS via Caddy in front of the stack, with bearer-token auth. Sketched in BROWSEROS.md. Only do this if you need collaborators on a different network than yours.

After option 1 or 2, point Claude Desktop's MCP URLs at the tailnet host: http://<tailscale-ip>:9201/mcp etc.

Wrapping a site as a high-level MCP

Vanilla mcp__browseros-N__* exposes 60+ low-level browser primitives (take_snapshot, click, fill). For sites you use a lot, you want a high-level surface instead — gmail_compose(to, subject, body) rather than "snapshot, find compose button, click, wait, snapshot, fill To, ...".

Fifteen reference site MCPs ship in this repo. Each is ~100-300 lines, shares a tiny BrowserOS-HTTP client (mcp_lib/), and exposes a small high-level tool surface:

| Site MCP | Tools | Notes |
|----------|-------|-------|
| `gmail_mcp/` | 7 | URL-driven compose; `compose+send` ~1.9 s |
| `claude_ai_mcp/` | 7 | Drive another claude.ai session |
| `outlook_mcp/` | 7 | Office365 / generic Outlook Web |
| `canvas_mcp/` | 6 | UC Davis Instructure LMS, read-only |
| `fu_berlin_mcp/` | 6 | FU Berlin ZEDAT-Webmail (SquirrelMail); different from Outlook |
| `amazon_mcp/` | 6 | Search, view, cart, orders, wishlist. NO `place_order` (financial) |
| `youtube_mcp/` | 5 | Search, watch, transcript, subs, watchlater |
| `notion_mcp/` | 5 | Search, recent, open, create, append |
| `linkedin_mcp/` | 4 | Search people, view profile, list/send messages |
| `luma_mcp/` | 3 | List upcoming, view event, RSVP |
| `wikipedia_mcp/` | 2 | Public, no auth |
| `pubmed_mcp/` | 2 | Public, NCBI |
| `calendly_mcp/` | 2 | List event types + scheduled meetings |
| `skyscanner_mcp/` | 2 | URL-driven flight search. NO `book_flight` (financial) |
| `zoom_mcp/` | 2 | List upcoming meetings + recordings |

Each tool is a cached deterministic recipe that calls BrowserOS once with a hardcoded JS or URL payload — no runtime LLM reasoning, no per-call selector discovery, no large snapshots in the agent context.

Every tool also returns a next_actions array hinting at legal follow-ups, so the agent doesn't have to re-derive the affordance graph after each call.
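To make the "URL-driven recipe plus next_actions" idea concrete, here is a rough sketch. It is not the repo's implementation: it assumes the recipe opens Gmail's commonly used pre-filled compose link (`view=cm`), and the result fields other than `next_actions`, as well as the follow-up tool names, are invented for illustration.

```python
from urllib.parse import urlencode


def gmail_compose_url(to: str, subject: str, body: str) -> str:
    """Build a Gmail URL that opens a pre-filled compose window."""
    query = urlencode({"view": "cm", "fs": "1", "to": to, "su": subject, "body": body})
    return f"https://mail.google.com/mail/?{query}"


def compose_result(to: str, subject: str, body: str) -> dict:
    """Hypothetical tool result: the deterministic payload plus follow-up hints."""
    return {
        "ok": True,
        "url": gmail_compose_url(to, subject, body),
        # Illustrative names only; the real tool list lives in gmail_mcp/server.py.
        "next_actions": ["gmail_send_draft", "gmail_discard_draft"],
    }


if __name__ == "__main__":
    print(compose_result("a@example.com", "Hi there", "See you at 5"))
```

The point of the pattern is that the URL is computed without any snapshot or selector lookup, so the agent spends one tool call and almost no context on it.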

Wire them in

One shared venv at the repo root drives every site MCP server (the config below wires in four of them):

python3 -m venv .venv
.venv/bin/pip install -r requirements.txt

Then add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "gmail":     { "command": "/abs/path/cloud-browser-mcp/.venv/bin/python", "args": ["/abs/path/cloud-browser-mcp/gmail_mcp/server.py"],     "env": {"BROWSEROS_URL": "http://localhost:9201/mcp"} },
    "claude-ai": { "command": "/abs/path/cloud-browser-mcp/.venv/bin/python", "args": ["/abs/path/cloud-browser-mcp/claude_ai_mcp/server.py"], "env": {"BROWSEROS_URL": "http://localhost:9201/mcp"} },
    "outlook":   { "command": "/abs/path/cloud-browser-mcp/.venv/bin/python", "args": ["/abs/path/cloud-browser-mcp/outlook_mcp/server.py"],   "env": {"BROWSEROS_URL": "http://localhost:9201/mcp"} },
    "canvas":    { "command": "/abs/path/cloud-browser-mcp/.venv/bin/python", "args": ["/abs/path/cloud-browser-mcp/canvas_mcp/server.py"],    "env": {"BROWSEROS_URL": "http://localhost:9201/mcp"} }
  }
}

Sign into each site once via http://localhost:6081/, ⌘Q + reopen Claude Desktop, and the tools appear. Try "Using the gmail MCP, find the most recent email from my professor and summarize it."

Speed (Gmail compose, send=True)

| Path | Wall-clock | Tool calls |
|------|------------|------------|
| Vanilla `mcp__browseros-1__*` (snapshot+click) | ~110 s | 6 (incl. ~8K-token snapshot) |
| `gmail_compose` v1 (Compose-button recipe) | ~27 s | 1 |
| `gmail_compose` current (URL-driven + Meta+Enter) | ~1.9 s | 1 |

Build one for any site

Read docs/PROTOCOL.md — the 7-step recipe for taking a site from "open in cloud browser" to "working high-level MCP" in about 90 minutes. The shared client in mcp_lib/ makes new servers ~150 lines.

Browser dashboard

dashboard.html is a self-contained dark-mode page that embeds all three slots' noVNC views side by side, with a health indicator per slot. Two ways to run it:

# Plain static (no container control — just embeds noVNC):
python3 -m http.server 5173

# With container start/stop control (▶ / ⏻ buttons per tile):
python3 scripts/dashboard_server.py --port 5173

Then open http://localhost:5173/.

The control-plane server binds to 127.0.0.1 only and shells out to docker compose for start/stop. Status polls every 5 s; tooltips show the underlying state (running / stopped / missing / docker-daemon error).

(VS Code / Cursor users: hit F5 — .claude/launch.json runs the control-plane server.)
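The four tooltip states can be derived from a single Docker CLI call per container. The sketch below is one possible way to do it, not a copy of scripts/dashboard_server.py: it assumes `docker inspect --format '{{.State.Status}}'` as the probe and matches on typical CLI error text, which may differ between Docker versions.

```python
import subprocess


def classify(returncode: int, stdout: str, stderr: str) -> str:
    """Map a `docker inspect` result onto running / stopped / missing / docker-daemon error."""
    if returncode == 0:
        return "running" if stdout.strip() == "running" else "stopped"
    if "no such" in stderr.lower():
        return "missing"
    return "docker-daemon error"


def slot_status(slot: int) -> str:
    """Ask Docker for the state of the container named cbm-browseros-<slot>."""
    cmd = ["docker", "inspect", "--format", "{{.State.Status}}", f"cbm-browseros-{slot}"]
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return "docker-daemon error"  # docker CLI not installed, or the call hung
    return classify(proc.returncode, proc.stdout, proc.stderr)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"slot {n}: {slot_status(n)}")
```

Keeping `classify` separate from the subprocess call makes the state mapping testable without a Docker daemon.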

Cookie-import fast-path (optional)

scripts/import_cookies.py reads cookies from your laptop's Chrome and pushes them into a slot's browser via CDP. Useful for sites that don't fingerprint-bind cookies (banks/Google/Apple are blacklisted by default).

pip install -r scripts/requirements.txt
docker cp scripts/import_cookies.py cbm-browseros-1:/tmp/
docker compose exec browseros-1 pip install browser-cookie3 websockets requests --quiet
docker compose exec browseros-1 python3 /tmp/import_cookies.py twitter.com github.com

CDP from the host hits a Chromium Host-header check that we'd need an HTTP-aware proxy to fix; running the script from inside the container sidesteps it. MCP path is unaffected.
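The default blocklist behaviour can be pictured as a suffix match on the requested domains. This is a sketch under assumptions: the entries in `EXAMPLE_BLOCKLIST` are placeholders, the real list and matching rules live in scripts/import_cookies.py, and both helper names are hypothetical.

```python
# Illustrative entries only; the script's actual default list may differ.
EXAMPLE_BLOCKLIST = ("google.com", "apple.com")


def is_blocked(domain: str, blocklist=EXAMPLE_BLOCKLIST) -> bool:
    """True if domain equals a blocked entry or is a subdomain of one."""
    host = domain.lower().lstrip(".")
    return any(host == entry or host.endswith("." + entry) for entry in blocklist)


def importable(domains, blocklist=EXAMPLE_BLOCKLIST):
    """Keep only the domains whose cookies would be imported."""
    return [d for d in domains if not is_blocked(d, blocklist)]


if __name__ == "__main__":
    print(importable(["twitter.com", "accounts.google.com", "github.com"]))
```

Matching on a leading dot plus the entry, rather than a bare `endswith`, avoids blocking unrelated domains that merely share a suffix.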

Profile backup / restore

docker compose down
./scripts/backup.sh                          # → profile-<timestamp>.tgz
./scripts/restore.sh profile-…tgz            # restores to ./data
docker compose up -d
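For readers who want to see what such a backup amounts to, here is a Python sketch of the same idea: tar the `./data` directory into a timestamped archive and unpack it again. It is an approximation written for illustration; the actual scripts/backup.sh and scripts/restore.sh may name, compress, or lay out the archive differently.

```python
import tarfile
import time
from pathlib import Path


def backup_profiles(data_dir: str = "./data", out_dir: str = ".") -> Path:
    """Pack data_dir into out_dir/profile-<timestamp>.tgz and return the archive path."""
    src = Path(data_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"{src} is not a directory")
    archive = Path(out_dir) / f"profile-{time.strftime('%Y%m%d-%H%M%S')}.tgz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src, arcname="data")  # members are stored under a top-level data/ folder
    return archive


def restore_profiles(archive: str, repo_root: str = ".") -> None:
    """Unpack an archive made by backup_profiles so that repo_root/data is recreated."""
    # Use the safer "data" extraction filter where this Python version supports it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(repo_root, **kwargs)
```

As with the shell scripts, stop the stack first so the browser is not writing to the profile while it is being archived.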

Open problems

Two categories of hard problems this project is actively working through.

Web agents

Task completion — agents reliably finishing any web job end-to-end. Baseline working.
Failure → playbook — when an agent can't complete a task, ask for help once, record those steps, never ask again. Just started.
Token efficiency — completing tasks with minimal context so they're cheap enough to run routinely. Ongoing.
Speed threshold — being fast enough that you'd use the agent instead of just doing it yourself. No benchmark yet — what latency actually crosses that threshold probably needs measurement before this is a real goal.

Persistent agents

Goal persistence — agents that keep running until a goal is achieved, surviving errors, restarts, and ambiguity.
Steerability — injecting new instructions or corrections into a running agent without restarting it.
Monitorability — knowing what a running agent is doing, has done, and is about to do, at a glance.

Contributing

Issues and PRs welcome. The wrapper code is small and intentionally unopinionated — most improvements should land cleanly. If you're adding a new feature, please update both the README and at least one of the existing smoke tests so reviewers can confirm the shape of the change quickly.

Layout

.
├── Dockerfile                  # Debian + Xvfb + noVNC + extracted AppImage
├── entrypoint.sh               # Xvfb → x11vnc → noVNC → socat → BrowserOS
├── docker-compose.yml          # 3 services with YAML anchor for shared defaults
├── BrowserOS.AppImage          # gitignored — drop in from BrowserOS releases
├── data/{1,2,3}/profile/       # Persistent BrowserOS profiles (cookies etc.)
├── dashboard.html              # Self-contained 3-up noVNC dashboard
├── gmail_mcp/                  # Reference site-as-MCP (7 Gmail tools)
│   ├── server.py
│   ├── requirements.txt
│   └── .venv/                  # gitignored — created by `python -m venv .venv`
├── docs/
│   └── PROTOCOL.md             # 7-step recipe for wrapping any site as MCP
├── cloud/                      # ★ primary deployment (Hetzner)
│   ├── SETUP.md                # Hetzner CPX31 + Tailscale walkthrough
│   ├── cloud-init.yaml         # Server bootstrap (Docker + Tailscale + ufw)
│   └── deploy.sh               # Rsync repo to remote + compose up
├── scripts/
│   ├── import_cookies.py       # cookie fast-path (run inside container)
│   ├── backup.sh / restore.sh  # tar/untar of ./data
│   └── requirements.txt
├── BROWSEROS.md                # multi-tenant cloud-deploy design
├── SECURITY.md                 # threat model + 4 deployment patterns
├── LICENSE                     # MIT (wrapper code only; BrowserOS has its own)
├── legacy/steel/               # earlier Steel + Chromium prototype, kept for reference
└── README.md

⚠ Security

The MCP and noVNC services in this repo ship with no authentication. That is fine for localhost on your laptop — and dangerous the moment any port is reachable from outside your machine.

Read SECURITY.md before you put this on anything other than your own laptop. Short version: lock to 127.0.0.1 only, use Tailscale for remote access, or put Caddy + bearer-token auth in front for a public URL.
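"Lock to 127.0.0.1" has a concrete meaning in the compose file: a short-syntax port mapping with only two parts, such as "9201:9200", publishes on all interfaces, while "127.0.0.1:9201:9200" publishes on loopback only. The following sketch flags mappings that are not loopback-bound; the helper names are hypothetical, and IPv6 forms such as "[::1]:..." are not handled.

```python
def binds_loopback(mapping: str) -> bool:
    """True if a compose short-syntax port mapping publishes on 127.0.0.1 only."""
    parts = mapping.split(":")
    # "HOST_IP:HOST_PORT:CONTAINER_PORT" has three parts; fewer means all interfaces.
    return len(parts) == 3 and parts[0] == "127.0.0.1"


def exposed_mappings(mappings):
    """Return the mappings that would be reachable from other machines."""
    return [m for m in mappings if not binds_loopback(m)]


if __name__ == "__main__":
    slot_1 = ["9201:9200", "9111:9011", "6081:6080"]
    for mapping in exposed_mappings(slot_1):
        print(f"not loopback-only: {mapping}")
```

Running a check like this against the `ports:` lists before deploying to a shared network is a cheap way to catch an accidentally public MCP or noVNC port.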

Known caveats

BrowserOS Max plan. Some BrowserOS features (the LLM-powered ones it ships with) need a BrowserOS account; the MCP works without one.
CDP from host is broken (Chromium Host-header check via socat). Workaround: docker compose exec browseros-N. Doesn't affect MCP.
--no-sandbox in container. Required because Chromium's namespace sandbox needs userns capabilities Docker rarely exposes. Standard practice for containerized Chromium; don't expose the container to untrusted input.
Resource use. Three idle browsers ~1.8 GB; spike to 5–6 GB under load. shm_size: 2gb per service is sized for that.

License

The wrapper code in this repo is MIT. The bundled BrowserOS binary is governed by BrowserOS's own license (AGPLv3 + Ungoogled Chromium BSD).
