Agent Browser Protocol
Web browsing is continuous and async. Agents think in tools and steps. ABP reformats web navigation into the discrete, multimodal chat format agents know and love.
90.53% on Online Mind2Web — reproducible results
ABP is a Chromium build with MCP + REST baked directly into the browser engine.
One request = one completed step: settled state + screenshot + event log.
No WebSocket. No CDP session management. Just HTTP.
~100ms overhead per action (including screenshots). The bottleneck is the LLM, not the browser.
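To make "one request = one completed step" concrete, here is a minimal Python sketch of a single action sent over plain HTTP. It assumes the server listens at http://localhost:8222/api/v1 (the address used by the sanity-check curl command in this document). The route layout `<base>/<action>`, the JSON body shape, and the helper names `build_action_request` and `perform_action` are illustrative assumptions, not ABP's documented API.

```python
import json
import urllib.request

# Assumption: same host, port, and prefix as the curl sanity check in this README.
BASE_URL = "http://localhost:8222/api/v1"


def build_action_request(action, params, base_url=BASE_URL):
    """Build (but do not send) one POST request for a single browser action.

    The "<base_url>/<action>" route and the JSON body are hypothetical;
    check the real API reference before relying on them.
    """
    body = json.dumps(params).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{action}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def perform_action(action, params, timeout=30.0):
    """Send one action and block until the HTTP response arrives.

    Assumes the response body is JSON; no specific response keys are assumed.
    """
    request = build_action_request(action, params)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a running instance, `perform_action("click", {"x": 450, "y": 320})` would be the whole step: there is no socket to keep open and no session to manage between calls.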
Try it in 60 seconds (Claude Code)

    # 1) Add ABP as an MCP server to Claude Code
    claude mcp add browser -- npx -y agent-browser-protocol --mcp

    # 2) Sanity check the server is up (optional)
    curl -s http://localhost:8222/api/v1/tabs

Wait for the browser to launch and ask Claude: “Find me kung pao chicken near 415 Mission St, San Francisco on Doordash.”

What you should notice: every tool call returns a settled page state (screenshot + events), and the page freezes between steps so Claude never races the browser.
What you get per action
    AI Agent                                  ABP Chromium
       │                                         │
       │  POST /click (x=450, y=320)             │
       │────────────────────────────────────────>│
       │                                         │  Inject real input event
       │                                         │  Wait for page to settle
       │                                         │  Capture compositor screenshot
       │                                         │  Collect events (tab_created, dialog, file_chooser…)
       │                                         │  Pause JavaScript + virtual time
       │  200 OK: screenshot + events            │
       │<────────────────────────────────────────│
       │                                         │
       ·  (agent inspects screenshot, decides)   ·
       │                                         │
       │  POST /type (text="Show HN")            │
       │────────────────────────────────────────>│
       │                                         │  Unpause JS + time
       │                                         │  Inject real keyboard events
       │                                         │  Wait for settle → screenshot → events → pause
       │  200 OK: screenshot + events            │
       │<────────────────────────────────────────│
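The request/inspect/decide rhythm in the diagram maps onto a very small agent loop. The sketch below is one possible shape for it, not ABP's own client: `run_agent`, `send`, and `decide` are hypothetical names, and the step dictionary keys (`screenshot`, `events`) simply mirror the labels in the diagram. The transport is passed in as a function, so the loop can be exercised with a stand-in before pointing it at a real browser.

```python
from typing import Callable, List, Optional, Tuple

Action = Tuple[str, dict]  # e.g. ("click", {"x": 450, "y": 320})
Step = dict                # assumed shape: {"screenshot": ..., "events": [...]}


def run_agent(
    send: Callable[[str, dict], Step],
    decide: Callable[[Optional[Step]], Optional[Action]],
    max_steps: int = 10,
) -> List[Step]:
    """Alternate between deciding and acting, one settled step at a time.

    send(name, params) performs one action and returns the settled step.
    decide(last_step) inspects the previous step (None on the first turn)
    and returns the next action, or None to stop.
    """
    history: List[Step] = []
    last_step: Optional[Step] = None
    for _ in range(max_steps):
        action = decide(last_step)  # the page is paused while the agent thinks
        if action is None:
            break
        name, params = action
        last_step = send(name, params)  # returns only after the step completes
        history.append(last_step)
    return history


def fake_send(name: str, params: dict) -> Step:
    """Stand-in transport so the loop runs without a browser."""
    return {"action": name, "params": params, "screenshot": b"", "events": []}


# Scripted "agent" that replays the two actions from the diagram, then stops.
plan = iter([("click", {"x": 450, "y": 320}), ("type", {"text": "Show HN"})])
steps = run_agent(fake_send, lambda last_step: next(plan, None))
```

Because each call returns only after the step has completed, the loop needs no polling, sleeps, or event listeners; swapping `fake_send` for a function that issues the real HTTP request is the only change needed to drive a live browser.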