Token cost + capability comparison: Enquire vs chrome-devtools-mcp vs agent-browser. Measured on example.com, extrapolated to richer pages.
Three stacks compared: Enquire MCP (this project), chrome-devtools-mcp
(Google's official MCP server), and agent-browser (Vercel's CLI tool).
Status: numbers measured via scripts/benchmark-mcp.mjs against example.com
(April 2026, Chrome 147, chrome-devtools-mcp v0.23, agent-browser v0.26).
Enquire numbers are bounded by error-path responses in this run — see
Reproducibility. chrome-devtools-mcp
and agent-browser numbers are clean measurements.
agent-browser comes out smallest in absolute tokens — its CLI shape avoids
MCP envelope overhead entirely. Enquire still beats chrome-devtools-mcp's
snapshot-based approach by 8.24× on click-and-read, even with errors
inflating its byte count.
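To make "MCP envelope overhead" concrete, here is a minimal sketch that wraps a tiny payload in the JSON-RPC result shape an MCP `tools/call` response uses and compares byte counts against the raw text a CLI would print. The payload string and the helper names (`wrapAsMcpResult`, `utf8Bytes`) are illustrative assumptions, and whether the benchmark counts the envelope in exactly this way is not confirmed here.

```javascript
// Hypothetical payload: the text a tool might return for a tiny extract.
const payload = "Example Domain";

// Shape of an MCP tools/call result carried over JSON-RPC 2.0.
function wrapAsMcpResult(id, text) {
  return JSON.stringify({
    jsonrpc: "2.0",
    id,
    result: { content: [{ type: "text", text }], isError: false },
  });
}

// UTF-8 byte length, i.e. what a byte-counting benchmark would see.
function utf8Bytes(s) {
  return new TextEncoder().encode(s).length;
}

const rawBytes = utf8Bytes(payload);
const wrappedBytes = utf8Bytes(wrapAsMcpResult(1, payload));
const envelopeOverhead = wrappedBytes - rawBytes;

console.log({ rawBytes, wrappedBytes, envelopeOverhead });
```

For plain ASCII payloads the overhead in this sketch is a fixed number of bytes per call, so it matters most when the useful payload is small, as in the tiny-extract case.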
Extrapolating to real pages where chrome-devtools-mcp's a11y snapshot
balloons:
| Task class | chrome-devtools-mcp | Enquire (estimate) | agent-browser (estimate) | cdp/Enquire ratio |
|---|---|---|---|---|
| Medium e-commerce flow (search → product → price) | ~12,200 tok | ~530 tok | ~400 tok | 23× |
| Heavy 10-step flow (checkout, multi-page form) | ~30,000 tok | ~800 tok | ~600 tok | 37× |
At Claude Sonnet 4.6 prices, the e-commerce flow costs $0.081 on
chrome-devtools-mcp vs $0.0035 (Enquire) vs $0.0026 (agent-browser): roughly
$77 (Enquire) or $78 (agent-browser) saved per 1000 runs relative to the
snapshot-based stack. The heavy flow saves ~$194 (Enquire) or ~$196
(agent-browser) per 1000 runs.
The 23–37× advantage isn't model magic — it comes from the design choice
"give back what was asked for" (Enquire and agent-browser) vs "give
back the whole page so the LLM can pick" (chrome-devtools-mcp).
chrome-devtools-mcp.take_snapshot() returns a complete a11y tree with uids
for every interactive element. Necessary the first time you see a page —
the LLM can't click an element it can't see. But once you know a selector,
that snapshot is ~3-5 KB of ballast that scrolls back into context on every
subsequent turn, multiplying input tokens.
Enquire.extract(selector, attribute) returns just the matched values.
Constant-size response regardless of page complexity. Costs more discovery
work upfront, but once a selector is known (or codified into a SKILL.md
procedure), every subsequent call is bounded.
agent-browser snapshot -i --json returns a similarly compact ref-keyed
tree (interactive elements only, by design). Looks structurally similar to
chrome-devtools-mcp's a11y tree but is much smaller because it filters to
interactive nodes only, then exposes them via stable @e1-style refs.
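The interactive-only filtering idea can be sketched as follows. This is not agent-browser's implementation: the node shape (`role`, `name`, `children`), the role list, and the `interactiveRefs` helper are all assumptions made for illustration. Only the `@e1`-style ref format comes from the description above.

```javascript
// Assumed set of roles treated as interactive; a real tool may use a different list.
const INTERACTIVE_ROLES = new Set(["button", "link", "textbox", "checkbox", "combobox"]);

// Walk a hypothetical a11y tree in document order and keep only interactive nodes,
// assigning each an @eN ref.
function interactiveRefs(root) {
  const refs = [];
  const walk = (node) => {
    if (INTERACTIVE_ROLES.has(node.role)) {
      refs.push({ ref: `@e${refs.length + 1}`, role: node.role, name: node.name });
    }
    for (const child of node.children ?? []) walk(child);
  };
  walk(root);
  return refs;
}

// Hand-written stand-in for a small page; not a captured snapshot.
const tree = {
  role: "document",
  name: "Example Domain",
  children: [
    { role: "heading", name: "Example Domain" },
    { role: "paragraph", name: "Illustrative body text." },
    { role: "link", name: "More information..." },
  ],
};

console.log(interactiveRefs(tree));
```

On a page dominated by headings and paragraphs, the filtered list stays short even though the full tree grows with the content, which is the size difference the comparison above is pointing at.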
There's an active industry conversation about the design of these tools:
Vercel's agent-browser — npm v0.26, agent-browser.dev,
released 2026-04-16. CLI tool for AI agents. Bundles its own Chromium
via Playwright. Ships with a Claude Code Skill at
~/.claude/skills/agent-browser/SKILL.md that documents the command
surface so agents discover it without loading MCP tool schemas.
Microsoft's @playwright/cli — released early 2026 alongside
Playwright MCP. Explicit pitch: "token-efficient alternative to
Playwright MCP for AI coding agents."
The shared thesis: "CLI invocations are more token-efficient than MCP
because they avoid loading large tool schemas and verbose accessibility
trees into model context." Our measurements confirm this — agent-browser
beats every MCP-shaped option on tokens.
Where Enquire fits: Enquire is MCP-shaped, but uses targeted
extraction (extract, read_page) instead of full a11y snapshots. So
it gets close to CLI-level token efficiency without giving up MCP's
tool-schema discovery, against a logged-in real-user browser that
neither agent-browser nor chrome-devtools-mcp can offer.
The right comparison isn't Enquire vs MCP-tools-in-general — it's:
vs chrome-devtools-mcp: Enquire wins on tokens (8.24× cheaper) AND
on accessing logged-in browsers
vs agent-browser: Enquire loses ~30-50% on raw tokens but wins on
logged-in browser, MCP-native discovery, and procedure recording (SKILL.md)
not needed (driven via real user UI; if a target page raises one, missing)
chrome-devtools-mcp wins on breadth: network inspection, performance
profiling, lighthouse audits, screenshots — all instrumentation tasks that
Enquire doesn't aspire to cover. Enquire wins on directness: actions
target CSS selectors, no snapshot intermediate; tools fit a logged-in user's
real flows.
This was the live test that drove a redesign. The prior Enquire sign-in flow
called window.prompt() to collect a dev userid. That broke automation across
both stacks:
chrome-devtools-mcp: click tool times out after 5s waiting for the
dialog to close. Recovery requires a follow-up handle_dialog call. 3+ tool
calls minimum to sign in.
Manual users: closing the prompt without entering text silently rolled
back the operation, leaving the form looking signed-in (cloudMode: true)
but with authToken: "". Spent multiple debugging cycles on this.
Fix: replaced window.prompt() with an inline <input> rendered next to the
"Cloud" button (see lib/ui/sections/SettingsForm.tsx). Sign-in is now:
Click "Cloud" → input appears
Type userid, click Sign in → settings persist immediately
No dialog, no recovery, no race. The lesson: anything modal in your UI is a
landmine for automation. Inline forms are composable.
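The half-applied state described above (`cloudMode: true` with an empty `authToken`) can be ruled out by treating sign-in as a single state transition. The sketch below is not the code in SettingsForm.tsx; `applySignIn` is a hypothetical helper, and only the two field names are taken from the text.

```javascript
// Hypothetical helper: apply a sign-in result to settings in one step.
// Invariant: cloudMode is only set to true together with a non-empty authToken.
function applySignIn(settings, authToken) {
  const token = (authToken ?? "").trim();
  if (token === "") {
    // Cancelled or empty input: return the settings unchanged instead of half-applying.
    return settings;
  }
  return { ...settings, cloudMode: true, authToken: token };
}

const before = { cloudMode: false, authToken: "" };
console.log(applySignIn(before, ""));          // unchanged settings
console.log(applySignIn(before, "dev-token")); // both fields set together
```

Because the function either returns the original object or a fully updated copy, there is no intermediate state for a dismissed dialog or an empty field to leave behind.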
Input tokens: system prompt (cached, ~500 tok) + tool defs (cached) +
conversation history including all prior tool results
Output tokens: the model's tool-call request + any reasoning text
The dominant variable is the tool result content — both because it's
input on the next turn AND because the LLM tends to keep prior snapshots in
context rather than discarding them.
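A small model shows why result size dominates: if every tool result stays in the history, each one is billed again as input on every later turn. The function below is a simplified sketch that ignores prompt caching and output tokens, and the numbers passed to it are made up for illustration rather than taken from the benchmark.

```javascript
// Total input tokens over a run, assuming every prior tool result stays in context.
// baseTokens: system prompt + tool defs + user request, resent each turn.
// resultTokens: size of each tool result appended to the history.
function cumulativeInputTokens({ turns, baseTokens, resultTokens }) {
  let total = 0;
  let history = 0;
  for (let turn = 1; turn <= turns; turn++) {
    total += baseTokens + history; // input billed for this turn
    history += resultTokens;       // this turn's result is input for later turns
  }
  return total;
}

// Illustrative values only.
const small = cumulativeInputTokens({ turns: 3, baseTokens: 500, resultTokens: 100 });
const large = cumulativeInputTokens({ turns: 3, baseTokens: 500, resultTokens: 1000 });
console.log({ small, large });
```

The history term grows with `turns * (turns - 1) / 2`, so the gap between small and large results widens faster than linearly as flows get longer.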
Claude Sonnet 4.6: input $3/MTok, output $15/MTok. Agent loops are
input-heavy (~70/30 split because tool results arrive as input). Blended
≈ $6.60/MTok.
| Task | chrome-devtools-mcp | Enquire | agent-browser | Δ vs cdp per 1000 runs |
|---|---|---|---|---|
| A: tiny extract (measured) | $0.0011 | $0.0011 | $0.0007 | $0.40 |
| B: click-and-read (measured) | $0.017 | $0.0021 | $0.0010 | $15-16 |
| C: heavy flow (extrapolated) | $0.20 | $0.0053 | $0.0046 | $194-196 |
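The blended rate and the per-run dollar figures follow from simple arithmetic on the stated prices and the ~70/30 input/output split. The sketch below reproduces that calculation; the function names are illustrative, and the token counts fed in are the e-commerce estimates quoted earlier in this document.

```javascript
const INPUT_USD_PER_MTOK = 3;   // Claude Sonnet 4.6 input price, as stated above
const OUTPUT_USD_PER_MTOK = 15; // Claude Sonnet 4.6 output price, as stated above
const INPUT_SHARE = 0.7;        // assumed ~70/30 input/output split

// Weighted average price per million tokens.
function blendedUsdPerMTok(inputShare = INPUT_SHARE) {
  return inputShare * INPUT_USD_PER_MTOK + (1 - inputShare) * OUTPUT_USD_PER_MTOK;
}

// Cost of one run that consumes `tokens` tokens at the blended rate.
function runCostUsd(tokens) {
  return (tokens / 1e6) * blendedUsdPerMTok();
}

// Dollars saved over 1000 runs when moving from baseline to candidate.
function savedPer1000Runs(baselineTokens, candidateTokens) {
  return 1000 * (runCostUsd(baselineTokens) - runCostUsd(candidateTokens));
}

console.log(blendedUsdPerMTok().toFixed(2));          // 6.60
console.log(runCostUsd(12200).toFixed(4));            // 0.0805
console.log(savedPer1000Runs(12200, 530).toFixed(2)); // 77.02
```

Changing `INPUT_SHARE` is the quickest way to see how sensitive the dollar figures are to the input/output split, since output tokens cost five times as much as input tokens.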
With prompt caching (90% hit on tool defs + system prompt), absolute deltas
roughly halve, but the ratio holds because cached portions are identical
across MCP stacks — only the tool-result tail differs. agent-browser sidesteps
tool-schema caching entirely (no MCP envelope), so its cached/uncached delta
is smaller.
Vision tokens not counted. take_screenshot adds ~1,500–3,000 tokens
per shot at typical resolution. Enquire doesn't expose screenshots, so it
sidesteps this.
Snapshot size scales with DOM complexity. example.com is a best case
for chrome-devtools-mcp. Twitter, Notion, Gmail — the a11y trees are 5–10×
bigger.
Output-token tally is fuzzy. Agents may re-snapshot more or less based
on model strategy. Sonnet 4.6 in our testing does ~1 snapshot per
interaction; some smaller models do 2–3.
Enquire requires selector knowledge. First run on a new site needs a
discovery pass (use read_page — ~1–3 KB of readable text). After that,
selectors codified in SKILL.md make subsequent runs O(1).
chrome-devtools-mcp wins for performance-debugging work. Lighthouse,
network capture, performance traces — Enquire doesn't aim to cover those.
```shell
# 1. Bridge + API + WXT dev running (one-time per session)
npm run dev:all

# 2. agent-browser CLI installed (one-time per machine)
npm install -g agent-browser
which agent-browser  # confirm

# 3. Sign Enquire extension into the bridge in your main Chrome
#    (Settings → Cloud → enter dev userid → Sign in)
# Verify: curl -s http://localhost:3789/healthz → tunnels: 1
```
Use your main Chrome (where the Enquire extension is installed and
signed in via the inline form) for Enquire's side, let chrome-devtools-mcp
spawn its own Chromium for the cdp side, and let agent-browser spawn its
own Playwright Chromium for the CLI side. Three browsers, same tasks,
same target page; the exact invocation is the "Run with" command at the
end of this section.
--skip-extension-signin tells the benchmark to skip trying to install
the extension into chrome-devtools-mcp's Chrome (which doesn't work —
see below) and use whatever tunnel is already registered with the bridge.
This is the path that produces real successful Enquire numbers.
The benchmark also has a "managed" mode that tries to install the
extension into chrome-devtools-mcp's own Chrome and sign it in
automatically. This was the original intent. But:
chrome-devtools-mcp.install_extension reports success but the
extension's service worker never spins up for unpacked
dist/chrome-mv3-dev builds in our testing; chrome.management-based
install paths assume Chrome Web Store packages.
The --load-extension=<path> chrome flag, passed via
--chromeArg, is also ignored by chrome-devtools-mcp's launch path
in practice (verified by inspecting chrome://version after launch
— only --disable-extensions is in the cmdline, no
--load-extension, even with the wrapper at
scripts/chrome-devtools-mcp-with-extension.sh).
Net: chrome-devtools-mcp doesn't have a clean way to load our unpacked
extension. The split-Chrome approach above is the workaround.
We did ship a BRIDGE_FORCE_RECONNECT runtime message in the background
service worker (entrypoints/background/native-bridge.ts). Benchmark
or debug tooling can call:
…to explicitly tear down + recreate the offscreen WS. Useful when the
offscreen's deferred chrome.storage.onChanged listener hasn't
registered yet when settings change. Goes unused for now because the
auto-setup path is blocked upstream of where this would help, but it
remains a useful primitive for future test harnesses.
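As a rough sketch of how a harness might trigger the reconnect: the message shape `{ type: "BRIDGE_FORCE_RECONNECT" }` is an assumption (the actual contract is whatever entrypoints/background/native-bridge.ts listens for), and `forceBridgeReconnect` is a hypothetical helper. `chrome.runtime.sendMessage` is the standard extension messaging call and is passed in as a parameter so the helper can be exercised outside a browser.

```javascript
// Hypothetical helper. Assumes the background service worker handles a message
// of the form { type: "BRIDGE_FORCE_RECONNECT" }; check native-bridge.ts for the real shape.
function forceBridgeReconnect(runtime) {
  return runtime.sendMessage({ type: "BRIDGE_FORCE_RECONNECT" });
}

// Inside an extension page or content script, this would be called as:
//   forceBridgeReconnect(chrome.runtime);
```

Taking `runtime` as an argument keeps the helper testable with a stub and avoids a hard dependency on the `chrome` global in benchmark scripts that run under Node.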
chrome-devtools-mcp numbers are measured and accurate — those Chrome
instances run cleanly, all tools fire, byte counts are real.
agent-browser numbers are measured and accurate — its CLI shape
is straightforward to measure (stdout + stderr bytes per command).
Enquire numbers in the auto-setup mode reflect error responses
(~270 B per error: -39001). Successful response bodies are slightly
larger but bounded by what extract / read_page actually return.
The 8.24× click-and-read ratio (cdp vs Enquire) is real even with Enquire bounded
by error bytes — it reflects the cost of take_snapshot (~10 KB/call)
vs targeted extraction. Real Enquire success bodies on example.com
would push that ratio to ~10-15×; on Amazon-class pages, ~25×.
Run with: node scripts/benchmark-mcp.mjs --skip-extension-signin --target <url>.