What's shipped, what's next, why we don't bundle Chromium.
May 2026
This is the strategic doc. It articulates where Enquire sits in the
2026 browser-automation landscape, why we don't drop or shrink scope,
and the sequenced work to ship v1.0 and beyond.
For tactical browser-automation comparisons, see Benchmarks.
"how do I let an AI agent and a user share the same
browser session, taking over from each other?"
That's the pitch. Everything else is supporting infrastructure.
What it means concretely:
Cookies, MFA-completed sessions, autofill, extensions, history, and
passwords are already there. The agent doesn't sign in; the user
did. The agent inherits that state.
The user can interrupt at any moment. Agent pauses, user clicks,
agent resumes from the new state. No fork, no conflict.
The agent can hand back when stuck. "I need a 2FA code" — agent
tells the user, user enters it in their browser, agent continues.
Skills (procedures) are recorded by demonstration. User performs
an action once with the agent watching; the agent saves a signed
procedure for replay.
agent-browser, chrome-devtools-mcp, Playwright MCP, Browserbase,
Stagehand, OpenAI Atlas — none of these support this co-presence
model. They all assume agent OR user, not agent AND user, switching
seats.
This is the moat. The roadmap protects and amplifies it.
| Tier-routed LLM (cheap/local for procedures, capable for exploration) | ✓ | n/a | n/a |
| Token efficiency vs snapshot-y MCP | 8.24× cheaper than cdp (measured) | 17× cheaper | baseline |
| Auth abstraction (@enqr/auth-core) | ✓ | n/a | n/a |
| Cost telemetry per call (managed billing) | ✓ | ✗ | ✗ |
Latency is the underrated win. Enquire's HTTP+WS path is already 3-4×
faster than CLI subprocess spawning. For chat UIs that's the
difference between fluid and laggy.
The rule: a browser agent has to cover what users actually do. If we
shrink scope, we cede the use case to whoever covers more. Better to
expand carefully and let procedures + marketplace amplify what we have.
Inline cost data in MCP tool responses is pure overhead from the LLM
agent's perspective. The agent doesn't act on it; it just reads it as
input on the next turn.
Fix:
Bridge logs cost server-side via fire-and-forget POST to
@enqr/api's /billing/report endpoint (already partially wired
per the bridge audit). Wire the rest.
Strip cost from MCP tool responses in production mode.
The web dashboard's /app/cost page shows aggregated cost from
server-side logs (already shipped).
Keep cost in dev mode behind an env flag for debugging.
Files:
packages/mcp-browser-bridge/src/v1/mcp-server.ts:130-151 —
conditionally omit cost from response payload
packages/mcp-browser-bridge/src/v1/billing.ts:86-123 — already
has fire-and-forget reporting; verify wiring
packages/api/src/routes/billing.ts (or equivalent) — receive +
store cost reports per-user-per-call
Saves: ~100 B × N tool calls per agent loop. On a 20-step flow,
that's ~2 KB, or ~500 tokens reclaimed at roughly 4 bytes per token.
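A minimal sketch of the fix above, under stated assumptions: the `ToolResult` and `ToolCost` shapes, the `finalizeToolResult` name, and the injected `report` callback are all hypothetical and are not taken from mcp-server.ts or billing.ts. It only illustrates the intended behavior: report cost out of band without awaiting it, and omit cost from what the agent sees unless a dev flag asks to keep it.

```typescript
// Hypothetical shapes: the real bridge payload may differ.
interface ToolCost {
  inputTokens: number;
  outputTokens: number;
  usd: number;
}

interface ToolResult {
  content: string;
  cost?: ToolCost;
}

// The reporter is injected so the sketch does not assume a particular
// HTTP client or the exact /billing/report request format.
type CostReporter = (cost: ToolCost) => Promise<void>;

function finalizeToolResult(
  result: ToolResult,
  report: CostReporter,
  keepInlineCost: boolean, // e.g. derived from a dev-only env flag
): ToolResult {
  if (result.cost) {
    // Fire-and-forget: not awaited, and a billing failure must never
    // break the tool call itself.
    report(result.cost).catch(() => {
      /* swallowed here; retry or alerting is out of scope for this sketch */
    });
  }
  if (keepInlineCost) return result;

  // Production path: return everything except the cost field.
  const { cost: _cost, ...withoutCost } = result;
  return withoutCost;
}
```

In production mode the call would pass `false` for `keepInlineCost`; in dev mode the flag would come from whatever environment variable the bridge settles on.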
executor.ts — runs procedures with validate predicates, fallbacks
schema.ts — YAML frontmatter, step types
signing.ts — Ed25519 verification
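For orientation, Ed25519 verification of a procedure can be sketched as below. This is an illustration only: it uses Node's built-in `node:crypto` module and a throwaway key pair, while the extension's signing.ts runs in the browser and may well use a different library. The `verifyProcedure` helper and the sample skill text are hypothetical.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Hypothetical helper: checks a detached Ed25519 signature over the raw
// procedure text. For Ed25519, Node's verify() takes null as the digest
// algorithm because the scheme hashes internally.
function verifyProcedure(
  skillText: string,
  signature: Buffer,
  publicKey: KeyObject,
): boolean {
  return verify(null, Buffer.from(skillText, "utf8"), publicKey, signature);
}

// Demo with a throwaway key pair. A real author key would be long-lived
// and distributed through some trusted channel.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");
const skillText = "---\nname: append-row\n---\n1. open the sheet\n";
const signature = sign(null, Buffer.from(skillText, "utf8"), privateKey);

const untouchedOk = verifyProcedure(skillText, signature, publicKey);
const tamperedOk = verifyProcedure(skillText + "2. extra step\n", signature, publicKey);
```

Any change to the signed text, even one appended line, makes verification fail, which is the property the install flow relies on.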
What's missing:
Recording UI. Extension records the user's actions with the
agent watching. Output: a draft SKILL.md blob. (Files: new
entrypoints/sidepanel/RecordPanel.tsx, hook into lib/skills/.)
Library browser. "Procedures" page in options/sidepanel
showing local + cloud procedures, organized by domain. Already a
route in the web dashboard at /app/procedures — need to mirror
it in the extension.
Marketplace catalog. Public registry of community-contributed
skills, indexed by domain (google.com/sheets, github.com/PR,
notion.so/database). User installs a procedure → it's verified
via Ed25519 signature, cached locally.
Submission flow. Author signs their procedure, submits via
npx @enqr/skills publish ./my-skill.md. Server stores +
indexes.
Discovery. Extension sees cnn.com open → auto-suggests
"5 community skills available for cnn.com" → user picks one.
Starter pack. Pre-built common procedures shipped with v1:
Google Sheets: fill-cells, append-row, format-range
Gmail: send-with-template, archive-by-search
GitHub: review-PR, fork-and-clone, file-issue
Notion: append-to-database-row, create-doc
LinkedIn: send-connection, message-template
Calendar: schedule-event, find-free-time
This is the marketplace. It's the long-term moat: as the library
grows, Enquire becomes the default way to "do X on Y site"
because the recipe already exists.
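The discovery step can be sketched as a domain lookup against the catalog. The `CatalogEntry` shape and both function names are assumptions made for illustration; the real registry index (which also keys on paths such as google.com/sheets) is not specified here, so this sketch matches on domain only.

```typescript
// Hypothetical catalog entry: the real registry schema is not defined yet.
interface CatalogEntry {
  name: string;
  domain: string; // e.g. "cnn.com"
}

// Normalize a tab URL to a bare host without a leading "www.".
function hostOf(tabUrl: string): string {
  return new URL(tabUrl).hostname.toLowerCase().replace(/^www\./, "");
}

// Return entries whose domain equals the tab's host or is a parent of it.
// The "." prefix check avoids matching "notcnn.com" against "cnn.com".
function skillsForUrl(tabUrl: string, catalog: CatalogEntry[]): CatalogEntry[] {
  const host = hostOf(tabUrl);
  return catalog.filter(
    (entry) => host === entry.domain || host.endsWith("." + entry.domain),
  );
}

// Build the suggestion line, or null when there is nothing to suggest.
function suggestionText(tabUrl: string, catalog: CatalogEntry[]): string | null {
  const count = skillsForUrl(tabUrl, catalog).length;
  if (count === 0) return null;
  const noun = count === 1 ? "skill" : "skills";
  return `${count} community ${noun} available for ${hostOf(tabUrl)}`;
}
```

Returning `null` for no matches keeps the extension silent on sites without community skills, rather than showing an empty suggestion.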
Files (new):
packages/skills-registry/ — server for catalog + submission API
The insight: "it can go on doing cli approach and come back with
the result". The user wants Enquire to support batch execution, where
the agent plans N steps ahead and the bridge runs them as one call,
returning only the final state.
This partly exists already, since that is what procedures do.
Generalize it:
New MCP tool: execute_plan(steps: ToolCall[]). Single MCP
call that takes an array of [tool, args] and runs them in
sequence. Returns final tool result + summary of intermediate
results.
New MCP tool: run_procedure(name, args). Looks up a
signed procedure by name, executes it. (Already in skills
executor; expose as MCP tool.)
Why this matters for tokens: instead of N agent turns each
with their own tool result baked into context, the agent issues
one execute_plan and gets one consolidated result. Reduces
round-tripping.
Files:
packages/mcp-browser-bridge/src/v1/mcp-server.ts — register
execute_plan, run_procedure as new tools
Bridge → tunnel relay carries the plan to the extension
Extension's lib/skills/executor.ts runs the plan locally (already
exists for procedures)
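A minimal sketch of what execute_plan could do, assuming a hypothetical `ToolCall` tuple shape and a plain map of tool handlers; neither is taken from the bridge code. It also assumes the plan stops at the first failing step, which is a design choice the text above does not settle.

```typescript
// Hypothetical types: the real ToolCall shape is not specified in this doc.
type ToolArgs = Record<string, unknown>;
type ToolCall = [tool: string, args: ToolArgs];
type ToolHandler = (args: ToolArgs) => Promise<unknown>;

interface PlanResult {
  ok: boolean;
  final: unknown;    // result of the last step that completed
  summary: string[]; // one short line per step instead of full payloads
}

async function executePlan(
  steps: ToolCall[],
  tools: Record<string, ToolHandler>,
): Promise<PlanResult> {
  const summary: string[] = [];
  let final: unknown = undefined;

  for (let i = 0; i < steps.length; i++) {
    const [tool, args] = steps[i];
    const handler = tools[tool];
    if (!handler) {
      summary.push(`${i + 1}. ${tool}: unknown tool`);
      return { ok: false, final, summary };
    }
    try {
      final = await handler(args);
      summary.push(`${i + 1}. ${tool}: ok`);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      summary.push(`${i + 1}. ${tool}: failed (${reason})`);
      // Assumption: stop at the first failure so later steps never run
      // against an unexpected page state.
      return { ok: false, final, summary };
    }
  }
  return { ok: true, final, summary };
}
```

The agent then receives one `PlanResult` rather than one tool result per turn, and the summary lines stay short so the intermediate steps cost few tokens.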
agent-browser has --headed (moot for us, since the user's Chrome
IS the visible browser) and highlight @e1 (visually highlights an
element). The latter is a nice addition:
New tool: highlight(selector, durationMs) — renders a
short-lived overlay on the matched element. Useful for "agent is
working on this" UX feedback.
Files: lib/builtin-tools/highlight.ts, which uses chrome.scripting
to inject a colored outline and remove it automatically.
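The outline-and-remove logic might look like the sketch below. The `applyHighlight` name, the `Outlinable` type, and the default color are assumptions. The element is typed structurally and the timer is injectable so the logic can run outside a browser; in the extension, the equivalent page-side function would presumably be passed to chrome.scripting.executeScript, where it must be self-contained.

```typescript
// Structural stand-in for an HTMLElement so the sketch runs without a DOM.
interface Outlinable {
  style: { outline: string };
}

// Injectable timer: defaults to setTimeout, replaceable in tests.
type Scheduler = (fn: () => void, ms: number) => unknown;

function applyHighlight(
  el: Outlinable,
  durationMs: number,
  schedule: Scheduler = setTimeout,
  color: string = "#ff9800", // hypothetical default color
): void {
  // Remember the page's own outline so removal restores it exactly.
  const previous = el.style.outline;
  el.style.outline = `3px solid ${color}`;
  schedule(() => {
    el.style.outline = previous;
  }, durationMs);
}
```

Restoring the previous outline, instead of clearing it, avoids wiping focus rings or other outlines the page had already set.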
agent-browser ships with its own Chromium because it's a developer
tool. It needs to work without depending on user state.
If Enquire bundled Chromium, we'd lose:
The user's logged-in sessions
The user's cookies, history, autofill
The user's other extensions
The user's local Chrome profile and bookmarks
The "user takes over" handoff (because the user would be in
their Chrome, agent would be in bundled Chrome)
That's the entire product gone.
What we could ship: a separate enqr-headless SKU (Tauri or
Electron + bundled Chromium) for CI/server use cases where there is no
user. That's a different product. Not v1, maybe v2 or later. The core
"Enquire" product stays as-is.
We can mitigate version mismatch in the user's Chrome by:
Pinning manifest_version + tested Chrome versions
Auto-suggesting Chrome update when extension detects an unsupported
version
CI matrix testing across stable/beta/canary Chrome
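The unsupported-version check could be sketched as follows. The tested range is a made-up placeholder (real values would come from the CI matrix), and parsing the user agent string is only one possible detection method; note that other Chromium-based browsers also include a `Chrome/` token.

```typescript
// Hypothetical tested range: placeholder numbers, not a real support policy.
const TESTED_CHROME = { min: 120, max: 126 };

type Support = "supported" | "too-old" | "untested-newer" | "unknown";

// Extract the major version from a user agent such as
// "... Chrome/124.0.0.0 Safari/537.36". Returns null when absent.
function chromeMajorVersion(userAgent: string): number | null {
  const match = /Chrome\/(\d+)\./.exec(userAgent);
  return match ? Number(match[1]) : null;
}

function checkChromeSupport(
  userAgent: string,
  range: { min: number; max: number } = TESTED_CHROME,
): Support {
  const major = chromeMajorVersion(userAgent);
  if (major === null) return "unknown";
  if (major < range.min) return "too-old"; // suggest a Chrome update
  if (major > range.max) return "untested-newer"; // likely beta or canary
  return "supported";
}
```

Distinguishing "too-old" from "untested-newer" lets the extension suggest an update only in the first case, while treating newer builds as a signal to extend the CI matrix.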
agent-browser has --session a / --session b for parallel browsers.
We have multi-tab. Trade-off:
agent-browser's parallel sessions are isolated profiles — useful
when scraping different sites with different login state
Enquire's multi-tab is one profile — same login state, but multiple
tabs/origins concurrent
Decision: keep multi-tab, defer multi-session. Multi-tab covers
~95% of real user workflows (most users don't need 5 different
Twitter accounts open). If we need multi-session later, we can build
it as separate Chrome profiles + separate extension instances, but
that is not a v1 priority.
Better Auth is wired through @enqr/auth-core for the API and bridge.
Remaining work is operational: production secret rotation, trusted-origin
review, and extension sign-in smoke tests across signed beta builds.
Stripe webhooks and weekly ceiling surfaces exist in the API. The bridge
strips inline cost by default and reports usage server-side. Remaining
work is reconciliation/alerting and validating that dashboard cost views
match bridge reports under real managed-mode traffic.
The extension has local procedures, captured traces, and MCP CRUD/run tools.
The dashboard has cloud-visible procedure records and edit/history/delete
controls. The remaining product slice is reliable extension-to-cloud sync
with conflict handling that does not surprise local-first users.
Files: railway.toml for both @enqr/bridge and @enqr/api services.
Postgres provisioning. Env vars set. Domain + DNS config for
bridge.enqr.dev + api.enqr.dev.
Section B(3-6). New packages/skills-registry/ package, web UI for
browsing, submission CLI, signature verification, starter pack of
~30 common procedures across Google/GitHub/Notion/Slack.
Enquire — the AI agent that lives in your browser, with your
accounts, that you can take over from at any moment.
Three-line pitch:
Other AI browser tools spawn a clean Chromium and start from zero.
Enquire runs in your Chrome, with your cookies, your sessions, your
2FA tokens. The agent picks up where you left off. You can pause it
mid-flow, click around yourself, hand it back. Procedures it learns
from you stay yours, signed, replayable.
Comparison table for landing page (one screen):
| | Enquire | Atlas | agent-browser | Playwright MCP |
|---|---|---|---|---|
| Uses your Chrome + accounts | ✓ | ✓ | ✗ | ✗ |
| Lets you take over mid-flow | ✓ | ✗ | ✗ | ✗ |
| External agents can drive it (MCP) | ✓ | ✗ | (CLI not MCP) | ✓ |
| Records procedures from your demos | ✓ | ✗ | ✗ | ✗ |
| Marketplace of community skills | coming | ✗ | ✗ | ✗ |
| Open source | ✓ | ✗ | ✓ (Apache) | ✓ |
The "Lets you take over mid-flow" row is the key differentiator.
That's the moat.
Pricing. BYO is free; Managed has a weekly cost ceiling. What
are the right defaults? ($5/week starter, $20/week pro, $100/week
team?) Need real dogfood data.
Marketplace economics. Free skills only? Paid skills with
revenue share? Premium skills (e.g. "Salesforce report
generator")? Defer the answer; build the infra to support either.
Privacy stance. Enquire sees the user's whole browser. We need
a clear privacy policy: what gets sent to the bridge, what stays
local, what's encrypted, and how long data is retained. A draft
already exists in the setup page; it needs legal review before launch.
Browser support. Chrome only for v1. Firefox + Edge later via
WXT's multi-browser support — most code is portable. Cost: testing
matrix.