Enquire docs

    Roadmap

    What's shipped, what's next, why we don't bundle Chromium.

    May 2026

    This is the strategic doc. It articulates where Enquire sits in the 2026 browser-automation landscape, why we don't drop or shrink scope, and the sequenced work to ship v1.0 and keep going after it.

    For tactical browser-automation comparisons, see Benchmarks.


    The differentiator: user co-presence

    Other tools answer: "how do I drive a browser from an AI agent?" Enquire answers:

    "how do I let an AI agent and a user share the same browser session, taking over from each other?"

    That's the pitch. Everything else is supporting infrastructure.

    What it means concretely:

    • Cookies, MFA-completed sessions, autofill, extensions, history, passwords are already there. The agent doesn't sign in; the user did. Agent inherits state.
    • The user can interrupt at any moment. Agent pauses, user clicks, agent resumes from the new state. No fork, no conflict.
    • The agent can hand back when stuck. "I need a 2FA code" — agent tells the user, user enters it in their browser, agent continues.
    • Skills (procedures) are recorded by demonstration. User performs an action once with the agent watching; the agent saves a signed procedure for replay.

    agent-browser, chrome-devtools-mcp, Playwright MCP, Browserbase, Stagehand, OpenAI Atlas — none of these support this co-presence model. They all assume agent OR user, not agent AND user, switching seats.

    This is the moat. The roadmap protects and amplifies it.


    What we already lead on

    Verified via measurements + capability audit:

    | Capability | Enquire | agent-browser | chrome-devtools-mcp |
    |---|---|---|---|
    | User's logged-in browser | ✓ | ✗ | ✗ |
    | Latency per call | <100 ms | ~360 ms (subprocess spawn) | ~150 ms |
    | Procedure recording with cryptographic signing | ✓ (Ed25519) | ✗ | ✗ |
    | MCP-native discovery | ✓ | ✗ (skill-based) | ✓ |
    | Multi-LLM provider built-in | ✓ (5 providers + Ollama) | n/a | n/a |
    | Tier-routed LLM (cheap/local for procedures, capable for exploration) | ✓ | n/a | n/a |
    | Token efficiency vs snapshot-y MCP | 8.24× cheaper than cdp (measured) | 17× cheaper | baseline |
    | Auth abstraction (@enqr/auth-core) | ✓ | n/a | n/a |
    | Cost telemetry per call (managed billing) | ✓ | ✗ | ✗ |

    Latency is the underrated win. At <100 ms per call versus ~360 ms, Enquire's HTTP+WS path is already 3-4× faster than CLI subprocess spawning. For chat UIs that's the difference between fluid and laggy.


    What we add (and why we don't drop scope)

    The rule: a browser agent has to cover what users actually do. If we shrink scope, we cede the use case to whoever covers more. Better to expand carefully and let procedures + marketplace amplify what we have.

    A. Trim MCP overhead (cost metadata to server)

    Problem: Every Enquire MCP response carries a cost block (~100-150 B):

    {
      "result": {
        "content": [...],
        "cost": {"mode": "byo", "model": "claude-sonnet-4-6", "tokens_in": 0, ... }
      }
    }

    That's pure overhead from the LLM agent's perspective. The agent doesn't act on it; it just reads it as input on the next turn.

    Fix:

    1. Bridge logs cost server-side via fire-and-forget POST to @enqr/api's /billing/report endpoint (already partially wired per the bridge audit). Wire the rest.
    2. Strip cost from MCP tool responses in production mode.
    3. Web dashboard /app/cost page shows aggregated cost from server-side logs (already shipped).
    4. Keep cost in dev mode behind an env flag for debugging.
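
    The fix steps above could be sketched roughly as follows. This is a minimal illustration, not the bridge's actual code: the names finalizeToolResult, BridgeOptions, includeCost, and reportCost are hypothetical, and only the content and cost fields come from the example payload shown earlier.

```typescript
// Hypothetical shapes: only `content` and `cost` come from the example payload.
interface CostBlock {
  mode: string;
  model: string;
  tokens_in: number;
  [extra: string]: unknown;
}

interface ToolResult {
  content: unknown[];
  cost?: CostBlock;
}

interface BridgeOptions {
  // Assumed dev-mode flag, e.g. derived from an environment variable.
  includeCost: boolean;
  // Assumed reporter; a real bridge would POST this to /billing/report.
  reportCost: (cost: CostBlock) => void;
}

function finalizeToolResult(result: ToolResult, opts: BridgeOptions): ToolResult {
  if (result.cost) {
    // Fire-and-forget: a reporting failure must never fail the tool call.
    try {
      opts.reportCost(result.cost);
    } catch {
      // Swallowed on purpose in this sketch.
    }
  }
  if (opts.includeCost) return result;
  // Return a copy without `cost` so the caller's object is not mutated.
  const { cost: _cost, ...rest } = result;
  return rest;
}
```

    In this sketch the cost is always reported server-side, and the flag only controls whether the agent also sees it inline.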

    Files:

    • packages/mcp-browser-bridge/src/v1/mcp-server.ts:130-151 — conditionally omit cost from response payload
    • packages/mcp-browser-bridge/src/v1/billing.ts:86-123 — already has fire-and-forget reporting; verify wiring
    • packages/api/src/routes/billing.ts (or equivalent) — receive + store cost reports per-user-per-call

    Saves: ~100 B × N tool calls per agent loop. On a 20-step flow that is ~2 KB, or ~500 tokens reclaimed assuming roughly 4 bytes per token.

    B. Procedure system + marketplace

    Status: core infra already shipped (lib/skills/):

    • executor.ts — runs procedures with validate predicates, fallbacks
    • schema.ts — YAML frontmatter, step types
    • signing.ts — Ed25519 verification
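
    To make "validate predicates, fallbacks" concrete, here is a minimal sketch of how such an executor loop might behave. The Step and RunReport shapes and the runProcedure name are assumptions for illustration; the real step types live in schema.ts and are not reproduced here. The sketch is synchronous for brevity, whereas real browser steps would be asynchronous.

```typescript
// Hypothetical step shape, not the schema from lib/skills/schema.ts.
interface Step<S> {
  name: string;
  run: (state: S) => S;
  // Optional predicate checked after the step; false counts as failure.
  validate?: (state: S) => boolean;
  // Optional alternative tried once if `run` throws or validation fails.
  fallback?: (state: S) => S;
}

interface RunReport<S> {
  ok: boolean;
  state: S;
  // Names of completed steps, marked when the fallback path was used.
  trace: string[];
  failedAt?: string;
}

// Runs one candidate; returns undefined on a throw or a failed validation.
// (Sketch limitation: a state type that includes undefined is not supported.)
function attempt<S>(fn: (s: S) => S, state: S, validate?: (s: S) => boolean): S | undefined {
  try {
    const next = fn(state);
    return !validate || validate(next) ? next : undefined;
  } catch {
    return undefined;
  }
}

function runProcedure<S>(steps: Step<S>[], initial: S): RunReport<S> {
  let state = initial;
  const trace: string[] = [];
  for (const step of steps) {
    let next = attempt(step.run, state, step.validate);
    let label = step.name;
    if (next === undefined && step.fallback) {
      next = attempt(step.fallback, state, step.validate);
      label = `${step.name} (fallback)`;
    }
    if (next === undefined) {
      // Stop at the first unrecoverable step and keep the last good state.
      return { ok: false, state, trace, failedAt: step.name };
    }
    state = next;
    trace.push(label);
  }
  return { ok: true, state, trace };
}
```

    Returning the last good state on failure is one possible design choice: it gives the agent or the user a known point from which to take over.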

    What's missing:

    1. Recording UI. Extension records the user's actions with the agent watching. Output: a draft SKILL.md blob. (Files: new entrypoints/sidepanel/RecordPanel.tsx, hook into lib/skills/.)

    2. Library browser. "Procedures" page in options/sidepanel showing local + cloud procedures, organized by domain. Already a route in the web dashboard at /app/procedures — need to mirror it in the extension.

    3. Marketplace catalog. Public registry of community-contributed skills, indexed by domain (google.com/sheets, github.com/PR, notion.so/database). User installs a procedure → it's verified via Ed25519 signature, cached locally.

    4. Submission flow. Author signs their procedure, submits via npx @enqr/skills publish ./my-skill.md. Server stores + indexes.

    5. Discovery. Extension sees cnn.com open → auto-suggests "5 community skills available for cnn.com" → user picks one.

    6. Starter pack. Pre-built common procedures shipped with v1:

      • Google Sheets: fill-cells, append-row, format-range
      • Gmail: send-with-template, archive-by-search
      • GitHub: review-PR, fork-and-clone, file-issue
      • Notion: append-to-database-row, create-doc
      • LinkedIn: send-connection, message-template
      • Calendar: schedule-event, find-free-time

    This is the marketplace. It's the long-term moat: as the library grows, Enquire becomes the default way to "do X on Y site" because the recipe already exists.
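
    The discovery step (item 5) comes down to matching the open page against the catalog's domain index. A minimal sketch, assuming catalog keys in the "google.com/sheets" style listed in item 3; the CatalogEntry shape and the suggestSkills name are hypothetical, and only the host part of each key is matched here.

```typescript
// Hypothetical catalog entry; `domainKey` follows the "google.com/sheets" style.
interface CatalogEntry {
  domainKey: string;
  name: string;
}

// Lower-cased hostname of the page, without a leading "www.".
function hostOf(pageUrl: string): string {
  return new URL(pageUrl).hostname.toLowerCase().replace(/^www\./, "");
}

function suggestSkills(pageUrl: string, catalog: CatalogEntry[]): CatalogEntry[] {
  const host = hostOf(pageUrl);
  return catalog.filter((entry) => {
    const keyHost = entry.domainKey.split("/")[0].toLowerCase();
    // Match the exact host or any subdomain (docs.google.com matches google.com).
    return host === keyHost || host.endsWith(`.${keyHost}`);
  });
}
```

    Matching on the dot-prefixed suffix rather than a plain suffix keeps a lookalike host such as notgoogle.com from matching google.com entries.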

    Files (new):

    • packages/skills-registry/ — server for catalog + submission API
    • lib/skills/registry-client.ts — extension-side fetch + cache
    • entrypoints/sidepanel/SkillBrowser.tsx — user-facing browse + install
    • entrypoints/sidepanel/SkillRecorder.tsx — record-by-demonstration

    C. Plan-ahead batch execution

    The request, in a user's words: "it can go on doing cli approach and come back with the result." Put differently, users want batch execution: the agent plans N steps ahead and the bridge runs them as one call, returning only the final state.

    Procedures already do a version of this. The plan is to generalize it:

    1. New MCP tool: execute_plan(steps: ToolCall[]). Single MCP call that takes an array of [tool, args] and runs them in sequence. Returns final tool result + summary of intermediate results.

    2. New MCP tool: run_procedure(name, args). Looks up a signed procedure by name, executes it. (Already in skills executor; expose as MCP tool.)

    3. Why this matters for tokens: instead of N agent turns each with their own tool result baked into context, the agent issues one execute_plan and gets one consolidated result. Reduces round-tripping.
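
    A minimal sketch of what execute_plan might do, assuming the [tool, args] pair format described in item 1. The ToolFn registry, the PlanResult shape, and the stop-on-first-failure behavior are assumptions, not the shipped design. The sketch is synchronous for brevity; a real bridge would await each tool call.

```typescript
// Hypothetical types: the [tool, args] pair mirrors the description in item 1.
type ToolArgs = Record<string, unknown>;
type ToolCall = [tool: string, args: ToolArgs];
type ToolFn = (args: ToolArgs) => unknown;

interface PlanResult {
  ok: boolean;
  // Result of the last step that succeeded.
  final: unknown;
  // One short line per attempted step, instead of every full intermediate result.
  summary: string[];
}

function executePlan(steps: ToolCall[], tools: Record<string, ToolFn>): PlanResult {
  const summary: string[] = [];
  let final: unknown = undefined;
  for (const [tool, args] of steps) {
    // Own-property check so names like "toString" are not treated as tools.
    const fn = Object.prototype.hasOwnProperty.call(tools, tool) ? tools[tool] : undefined;
    if (!fn) {
      summary.push(`${tool}: unknown tool`);
      return { ok: false, final, summary };
    }
    try {
      final = fn(args);
      summary.push(`${tool}: ok`);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      summary.push(`${tool}: failed (${reason})`);
      return { ok: false, final, summary };
    }
  }
  return { ok: true, final, summary };
}
```

    The token saving comes from the summary: the agent receives one line per intermediate step plus the final result, rather than N full tool results across N turns.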

    Files:

    • packages/mcp-browser-bridge/src/v1/mcp-server.ts — register execute_plan, run_procedure as new tools
    • Bridge → tunnel relay carries the plan to the extension
    • Extension's lib/skills/executor.ts runs the plan locally (already exists for procedures)

    D. Missing browser-action coverage

    Audit gap from agent-browser comparison. None of these are big lifts; all are within the WXT extension's reach:

    | Action | Status | Files to add |
    |---|---|---|
    | Screenshot (visible viewport + full page) | Missing in v1 MCP surface | lib/builtin-tools/screenshot.ts (already exists; just expose via bridge) |
    | Semantic locators (find_by_role, find_by_text, find_by_label) | Missing | New tools in lib/builtin-tools/semantic-locator.ts |
    | File upload | Missing | lib/builtin-tools/upload.ts: wraps chrome.scripting + <input type=file> |
    | Drag-and-drop | Missing | lib/builtin-tools/drag.ts: CDP Input.dispatchMouseEvent chain |
    | Frame switching | Missing for nested iframes | lib/builtin-tools/frame.ts |
    | PDF export | Missing | lib/builtin-tools/pdf.ts: CDP Page.printToPDF |
    | Mobile emulation | Missing | lib/builtin-tools/emulate.ts: CDP Emulation.setDeviceMetricsOverride (already partial) |
    | Cookie/storage CRUD | In-extension only | Expose via MCP tools |
    | Performance traces | Missing | lib/builtin-tools/perf.ts: CDP Tracing.start/end |
    | Network mocking | Missing | lib/builtin-tools/network-mock.ts: CDP Fetch.enable + requestPaused |

    These are tools, not new infrastructure. Each is ~50-200 LOC against existing CDP plumbing in lib/builtin-tools/cdp-manager.ts.
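
    As an example of how small these tools can be, here is a sketch of the "Input.dispatchMouseEvent chain" a drag tool might build: press at the source, a few interpolated moves, release at the target. The buildDragCommands name and the CdpCommand shape are hypothetical; the method name and the type, x, y, button, and clickCount parameters are from the Chrome DevTools Protocol. How the commands get sent (for instance through cdp-manager.ts) is left out.

```typescript
interface Point {
  x: number;
  y: number;
}

type MouseEventType = "mousePressed" | "mouseMoved" | "mouseReleased";

// Hypothetical wrapper type for one CDP command to be sent later.
interface CdpCommand {
  method: "Input.dispatchMouseEvent";
  params: {
    type: MouseEventType;
    x: number;
    y: number;
    button: "left";
    clickCount?: number;
  };
}

function mouseEvent(type: MouseEventType, p: Point, clickCount?: number): CdpCommand {
  const params: CdpCommand["params"] = { type, x: p.x, y: p.y, button: "left" };
  if (clickCount !== undefined) params.clickCount = clickCount;
  return { method: "Input.dispatchMouseEvent", params };
}

// Press at `from`, move in `moveSteps` linear increments, release at `to`.
function buildDragCommands(from: Point, to: Point, moveSteps = 5): CdpCommand[] {
  const steps = Math.max(1, Math.floor(moveSteps));
  const cmds: CdpCommand[] = [mouseEvent("mousePressed", from, 1)];
  for (let i = 1; i <= steps; i++) {
    const t = i / steps;
    cmds.push(
      mouseEvent("mouseMoved", {
        x: from.x + (to.x - from.x) * t,
        y: from.y + (to.y - from.y) * t,
      }),
    );
  }
  cmds.push(mouseEvent("mouseReleased", to, 1));
  return cmds;
}
```

    Intermediate moves matter because many drag handlers only react after seeing pointer movement. Pages built on native HTML5 drag events may need more than this mouse-event chain.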

    E. Visual debugging

    agent-browser has --headed (already moot for us — the user's Chrome IS the visible browser) and highlight @e1 (visually highlight an element). The latter is a nice add:

    New tool: highlight(selector, durationMs) — renders a short-lived overlay on the matched element. Useful for "agent is working on this" UX feedback.

    Files: lib/builtin-tools/highlight.ts, which uses chrome.scripting to inject a colored outline and then auto-removes it.
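
    The in-page half of such a tool could be as small as the sketch below. The highlightElement name, the Outlinable interface, and the injected scheduler are assumptions made so the logic can run outside a browser; in the extension, the element would be looked up from the selector and the scheduler would be setTimeout.

```typescript
// Minimal structural type so the sketch does not depend on DOM typings.
interface Outlinable {
  style: { outline: string };
}

// Stand-in for setTimeout, injected so the behavior can be tested directly.
type Scheduler = (fn: () => void, ms: number) => unknown;

function highlightElement(
  el: Outlinable,
  durationMs: number,
  schedule: Scheduler,
  outline = "2px solid #ff9800",
): void {
  // Remember the existing outline so the page looks untouched afterwards.
  const previous = el.style.outline;
  el.style.outline = outline;
  schedule(() => {
    el.style.outline = previous;
  }, durationMs);
}
```

    Restoring the previous value, rather than clearing the outline, avoids clobbering focus rings or other styling the page already applies.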


    Why we don't bundle Chromium

    agent-browser ships with its own Chromium because it's a developer tool. It needs to work without depending on user state.

    If Enquire bundled Chromium, we'd lose:

    • The user's logged-in sessions
    • The user's cookies, history, autofill
    • The user's other extensions
    • The user's local Chrome profile and bookmarks
    • The "user takes over" handoff (because the user would be in their Chrome, agent would be in bundled Chrome)

    That's the entire product gone.

    What we could ship: a separate enqr-headless SKU (Tauri or Electron + bundled Chromium) for CI/server use cases where there is no user. That's a different product. Not v1, maybe v2 or later. The core "Enquire" product stays as-is.

    Version mismatch in the user's Chrome can be mitigated by:

    1. Pinning manifest_version + tested Chrome versions
    2. Auto-suggesting Chrome update when extension detects an unsupported version
    3. CI matrix testing across stable/beta/canary Chrome
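
    The detection in step 2 could be sketched as below, reading the major version from the user-agent string. The SupportedRange shape, the status names, and the version numbers in any usage are placeholders; the real range would come from the CI matrix in step 3.

```typescript
// Assumed configuration; real values would come from the tested CI matrix.
interface SupportedRange {
  minMajor: number;
  maxTestedMajor: number;
}

type VersionStatus = "supported" | "too-old" | "untested" | "unknown";

// Extracts the major version from a "Chrome/<major>.<minor>..." token.
function parseChromeMajor(userAgent: string): number | undefined {
  const match = /Chrome\/(\d+)\./.exec(userAgent);
  return match ? Number(match[1]) : undefined;
}

function checkChromeVersion(userAgent: string, range: SupportedRange): VersionStatus {
  const major = parseChromeMajor(userAgent);
  if (major === undefined) return "unknown";
  if (major < range.minMajor) return "too-old"; // suggest updating Chrome
  if (major > range.maxTestedMajor) return "untested"; // newer than the CI matrix
  return "supported";
}
```

    Keeping "untested" separate from "too-old" lets the extension warn softly on a brand-new Chrome release instead of telling the user to update a browser that is already current.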

    Multi-session — we punt on it

    agent-browser has --session a / --session b for parallel browsers. We have multi-tab. Trade-off:

    • agent-browser's parallel sessions are isolated profiles — useful when scraping different sites with different login state
    • Enquire's multi-tab is one profile — same login state, but multiple tabs/origins concurrent

    Decision: keep multi-tab, defer multi-session. Multi-tab covers ~95% of real user workflows (most users don't need 5 different Twitter accounts open). If we need multi-session later, we can build it as separate Chrome profiles + separate extension instances — but not v1 priority.


    Critical path to v1.0 ship

    In dependency order. Each step unblocks the next:

    1. Production auth hardening

    Better Auth is wired through @enqr/auth-core for the API and bridge. Remaining work is operational: production secret rotation, trusted-origin review, and extension sign-in smoke tests across signed beta builds.

    2. Billing and cost reporting

    Stripe webhooks and weekly ceiling surfaces exist in the API. The bridge strips inline cost by default and reports usage server-side. Remaining work is reconciliation/alerting and validating that dashboard cost views match bridge reports under real managed-mode traffic.

    3. Procedure sync hardening

    The extension has local procedures, captured traces, and MCP CRUD/run tools. The dashboard has cloud-visible procedure records and edit/history/delete controls. The remaining product slice is reliable extension-to-cloud sync with conflict handling that does not surprise local-first users.
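
    One way to get conflict handling that does not surprise local-first users is to compare both copies against the last synced revision and never overwrite when both sides have changed. The sketch below illustrates that idea only: the ProcedureRecord shape, the rev field, and the "-local-copy" naming are assumptions, not the shipped sync protocol.

```typescript
// Hypothetical record; `rev` is an assumed revision marker such as a content hash.
interface ProcedureRecord {
  id: string;
  body: string;
  rev: string;
}

type SyncDecision =
  | { action: "noop" }
  | { action: "push" } // only the local copy changed
  | { action: "pull" } // only the cloud copy changed
  | { action: "conflict"; keepLocalAs: string };

function decideSync(
  local: ProcedureRecord,
  remote: ProcedureRecord,
  lastSyncedRev: string,
): SyncDecision {
  if (local.rev === remote.rev) return { action: "noop" };
  const localChanged = local.rev !== lastSyncedRev;
  const remoteChanged = remote.rev !== lastSyncedRev;
  if (localChanged && !remoteChanged) return { action: "push" };
  if (!localChanged && remoteChanged) return { action: "pull" };
  // Both sides diverged: keep the local copy under a new id, never overwrite it.
  return { action: "conflict", keepLocalAs: `${local.id}-local-copy` };
}
```

    Surfacing a conflict as a second, clearly named copy keeps the user's local edits visible and leaves the merge decision to them.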

    4. Railway deployment

    Files: railway.toml for both @enqr/bridge and @enqr/api services. Postgres provisioning. Env vars set. Domain + DNS config for bridge.enqr.dev + api.enqr.dev.

    5. Vercel deployment for web

    enquire-web deploys as the web/dashboard repo. Env vars are set from the Vercel dashboard. Domains: enqr.dev, www.enqr.dev, and app.enqr.dev.

    6. Chrome Web Store submission

    wxt build --mode store produces a store-ready bundle (already exists). Still needed: privacy policy, screenshots, and store listing copy. Allow ~5-7 days for Google's review.

    Until review completes, install docs should use signed beta and load-unpacked instructions instead of a placeholder store link.


    Post-v1.0 expansion

    In rough priority order:

    Sprint 1 — close action coverage (1-2 weeks)

    Section D's missing tools: screenshot, semantic locators, file upload, drag, frames, PDF, perf, network-mock. Each independently shippable. Most-used first: screenshot, semantic locators, file upload.

    Sprint 2 — procedure recording UX (1 week)

    Section B(1)+(2). Extension UI to record-by-demonstration and browse local procedures. Server-side cloud sync already wired.

    Sprint 3 — skills marketplace (2-3 weeks)

    Section B(3-6). New packages/skills-registry/ package, web UI for browsing, submission CLI, signature verification, starter pack of ~30 common procedures across Google/GitHub/Notion/Slack.
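
    The signature verification piece can be illustrated with Node's built-in crypto module, as a registry server or publish CLI might use it. This is a sketch under assumptions: verifyProcedure and the base64 signature encoding are hypothetical, the key pair is a throwaway generated for the example, and the extension itself would need a browser-side equivalent rather than node:crypto.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Returns true only if `signatureB64` is a valid Ed25519 signature of `body`.
function verifyProcedure(body: string, signatureB64: string, publisherKey: KeyObject): boolean {
  try {
    // Ed25519 has no separate digest step, so the algorithm argument is null.
    return verify(
      null,
      Buffer.from(body, "utf8"),
      publisherKey,
      Buffer.from(signatureB64, "base64"),
    );
  } catch {
    // Malformed input is treated as a failed verification.
    return false;
  }
}

// Throwaway key pair standing in for a publisher's real key.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");
const skill = "name: append-row\nsteps: []\n";
const signature = sign(null, Buffer.from(skill, "utf8"), privateKey).toString("base64");

const installable = verifyProcedure(skill, signature, publicKey);
const tampered = verifyProcedure(skill + "# edited", signature, publicKey);
```

    Here installable is true and tampered is false: any change to the procedure body after signing invalidates the signature, which is what lets an installed procedure be cached and replayed with confidence.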

    Sprint 4 — plan-ahead batch (1 week)

    Section C. New MCP tools execute_plan + run_procedure. Documented patterns for agents to use these for multi-step flows.

    Sprint 5 — polish + observability (1 week)

    • Prometheus metrics on bridge
    • Better error codes + user-facing messages
    • Highlight tool for visual feedback
    • Audit log UI on web

    Positioning to use externally

    One-liner for marketing/landing:

    Enquire — the AI agent that lives in your browser, with your accounts, that you can take over from at any moment.

    Three-line pitch:

    Other AI browser tools spawn a clean Chromium and start from zero. Enquire runs in your Chrome, with your cookies, your sessions, your 2FA tokens. The agent picks up where you left off. You can pause it mid-flow, click around yourself, hand it back. Procedures it learns from you stay yours, signed, replayable.

    Comparison table for landing page (one screen):

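    | | Enquire | Atlas | agent-browser | Playwright MCP |
    |---|---|---|---|---|
    | Uses your Chrome + accounts | ✓ | ✓ | ✗ | ✗ |
    | Lets you take over mid-flow | ✓ | ✗ | ✗ | ✗ |
    | External agents can drive it (MCP) | ✓ | ✗ | (CLI not MCP) | ✓ |
    | Records procedures from your demos | ✓ | ✗ | ✗ | ✗ |
    | Marketplace of community skills | coming | ✗ | ✗ | ✗ |
    | Open source | ✓ | ✗ | ✓ (Apache) | ✓ |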

    The "lets you take over mid-flow" row is the headline: only Enquire has a check there. That's the moat.


    What stays out of scope (for now)

    To keep the v1 ship date realistic:

    • Mobile (iOS/Android) — far future
    • Standalone browser (Atlas-style) — separate product if ever
    • Headless CI mode — separate product, see "Why we don't bundle Chromium"
    • Multi-account / multi-session — not v1
    • Workflow visual editor (drag-and-drop boxes) — procedures are YAML for now
    • Native messaging host (Tauri menu-bar) — was Phase 5 of old plan, still deferred

    These are not no's. They are not-yets.


    Open questions worth raising

    1. Pricing. BYO is free; Managed has a weekly cost ceiling. What are the right defaults? ($5/week starter, $20/week pro, $100/week team?) Need real dogfood data.

    2. Marketplace economics. Free skills only? Paid skills with revenue share? Premium skills (e.g. "Salesforce report generator")? Defer the answer; build the infra to support either.

    3. Privacy stance. Enquire sees the user's whole browser. We need a clear privacy policy: what gets sent to the bridge, what stays local, what's encrypted, and how long data is retained. A draft already exists in the setup page; it needs legal review pre-launch.

    4. Browser support. Chrome only for v1. Firefox + Edge later via WXT's multi-browser support — most code is portable. Cost: testing matrix.