The web in the age of AI
We used to browse by opening pages and stitching answers together. Now we ask, and an answer arrives—often with a concrete plan attached. The unit of experience shifts from navigation to intent. The old stack remains, but the principles that made it robust—declarative descriptions, careful reconciliation, and explicit effects—matter even more.
Updated 17 Sep 2025
From DOM trees to intent surfaces
The web’s execution pipeline has stayed constant for decades: bytes arrive, parsers tokenize, trees form, layout and paint follow, and the result becomes interactive through events and state. What changed is the front door. People increasingly begin with outcomes rather than destinations, and expect the system to assemble the path from their own data, tools, and history without requiring them to navigate manually.
Interfaces once optimized for wayfinding—menus, breadcrumbs, secondary sidebars—now share space with answer surfaces that capture intent in natural language and return structured plans. The goal is not to hide the web’s mechanics; it is to present a better abstraction. We collect what the visitor wants, map it to a constrained schema, and expose a plan that a human can inspect and a machine can execute without confusion.
This shift rewards teams that already think declaratively. In the DOM, you specify the state and rely on a diff to compute minimal edits. Intent surfaces follow the same philosophy: specify the goal, compute the fewest verified steps that reach it, and keep the plan legible. The primitives are familiar—data, controls, side‑effects—only the author of the first draft has changed from designer to assistant.
A practical consequence is tighter coupling between copy, controls, and capabilities. The words that describe an action must match a tool that can perform it and a policy that allows it. If the interface suggests a booking, the platform must define the parameters, validation rules, and checkpoints. This alignment is the backbone of modern answer surfaces and the reason ambiguous wizards are being retired.
None of this removes links, search, or classic navigation. It complements them. Some journeys still require exploration and comparison. Intent surfaces absorb the repetitive, well‑structured tasks where a plan can be synthesized, reviewed, and executed safely. The craft lies in knowing which paths benefit from explanation and which deserve direct execution with guardrails, receipts, and a clear undo.
Why React’s mental model still matters
React taught teams to separate inputs from memory and view rendering as a pure function of state. Even if a model drafts components, that output still needs a predictable runtime to schedule work, batch updates, and clean up effects. Hooks, concurrent rendering, and suspense were designed for precisely that: making side‑effects explicit and letting the scheduler prioritize what feels most urgent to the user.
Assistants intensify these requirements. A generated component can be fluent but must remain composable and safe. The reconciliation step prevents wasteful DOM churn; the state model keeps accidental global variables from accumulating; the rules around effects make mutations visible and auditable. Without those constraints, assistant‑drafted interfaces become fragile text dumps rather than stable programs.
The deeper lesson is that declarations scale better than imperative histories. When a user edits a plan, we calculate the delta and apply it, rather than replaying every intermediate intent. That mirrors how React compares trees. The diff is not an optimization detail; it is the foundation for latency hiding, cancellation, and the feeling of responsiveness that users notice within the first seconds of contact.
Teams that align their data and component boundaries with intent boundaries move faster. A distinct component should own the confirmation step; a separate module should own tool invocation; the rest of the tree should remain pure and re‑render cheaply. That partitioning enables testing at the seams: we can render the plan view without hitting external systems, while still verifying that a confirmed plan calls the right tools with specific arguments.
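As a minimal sketch of that partitioning in React and TypeScript: the plan view stays pure, a separate component owns confirmation, and tool invocation hides behind an injected executor so tests can pass a fake and assert on its arguments. The Plan shape and component names here are hypothetical.

```tsx
// A sketch of intent-aligned component boundaries (names are hypothetical).
// PlanView stays pure: it renders a plan and never calls external systems.
// ConfirmPlan owns the confirmation step and delegates execution to an
// injected executor, so each seam can be tested in isolation.
import React, { useState } from "react";

export interface Plan {
  id: string;
  summary: string;
  steps: string[];
}

// Pure presentation: cheap to re-render, trivial to test with fixtures.
export function PlanView({ plan }: { plan: Plan }) {
  return (
    <section>
      <h2>{plan.summary}</h2>
      <ol>
        {plan.steps.map((step, i) => (
          <li key={i}>{step}</li>
        ))}
      </ol>
    </section>
  );
}

// The only component allowed to trigger side effects, and only after an
// explicit click; the executor is passed in rather than imported.
export function ConfirmPlan({
  plan,
  execute,
}: {
  plan: Plan;
  execute: (plan: Plan) => Promise<void>;
}) {
  const [status, setStatus] = useState<"idle" | "running" | "done">("idle");
  return (
    <div>
      <PlanView plan={plan} />
      <button
        disabled={status !== "idle"}
        onClick={async () => {
          setStatus("running");
          await execute(plan); // tool invocation lives behind this boundary
          setStatus("done");
        }}
      >
        Confirm
      </button>
    </div>
  );
}
```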
Ultimately, assistants do not replace the React mental model—they rely on it. The model gives us vocabulary to express where data comes from, how it flows, and which places are allowed to cause change. Answer surfaces extend that vocabulary to intent and plans, but the underlying grammar remains familiar to any engineer who has learned to separate props, state, and effects into testable, composable parts.
Interfaces that listen to intent
Classical forms ask the visitor to translate goals into fields and choices. Answer surfaces reverse that burden: the system listens first, proposes a structured interpretation, and asks for confirmation before doing anything with external effects. The difference is not only ergonomic; it reduces error by keeping the human in the loop at the moment where ambiguities become visible and easy to correct.
Example answer surface: “I can book this for you. Here’s what I understood:” Forms collect inputs; intent surfaces extract them, then ask for confirmation.
An intent surface becomes robust when it pairs language with a schema. The schema declares required properties, allowable ranges, defaults, and the shape of justifications. By binding free‑form input to typed data, we gain a shared representation that both UI components and agents can use. The result is not magic; it is disciplined data modeling applied at the front of the experience rather than only at the backend.
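As one concrete illustration, here is a minimal sketch of such a schema using the zod validation library; the travel-booking intent and its fields are hypothetical.

```ts
// A sketch of binding free-form intent to typed data with zod
// (the intent and its fields are hypothetical).
import { z } from "zod";

export const BookTripIntent = z.object({
  destination: z.string().min(1),
  departDate: z.string().regex(/^\d{4}-\d{2}-\d{2}$/), // ISO date
  returnDate: z.string().regex(/^\d{4}-\d{2}-\d{2}$/).optional(),
  travelers: z.number().int().min(1).max(9).default(1),
  budgetUsd: z.number().positive().optional(),
  justification: z.string().optional(), // why the plan chose these values
});

export type BookTripIntent = z.infer<typeof BookTripIntent>;

// The extractor (a model, a form, or both) produces unknown data;
// the schema decides whether it becomes a plan or an error message.
export function parseIntent(raw: unknown) {
  const result = BookTripIntent.safeParse(raw);
  return result.success
    ? { ok: true as const, intent: result.data }
    : { ok: false as const, issues: result.error.issues };
}
```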
Confirmation is the second pillar. Instead of sending a sequence of opaque calls, we render a plan a person can read in plain language, along with the structured payload that tools will receive. That recap lowers the cognitive load on complex decisions and lowers the risk of unintended actions. It also generates an artifact that is trivial to log, search, and review later when questions arise.
The third pillar is editability. An answer surface should make it effortless to adjust constraints—budgets, dates, thresholds—without re‑explaining the entire task. Edits should update the structured payload directly, re‑run the planning step, and reflect the result instantly. People feel in control when they can steer a plan rather than restart a conversation for every minor change in preference or context.
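A sketch of that edit loop, with a hypothetical trip intent and a replan callback standing in for the planning step:

```ts
// A sketch of plan editing (types and planner are hypothetical): merge the
// edit into the structured payload, then re-run planning, so a small change
// never forces the person to restart the conversation.
export interface TripIntent {
  destination: string;
  departDate: string; // ISO date
  travelers: number;
  budgetUsd?: number;
}

export interface TripPlan {
  intent: TripIntent;
  steps: string[];
}

export async function applyEdit(
  current: TripPlan,
  edit: Partial<TripIntent>,
  replan: (intent: TripIntent) => Promise<TripPlan>
): Promise<TripPlan> {
  // Edits patch the payload directly; the task is never re-explained.
  const nextIntent = { ...current.intent, ...edit };
  // Re-running the planning step keeps the visible plan and the payload in sync.
  return replan(nextIntent);
}

// e.g. const updated = await applyEdit(plan, { budgetUsd: 1200 }, replan);
```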
Finally, the intent layer should degrade gracefully. When tool permissions are missing or context is insufficient, we fall back to citations, comparison tables, or curated links rather than failing silently. The goal is not to replace reading; it is to remove drudgery. When the answer is a plan, show it; when the answer is a source, present it with enough structure that the next step remains obvious.
Bridging UI and agents
As interfaces become agentic, UI work bleeds into specification work. The same copy that guides a human also guides a tool‑calling agent, and the same constraints that keep a plan safe for a person keep it safe for an API. Bringing both worlds together requires shared contracts: schemas for inputs, capabilities for tools, and a trace that binds the description to the actions that followed.
We treat the manifest that powers answer surfaces as a first‑class artifact. It lists intents, parameters, defaults, validation rules, and the scopes an agent must hold to execute them. On the other side, our tool servers expose well‑typed functions with telemetry baked in. When the manifest and tools share names and shapes, plans become portable between UI and automation without translation layers that drift over time.
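One way to express that shared contract, with illustrative field names, is a pair of types plus a binding step that fails fast when the two sides drift:

```ts
// A sketch of the manifest/tool contract idea (field names are illustrative).
// The manifest is the human-facing description; the tool is the machine-facing
// capability. Sharing the same intent name and parameter shape on both sides
// is what lets plans move between UI and automation without translation.
export interface IntentManifestEntry<P> {
  name: string;                      // e.g. "book_trip"
  description: string;               // copy shown on the answer surface
  defaults: Partial<P>;
  requiredScopes: string[];          // e.g. ["bookings:write"]
  validate: (params: unknown) => params is P;
}

export interface ToolDefinition<P, R> {
  name: string;                      // must match the manifest entry
  scopes: string[];
  run: (params: P) => Promise<R>;
}

// Registration asserts that names line up, so a mismatch fails at startup
// instead of surfacing as a confusing runtime error in production.
export function bind<P, R>(
  manifest: IntentManifestEntry<P>,
  tool: ToolDefinition<P, R>
) {
  if (manifest.name !== tool.name) {
    throw new Error(`manifest/tool name mismatch: ${manifest.name} vs ${tool.name}`);
  }
  return { manifest, tool };
}
```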
The most effective teams keep a human‑legible description and a machine‑legible schema in the same directory and the same revision control story. A designer can change copy in the manifest and a platform engineer can evolve an input type in the schema, but both changes ship together. That practice removes an entire class of mismatches that cause late‑night rollbacks and confused users.
Bridging also means accountability. Every plan should carry a unique identifier that threads through UI, logs, and external side‑effects. When someone asks, “why did the system do this?”, we can show the intent, the confirmation, the tool calls, and their results as a single packet. That level of traceability turns assistant features from demos into production systems that satisfy operational and legal scrutiny.
Finally, avoid conflating generation with execution. Plans are proposals; tools are capabilities. Keeping them separate makes it easier to re‑run a plan with different constraints, test it in sandboxes, and reason about security. A clean handoff point—often a signed JSON document—keeps the boundary bright and the responsibilities clear.
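A minimal sketch of such a handoff, assuming Node's crypto module and an HMAC signature; a real deployment would choose its own key management and canonicalization rules:

```ts
// A sketch of the handoff boundary: the confirmed plan is serialized with
// sorted keys and signed, and the executor verifies the signature before any
// tool runs. Key handling and plan shape are simplified for illustration.
import { createHmac, timingSafeEqual } from "node:crypto";

function canonicalize(value: unknown): string {
  // Deterministic JSON: object keys are sorted so the same plan
  // always serializes to the same bytes.
  return JSON.stringify(value, (_key, v) =>
    v && typeof v === "object" && !Array.isArray(v)
      ? Object.fromEntries(Object.entries(v).sort(([a], [b]) => a.localeCompare(b)))
      : v
  );
}

export function signPlan(plan: object, secret: string) {
  const body = canonicalize(plan);
  const signature = createHmac("sha256", secret).update(body).digest("hex");
  return { body, signature };
}

export function verifyPlan(body: string, signature: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(body).digest("hex");
  return (
    expected.length === signature.length &&
    timingSafeEqual(Buffer.from(expected), Buffer.from(signature))
  );
}
```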
Playbooks for shipping this future
Start with a small set of intents attached to clear business outcomes. Instrument them from day one: track how often people accept plans, where they edit, and which steps generate errors. Observability turns anecdotes into evidence, and evidence makes prioritization easier when the number of possible surfaces exceeds the capacity of any single team.
Build a shared library for schema definitions and confirmation views, then use it everywhere. When product lines share the same grammar for plans—names, parameters, constraints—new features become a matter of filling gaps rather than inventing formats. A small, boring vocabulary accelerates work because every stakeholder knows what a given field means across products and contexts.
Establish an approval workflow for tool scopes. Read‑only tools ship by default; write tools require an explicit review by the group that owns the underlying system. The goal is not red tape, but clarity: every irreversible action should have a named owner, a rollback playbook, and a rate limit that prevents cascading failures across tenants.
Treat latency as a feature. Stream partial results to keep people oriented, reveal the shape of the plan before execution, and show progress on long‑running steps. If the system cannot act yet, it should still explain what it has learned and which information would unlock the next step. Perceived speed is as important as absolute speed.
Finally, keep the wall between generation and effect. Test plan generation in isolation with fixtures. Test tool servers with contract tests against sandboxes. Then test the handshake. When each piece can prove its behavior, integration becomes about wiring rather than guesswork, and incidents become rare and explainable rather than mysterious and irreproducible.
Objections & replies
“Isn’t this just chat everywhere?” Conversation is an input method, not the product. The product is a contract that pairs language with typed parameters and explicit confirmation. In places where free‑form dialog hides complexity, we prefer forms and tables. In places where the intent is obvious, we prefer buttons. Answer surfaces choose what fits the job instead of forcing a single interaction style.
“Streaming feels gimmicky.” Streaming is a design tool for legible latency. It lets a system express partial certainty, show progress, and keep people oriented. When the first token appears quickly, people remain engaged even if the full plan takes seconds to compute. The alternative—blank screens—encourages abandonment and reduces trust in features that are perfectly capable but poorly staged.
“What about safety?” We separate capabilities into read and write tools, validate every payload against schemas, and require explicit confirmation for anything that crosses system boundaries. We also log every plan, tool call, and result with identifiers that connect to customer support and billing. Safety is not a toggle; it is a set of habits that make automation accountable to people.
“Won’t this remove exploration?” Answer surfaces are not a substitute for reading, browsing, or following curiosity. They remove friction where the goal is already known and repetitive. When the goal is learning, we present sources, comparisons, and trails. The skill is offering the right doorway for the moment—sometimes a plan, sometimes a set of perspectives—and switching gracefully when intent changes.
“Isn’t the cost too high?” The cost of scattered flows and rework is higher. Shared schemas, confirmation views, and tool contracts reduce duplication across teams. Once those pieces exist, new surfaces reuse infrastructure instead of inventing it. The first plan costs time; the tenth becomes routine; the hundredth becomes the platform’s default way to ship features that touch external systems.
Intent schemas and confirmations
Schemas turn intuition into contracts. They specify which fields a plan needs, which values are allowed, and which combinations must be rejected. By publishing schemas, we enable static analysis, better error messages, and safer defaults. People still write in natural language, but the system translates that language into data that code and auditors can reason about without guesswork.
A confirmation view is a living spec. It shows the proposed action in plain language, the structured payload beneath it, and the tool or endpoint that will receive it. Good confirmation views highlight assumptions, surface costs, and present alternatives when inputs look risky. The goal is to give people agency before effects occur, not excuses after the fact.
The craft sits in defaults. Most plans can inherit sane values from prior choices, account settings, or environment capabilities. Defaulting smartly reduces friction without erasing consent. When a default matters, the view should explain it and provide controls to revise it. Defaults are not secrets; they are design decisions made visible.
Versioning matters. Schemas and confirmation templates should carry versions tied to deployments. That allows us to re‑render historical plans exactly as they appeared at execution time and to reason about behavior across releases. A small, predictable versioning policy prevents a whole class of “works on my machine” failures that erode trust in assistant features.
Finally, treat plan generation as deterministic for the same inputs. Stochastic models can propose alternatives, but once a person confirms, the serialized plan becomes the source of truth. Determinism at the confirmation boundary is what makes replays, audits, and refunds possible when something goes wrong far from the interface that initiated the change.
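A small sketch of that boundary: once a person confirms, one serialized document is frozen and replays read it verbatim rather than asking the model again. The shapes here are illustrative.

```ts
// A sketch of freezing the plan at the confirmation boundary. The model may
// have proposed alternatives, but after confirmation the serialized document
// is the source of truth: execution, replay, and audit all read this string.
export interface ConfirmedPlan {
  readonly document: string;   // serialized payload, exactly as confirmed
  readonly confirmedAt: string;
}

export function confirm(payload: object): ConfirmedPlan {
  return Object.freeze({
    document: JSON.stringify(payload),
    confirmedAt: new Date().toISOString(),
  });
}

export function replay(plan: ConfirmedPlan): unknown {
  // Replays parse the stored document; they never regenerate the plan.
  return JSON.parse(plan.document);
}
```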
Streaming and legible latency
People perceive delays differently depending on what the interface reveals. A fast first token acknowledges the request and reduces anxiety. A visible plan sketch explains why a decision takes time. Progress indicators for tool calls show that work is occurring even when there is nothing to render yet. Each technique buys attention with honesty rather than placeholder animations that communicate nothing.
Streaming also changes error handling. When a step fails mid‑flight, the system can propose alternatives immediately instead of waiting for a full pipeline to finish. That responsiveness preserves context and keeps people willing to iterate. It also encourages smaller, composable steps because recovery is easier when failures are localized and visible rather than buried deep inside a monolithic request.
Budget latency the way you budget bytes. Track time to first token, time to plan, and time to useful. Optimize for the measure that drives satisfaction in your domain. In many cases, the ability to act on partial information beats a perfect answer that arrives after the user has already chosen a different path or given up entirely.
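A sketch of those three marks, using performance.now() and metric names that mirror the ones above; the reporting sink is left to the caller:

```ts
// A sketch of latency budgeting (names follow the text; reporting is up to you).
export interface LatencyMarks {
  requested: number;
  firstToken?: number;
  planReady?: number;
  useful?: number; // first moment the person could act on the output
}

export function startMarks(): LatencyMarks {
  return { requested: performance.now() };
}

export function mark(marks: LatencyMarks, name: "firstToken" | "planReady" | "useful") {
  if (marks[name] === undefined) marks[name] = performance.now();
}

export function report(marks: LatencyMarks) {
  const delta = (t?: number) => (t === undefined ? null : Math.round(t - marks.requested));
  return {
    timeToFirstToken: delta(marks.firstToken),
    timeToPlan: delta(marks.planReady),
    timeToUseful: delta(marks.useful),
  };
}
```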
Design for cancellation. As people type, scroll, or change constraints, work in flight should stop cleanly. That is as much an API discipline as a UI concern. Cooperative cancellation prevents stale updates, reduces load, and avoids confusing flashes where the interface oscillates between old and new plans without a clear transition.
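A sketch of that discipline with AbortController, assuming a hypothetical /api/plan endpoint:

```ts
// A sketch of cooperative cancellation: when constraints change, the previous
// planning request is aborted so a stale plan can never overwrite a newer one.
// The endpoint and payload shape are hypothetical.
let inFlight: AbortController | null = null;

export async function requestPlan(intent: unknown): Promise<unknown | null> {
  // Abort whatever was running; only the latest intent matters.
  inFlight?.abort();
  const controller = new AbortController();
  inFlight = controller;

  try {
    const response = await fetch("/api/plan", {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(intent),
      signal: controller.signal,
    });
    return await response.json();
  } catch (err) {
    // An aborted request is expected, not an error worth surfacing.
    if (err instanceof DOMException && err.name === "AbortError") return null;
    throw err;
  } finally {
    if (inFlight === controller) inFlight = null;
  }
}
```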
Finally, make streamed structure first‑class. Do not stream only prose; stream the evolving payload too. When the system learns a date, destination, or budget, reflect it immediately in the structured plan. That feedback loop makes people feel understood and lets other components react to partial certainty without waiting for a final paragraph to arrive.
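One way to sketch this is an event stream that interleaves prose deltas with payload patches; the event shape and field names here are hypothetical:

```ts
// A sketch of streaming structure alongside prose: the planner yields partial
// payload patches as soon as a field is known, so the plan view can update
// field by field instead of waiting for the final paragraph.
export interface PartialPlan {
  destination?: string;
  departDate?: string;
  budgetUsd?: number;
}

export type PlanEvent =
  | { kind: "text"; delta: string }         // prose for the narration area
  | { kind: "field"; patch: PartialPlan };  // structure for the plan view

export async function consume(
  events: AsyncIterable<PlanEvent>,
  onText: (delta: string) => void,
  onPlan: (plan: PartialPlan) => void
): Promise<PartialPlan> {
  let plan: PartialPlan = {};
  for await (const event of events) {
    if (event.kind === "text") {
      onText(event.delta);
    } else {
      // Merge each patch immediately; other components can react to
      // partial certainty without waiting for completion.
      plan = { ...plan, ...event.patch };
      onPlan(plan);
    }
  }
  return plan;
}
```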
Safety, permissions, and audits
The simplest safety pattern is separation: read tools and write tools live under different scopes, and plans declare which scopes they require before execution. Permission prompts should be explicit, revocable, and tied to clearly named capabilities. People forgive delays when security is legible; they do not forgive surprises that mutate data without a visible handshake.
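A minimal sketch of that separation, with illustrative scope names and a grant store left abstract:

```ts
// A sketch of scope separation: read and write capabilities live under
// different scopes, a plan declares what it needs, and execution refuses to
// start until every scope has been granted. Scope strings are illustrative.
export type Scope = "calendar:read" | "calendar:write" | "billing:write";

export interface GrantStore {
  has(scope: Scope): boolean;
}

export function missingScopes(required: Scope[], grants: GrantStore): Scope[] {
  return required.filter((scope) => !grants.has(scope));
}

export function assertExecutable(required: Scope[], grants: GrantStore): void {
  const missing = missingScopes(required, grants);
  if (missing.length > 0) {
    // Surface a permission prompt naming the missing capabilities,
    // rather than attempting a partial, surprising execution.
    throw new Error(`plan requires ungranted scopes: ${missing.join(", ")}`);
  }
}
```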
Validation belongs at the boundary. Every tool should check inputs against schema and environment constraints before attempting work. That is not mistrust of the UI; it is defense in depth. When validations fail, errors should carry advice about how to revise the plan rather than generic stack traces that only engineers can interpret.
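A sketch of such a boundary guard, with an illustrative validator for a hypothetical booking tool:

```ts
// A sketch of boundary validation: every tool re-checks its input even though
// the UI already validated it, and failed checks return advice about revising
// the plan rather than a stack trace. The checks shown are illustrative.
export interface ToolError {
  field: string;
  advice: string; // phrased for the person editing the plan
}

export function guard<P, R>(
  validate: (input: P) => ToolError[],
  run: (input: P) => Promise<R>
) {
  return async (
    input: P
  ): Promise<{ ok: true; result: R } | { ok: false; errors: ToolError[] }> => {
    const errors = validate(input);
    if (errors.length > 0) return { ok: false, errors };
    return { ok: true, result: await run(input) };
  };
}

// Illustrative validator for a hypothetical booking tool.
export const validateBooking = (input: { travelers: number; departDate: string }) => {
  const errors: ToolError[] = [];
  if (input.travelers < 1 || input.travelers > 9) {
    errors.push({ field: "travelers", advice: "Choose between 1 and 9 travelers." });
  }
  if (Number.isNaN(Date.parse(input.departDate))) {
    errors.push({ field: "departDate", advice: "Use a full date such as 2025-10-03." });
  }
  return errors;
};
```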
Logging is a feature, not an afterthought. A good log includes the human intent, the plan version, the tool calls, and the outcomes with timestamps. With that packet, support can help, finance can reconcile, and legal can audit claims. Without it, the system devolves into folklore and screenshots. Deterministic, portable logs are the price of operating automation in public.
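A sketch of that packet, with illustrative field names; one identifier threads the intent, the plan version, each tool call, and its outcome:

```ts
// A sketch of the receipt/log packet described above (fields are illustrative).
export interface ToolCallRecord {
  tool: string;
  arguments: unknown;
  outcome: "ok" | "error";
  detail?: string;
  at: string;             // ISO timestamp
}

export interface PlanReceipt {
  planId: string;          // threads through UI, logs, and external effects
  planVersion: string;     // schema/template version at execution time
  intent: string;          // the human request, verbatim
  payload: unknown;        // the structured plan that was confirmed
  confirmedAt: string;
  calls: ToolCallRecord[];
}

export function newReceipt(intent: string, payload: unknown, planVersion: string): PlanReceipt {
  return {
    planId: crypto.randomUUID(),
    planVersion,
    intent,
    payload,
    confirmedAt: new Date().toISOString(),
    calls: [],
  };
}
```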
Rate limits and circuit breakers keep misconfigurations from turning into incidents. Plans that cause unusually high write volume should trigger reviews or staged rollouts. Tools that fail repeatedly should back off and alert owners. These controls sound operational, but they are part of interface design because they shape what promises we can make to users in good faith.
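A minimal circuit-breaker sketch along those lines, with illustrative thresholds and an alert hook standing in for paging the owner:

```ts
// A sketch of a circuit breaker: a tool that fails repeatedly is taken out of
// rotation for a cooldown period and an owner is alerted, so a misconfiguration
// degrades one feature instead of cascading. Thresholds are illustrative.
export class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;

  constructor(
    private readonly maxFailures = 5,
    private readonly cooldownMs = 30_000,
    private readonly alert: (msg: string) => void = console.error
  ) {}

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.openedAt !== null) {
      if (Date.now() - this.openedAt < this.cooldownMs) {
        throw new Error("circuit open: tool temporarily disabled");
      }
      this.openedAt = null; // half-open: allow one probe call
    }
    try {
      const result = await fn();
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.failures >= this.maxFailures) {
        this.openedAt = Date.now();
        this.alert(`tool disabled after ${this.failures} consecutive failures`);
      }
      throw err;
    }
  }
}
```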
Finally, align safety with dignity. Clear confirmations, easy undo, and transparent ownership are forms of respect as much as controls. They communicate that the system values the person’s time, data, and attention. In the long run, those cues compound into trust, and trust is the scarce resource that determines whether new interaction patterns spread beyond demos.
Observability and improvement loops
Answer surfaces make it easy to measure success because the unit of work is a plan with a clear outcome. Track acceptance, edits, cancellations, and retries. Tag plans with cohorts so you can see which phrasing, defaults, or layouts increase clarity. Treat observation as part of design work, not a separate discipline, so improvements arrive in weeks rather than quarters.
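A sketch of plan-level events that carry those tags; the outcome and cohort names are illustrative, and the sink is left to the caller:

```ts
// A sketch of plan-level metrics: each lifecycle event carries the plan id
// and a cohort tag, so acceptance and edit rates can be compared across
// phrasing, defaults, or layouts. Names are illustrative.
export type PlanOutcome = "accepted" | "edited" | "cancelled" | "retried";

export interface PlanLifecycleEvent {
  planId: string;
  outcome: PlanOutcome;
  cohort: string;           // e.g. "defaults-v2"
  editedFields?: string[];  // which constraints the person changed
  at: string;               // ISO timestamp
}

export function track(
  event: PlanLifecycleEvent,
  send: (e: PlanLifecycleEvent) => void = console.log
) {
  send(event);
}
```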
Feedback should live next to decisions. If a plan led to a bad result, the person should be able to report it from the receipt, not a generic form. Receipts unify the narrative, the payload, and the effects, making triage quick and targeted. A good receipt becomes a living document that engineering, support, and the user can all reference.
Improvement loops benefit from small, safe experiments. Toggle copy, defaults, or tool versions behind flags, then examine plan metrics rather than only click‑through rates. Because plans carry structure, you can compare outcomes precisely: which budgets tend to fail, which destinations need extra data, which validators generate the most edits before acceptance.
Observability also helps internal teams. When engineers can re‑run plans in a sandbox with synthetic data, they can reproduce bugs, test fixes, and share minimal examples without coordinating across schedules. That capability turns a fragile feature into a stable service because it makes learning cheap and collaboration routine instead of heroic.
Finally, close the loop with documentation that mirrors reality. The canonical place for schemas, manifests, and tool contracts should be the same repository that ships the interface. Inline examples should be copied from real receipts and kept short. When docs reflect the working system, new contributors can change it safely, and users can trust that the patterns they see are the patterns they will get.