AG-UI: The Missing Protocol Between AI Agents and Your Frontend

4 min read
Gergely Sipos
Frontend Architect

AG-UI (the Agent-User Interaction Protocol) is an open, event-based protocol that standardizes how AI agents communicate with user-facing applications. Created by CopilotKit and adopted by Microsoft, Google, AWS, LangChain, and CrewAI, it fills the gap between MCP (agent↔tools) and A2A (agent↔agent) by defining the agent↔UI layer. It is MIT-licensed, with 14.4k GitHub stars and 50+ framework integrations.

The Problem

Traditional REST and GraphQL APIs assume request-response cycles with deterministic outputs. AI agents break all of those assumptions:

  • Long-running — agents stream intermediate work over seconds or minutes, not milliseconds.
  • Non-deterministic — an agent may produce different UI artifacts on each run, making static schemas fragile.
  • Mixed output — a single agent turn can emit plain text, structured tool calls, state mutations, and reasoning traces.
  • Human-in-the-loop — agents need to pause, ask for confirmation, and resume based on user decisions.

You can hack around these with ad-hoc WebSocket messages, but then every agent framework invents its own wire format and every frontend rebuilds the same parsing logic. AG-UI standardizes that layer.

How It Works: Events All the Way Down

The protocol models all agent→client communication as a typed event stream. The core abstraction is minimal:

type RunAgent = (input: RunAgentInput) => Observable<BaseEvent>

About 16 event types are organized into categories:

  • Lifecycle: RUN_STARTED, RUN_FINISHED, RUN_ERROR
  • Text Messages: TEXT_MESSAGE_START, TEXT_MESSAGE_CONTENT, TEXT_MESSAGE_END
  • Tool Calls: TOOL_CALL_START, TOOL_CALL_ARGS, TOOL_CALL_END
  • State: STATE_SNAPSHOT, STATE_DELTA
  • Activity: STEP_STARTED, STEP_FINISHED
  • Reasoning: CUSTOM events for chain-of-thought traces
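
To make the categories above concrete, here is a rough sketch of how a client might model a subset of these events as a TypeScript discriminated union. The event names come from the list above; the field names (runId, messageId, delta, toolCallId, toolCallName) and the categoryOf helper are assumptions for illustration, and the real SDK types may differ and carry more fields.

```typescript
// Simplified, assumed event shapes; not the SDK's actual definitions.
type AgentEvent =
  | { type: "RUN_STARTED"; runId: string }
  | { type: "RUN_FINISHED"; runId: string }
  | { type: "RUN_ERROR"; message: string }
  | { type: "TEXT_MESSAGE_START"; messageId: string }
  | { type: "TEXT_MESSAGE_CONTENT"; messageId: string; delta: string }
  | { type: "TEXT_MESSAGE_END"; messageId: string }
  | { type: "TOOL_CALL_START"; toolCallId: string; toolCallName: string }
  | { type: "TOOL_CALL_ARGS"; toolCallId: string; delta: string }
  | { type: "TOOL_CALL_END"; toolCallId: string };

// Hypothetical helper: maps an event to its category from the list above.
// The switch is exhaustive, so adding a new event type without handling it
// becomes a compile-time error.
function categoryOf(event: AgentEvent): "lifecycle" | "text" | "tool" {
  switch (event.type) {
    case "RUN_STARTED":
    case "RUN_FINISHED":
    case "RUN_ERROR":
      return "lifecycle";
    case "TEXT_MESSAGE_START":
    case "TEXT_MESSAGE_CONTENT":
    case "TEXT_MESSAGE_END":
      return "text";
    case "TOOL_CALL_START":
    case "TOOL_CALL_ARGS":
    case "TOOL_CALL_END":
      return "tool";
  }
}

const sample: AgentEvent = { type: "TEXT_MESSAGE_CONTENT", messageId: "m1", delta: "Hi" };
const sampleCategory = categoryOf(sample);
```

The benefit of the union is that a single `type` field is enough for the compiler to narrow each event to its payload, which is what lets one handler deal with mixed output safely.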

Two patterns recur throughout:

  1. Start-Content-End — text and tool calls arrive in streaming chunks, bracketed by lifecycle events. Clients can render progressively.
  2. Snapshot-Delta — state is initialized with a full snapshot, then updated incrementally via JSON Patch (RFC 6902). This keeps bandwidth low while maintaining consistency.
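
Both patterns can be sketched as a single client-side reducer. This is a minimal illustration under stated assumptions, not the SDK's implementation: the field names (messageId, delta, snapshot) and the ViewModel shape are assumed, and applyPatch is a hand-rolled helper that handles only add, replace, and remove on nested object keys. It does not implement array semantics or the other RFC 6902 operations.

```typescript
type PatchOp =
  | { op: "add" | "replace"; path: string; value: unknown }
  | { op: "remove"; path: string };

type StreamEvent =
  | { type: "TEXT_MESSAGE_START"; messageId: string }
  | { type: "TEXT_MESSAGE_CONTENT"; messageId: string; delta: string }
  | { type: "TEXT_MESSAGE_END"; messageId: string }
  | { type: "STATE_SNAPSHOT"; snapshot: Record<string, unknown> }
  | { type: "STATE_DELTA"; delta: PatchOp[] };

interface ViewModel {
  messages: Record<string, { text: string; done: boolean }>;
  state: Record<string, unknown>;
}

// Hypothetical helper: a tiny subset of JSON Patch for object keys only.
function applyPatch(doc: Record<string, unknown>, ops: PatchOp[]): Record<string, unknown> {
  const next: Record<string, unknown> = JSON.parse(JSON.stringify(doc)); // deep copy
  for (const op of ops) {
    // Decode the JSON Pointer path: "~1" means "/", "~0" means "~".
    const keys = op.path.split("/").slice(1).map((k) => k.replace(/~1/g, "/").replace(/~0/g, "~"));
    const last = keys.pop();
    if (last === undefined) throw new Error("root-level patches are not supported in this sketch");
    let parent = next;
    for (const key of keys) {
      const child = parent[key];
      if (typeof child !== "object" || child === null) throw new Error(`path not found: ${op.path}`);
      parent = child as Record<string, unknown>;
    }
    if (op.op === "remove") delete parent[last];
    else parent[last] = op.value;
  }
  return next;
}

function reduce(view: ViewModel, event: StreamEvent): ViewModel {
  switch (event.type) {
    case "TEXT_MESSAGE_START":
      return { ...view, messages: { ...view.messages, [event.messageId]: { text: "", done: false } } };
    case "TEXT_MESSAGE_CONTENT": {
      // Append the chunk so the UI can render progressively.
      const current = view.messages[event.messageId] ?? { text: "", done: false };
      const updated = { ...current, text: current.text + event.delta };
      return { ...view, messages: { ...view.messages, [event.messageId]: updated } };
    }
    case "TEXT_MESSAGE_END": {
      const current = view.messages[event.messageId] ?? { text: "", done: false };
      return { ...view, messages: { ...view.messages, [event.messageId]: { ...current, done: true } } };
    }
    case "STATE_SNAPSHOT":
      return { ...view, state: event.snapshot }; // full replacement
    case "STATE_DELTA":
      return { ...view, state: applyPatch(view.state, event.delta) }; // incremental update
  }
}

const events: StreamEvent[] = [
  { type: "STATE_SNAPSHOT", snapshot: { step: "search", results: { count: 0 } } },
  { type: "TEXT_MESSAGE_START", messageId: "m1" },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "m1", delta: "Hel" },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "m1", delta: "lo" },
  { type: "STATE_DELTA", delta: [{ op: "replace", path: "/results/count", value: 3 }] },
  { type: "TEXT_MESSAGE_END", messageId: "m1" },
];
const initialView: ViewModel = { messages: {}, state: {} };
const finalView = events.reduce(reduce, initialView);
```

In production you would likely reach for an existing JSON Patch library rather than a hand-rolled helper. The point here is only that text and state updates can flow through the same event loop.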

The protocol is transport-agnostic — SSE, WebSockets, binary frames, or webhooks all work. The events are the contract, not the transport.
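
As one example of a transport binding, here is a small sketch of decoding events from an SSE text stream. It assumes each event arrives as one JSON object in an SSE data field, which is an assumption about the wire encoding rather than something stated above. SseDecoder and WireEvent are hypothetical names, and the decoder ignores other SSE fields (event, id, retry) and bare carriage-return line endings.

```typescript
// Assumption: every event is a JSON object with at least a "type" field.
interface WireEvent {
  type: string;
  [key: string]: unknown;
}

class SseDecoder {
  private buffer = "";

  // Feed a raw text chunk; returns every event completed by this chunk.
  // Frames split across chunks stay buffered until their blank-line terminator arrives.
  push(chunk: string): WireEvent[] {
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");
    const frames = this.buffer.split("\n\n");
    this.buffer = frames.pop() ?? ""; // the last piece is an incomplete frame (or empty)
    const events: WireEvent[] = [];
    for (const frame of frames) {
      // Join multiple data lines with newlines and strip one optional leading space.
      const data = frame
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, ""))
        .join("\n");
      if (data !== "") events.push(JSON.parse(data) as WireEvent);
    }
    return events;
  }
}

const decoder = new SseDecoder();
const firstBatch = decoder.push('data: {"type":"RUN_STARTED"}\n\ndata: {"type":"TEXT_MES');
const secondBatch = decoder.push('SAGE_CONTENT","delta":"Hi"}\n\n');
```

Swapping SSE for WebSockets would replace only this decoding step. Everything downstream consumes the same typed events.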

Where AG-UI Fits in the Protocol Stack

AG-UI doesn't replace MCP or A2A — it's the missing third layer:

┌─────────────────────────────────────────────────┐
│ A2A (Google)        Agent ↔ Agent coordination  │
├─────────────────────────────────────────────────┤
│ MCP (Anthropic)     Agent ↔ Tools & Data        │
├─────────────────────────────────────────────────┤
│ AG-UI (CopilotKit)  Agent ↔ User Interface      │
└─────────────────────────────────────────────────┘

MCP tells agents what tools exist and how to call them. A2A lets agents delegate to other agents. AG-UI defines how the results of all that work reach the user in a streamable, interactive format. A production agent stack will typically use all three.

Human-in-the-Loop as a First-Class Concept

AG-UI treats human oversight as a protocol-level feature, not an afterthought:

  • Agents can pause execution by emitting interrupt events (tool_call, input_required, confirmation).
  • The client resumes the run with a typed response — approval, rejection, or edited parameters.
  • The approve-with-edits pattern lets users modify proposed tool arguments before execution.
  • Every interrupt creates a full audit trail: proposal → user decision → execution outcome.
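
The interrupt flow described above might look roughly like this on the client. Every name here (ToolCallInterrupt, UserDecision, AuditEntry, resolveInterrupt, and all field names) is hypothetical and chosen for illustration. The protocol's actual interrupt and resume payloads may be shaped differently.

```typescript
// Hypothetical shapes; not taken from the AG-UI SDK.
interface ToolCallInterrupt {
  kind: "tool_call";
  interruptId: string;
  toolName: string;
  proposedArgs: Record<string, unknown>;
}

type UserDecision =
  | { decision: "approve" }
  | { decision: "reject"; reason: string }
  | { decision: "approve_with_edits"; editedArgs: Record<string, unknown> };

// One audit record per interrupt: proposal, user decision, and what actually ran.
interface AuditEntry {
  interruptId: string;
  toolName: string;
  proposal: Record<string, unknown>;
  decision: UserDecision["decision"];
  executedArgs: Record<string, unknown> | null; // null means nothing was executed
}

function resolveInterrupt(interrupt: ToolCallInterrupt, response: UserDecision): AuditEntry {
  const executedArgs =
    response.decision === "approve"
      ? interrupt.proposedArgs
      : response.decision === "approve_with_edits"
        ? { ...interrupt.proposedArgs, ...response.editedArgs } // user edits override the proposal
        : null;
  return {
    interruptId: interrupt.interruptId,
    toolName: interrupt.toolName,
    proposal: interrupt.proposedArgs,
    decision: response.decision,
    executedArgs,
  };
}

const refundInterrupt: ToolCallInterrupt = {
  kind: "tool_call",
  interruptId: "int-1",
  toolName: "issue_refund",
  proposedArgs: { orderId: "A-100", amount: 250 },
};
const edited = resolveInterrupt(refundInterrupt, { decision: "approve_with_edits", editedArgs: { amount: 100 } });
const rejected = resolveInterrupt(refundInterrupt, { decision: "reject", reason: "needs manager sign-off" });
```

Keeping the original proposal next to the executed arguments is what makes the audit trail useful: a reviewer can see exactly what the user changed before the action ran.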

This matters for enterprise deployments where agents shouldn't execute actions unattended.

Ecosystem Breadth

Adoption spans the major agent frameworks and frontend ecosystems:

  • Agent frameworks: Microsoft Agent Framework, Google ADK, AWS Bedrock AgentCore, LangGraph, CrewAI, Mastra, Pydantic AI, LlamaIndex
  • Frontend SDKs: React, Angular, Vue, React Native
  • Language SDKs: TypeScript, Python, Go, Kotlin, Rust, Ruby, Java (community)
  • Built on AG-UI: TanStack AI uses AG-UI as its wire protocol

This breadth matters because you can swap your agent backend (say, from LangGraph to CrewAI) without changing your frontend integration code.
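
A toy illustration of that claim: two mock backends that chunk their output differently, consumed by the same frontend function. The AgentBackend interface and both backends are hypothetical stand-ins, not real LangGraph or CrewAI adapters, and plain arrays stand in for the protocol's event stream to keep the sketch short.

```typescript
type UiEvent =
  | { type: "TEXT_MESSAGE_START"; messageId: string }
  | { type: "TEXT_MESSAGE_CONTENT"; messageId: string; delta: string }
  | { type: "TEXT_MESSAGE_END"; messageId: string };

// Hypothetical adapter interface: any backend that emits these events qualifies.
interface AgentBackend {
  name: string;
  run(prompt: string): UiEvent[];
}

// Mock backend A streams one word per content event.
const wordChunkBackend: AgentBackend = {
  name: "mock-a",
  run(prompt) {
    const words = `Echo: ${prompt}`.split(" ");
    return [
      { type: "TEXT_MESSAGE_START", messageId: "m1" },
      ...words.map(
        (word, i): UiEvent => ({
          type: "TEXT_MESSAGE_CONTENT",
          messageId: "m1",
          delta: i === 0 ? word : " " + word,
        })
      ),
      { type: "TEXT_MESSAGE_END", messageId: "m1" },
    ];
  },
};

// Mock backend B sends the whole reply in a single content event.
const singleChunkBackend: AgentBackend = {
  name: "mock-b",
  run(prompt) {
    return [
      { type: "TEXT_MESSAGE_START", messageId: "m1" },
      { type: "TEXT_MESSAGE_CONTENT", messageId: "m1", delta: `Echo: ${prompt}` },
      { type: "TEXT_MESSAGE_END", messageId: "m1" },
    ];
  },
};

// Frontend code depends only on the event contract, never on the backend.
function renderTranscript(backend: AgentBackend, prompt: string): string {
  let text = "";
  for (const event of backend.run(prompt)) {
    if (event.type === "TEXT_MESSAGE_CONTENT") text += event.delta;
  }
  return text;
}

const fromA = renderTranscript(wordChunkBackend, "hello world");
const fromB = renderTranscript(singleChunkBackend, "hello world");
```

Both calls go through the same renderTranscript function, so switching backends is a one-argument change on the frontend side.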

Worth Watching

If your team already uses TanStack AI or CopilotKit, you're already speaking AG-UI under the hood. If you're building agent-powered UIs on anything else, it's worth understanding the protocol before inventing a custom event format that you'll eventually need to migrate away from.