Multi-Agent Orchestration
A single AI agent has limits — context window size, task scope, and specialization. Multi-agent systems address these by having multiple agents collaborate, each with a defined role. This page covers the patterns and frameworks that make this practical.
What Is Multi-Agent Orchestration?
Multi-agent orchestration means multiple AI agents working toward a shared goal, each contributing a distinct capability. An orchestrator (planner agent) decomposes the goal into subtasks and dispatches them to subagents (worker agents). Results flow back to the orchestrator, which synthesizes the final output.
This enables:
- Parallelism — independent subtasks run simultaneously
- Specialization — each agent is prompted for one thing it does well
- Scale — work can exceed a single agent's context window by passing summaries between agents
Multi-agent systems add significant complexity. They're worth it only when a single agent genuinely can't do the job. Start with a single agent — see AI Coding Agents — and escalate only when you hit real limits.
Common Patterns
Orchestrator + Subagents
A central planner receives the goal, breaks it into tasks, dispatches them to specialized subagents, and synthesizes the results.
Good for: complex tasks with multiple distinct phases — for example, a pipeline that researches requirements, writes code, writes tests, and produces documentation, all from a single prompt to the orchestrator.
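The orchestrator pattern can be sketched in a few lines of framework-free Python. This is an illustrative skeleton, not a real framework API: the subagents are plain functions standing in for LLM-backed agents, and all task names are hypothetical.

```python
# Subagents: plain functions standing in for LLM-backed specialists.
def research_agent(task: str) -> str:
    return f"research notes for: {task}"

def coding_agent(task: str, context: str) -> str:
    return f"code for '{task}' based on [{context}]"

def orchestrator(goal: str) -> str:
    # 1. Decompose the goal into subtasks (a real planner would use an LLM here).
    subtasks = [f"{goal}: requirements", f"{goal}: implementation"]
    # 2. Dispatch each subtask to the matching specialist.
    notes = research_agent(subtasks[0])
    code = coding_agent(subtasks[1], context=notes)
    # 3. Synthesize the final output from the subagents' results.
    return f"SUMMARY\n{notes}\n{code}"

result = orchestrator("build a CSV parser")
print(result)
```

In a real system each function body is an LLM call with its own system prompt, but the control flow — decompose, dispatch, synthesize — stays exactly this shape.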
Parallel Execution
Independent subtasks run simultaneously and their results are merged afterward.
Good for: tasks that decompose cleanly without dependencies — processing multiple files, running multiple research queries, analyzing different parts of a large codebase concurrently.
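Because agent calls are I/O-bound API requests, a thread pool is a reasonable way to fan them out. A minimal sketch, with a stub function in place of a real agent call:

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for an agent invocation (normally an I/O-bound API request).
def analyze_file(path: str) -> str:
    return f"analysis of {path}"

files = ["a.py", "b.py", "c.py"]

# Fan out independent subtasks in parallel, then merge results afterward.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(analyze_file, files))

merged = "\n".join(results)
print(merged)
```

`pool.map` preserves input order, which keeps the merge step deterministic even though the calls themselves run concurrently.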
Specialist Handoff (Pipeline)
A sequential chain: Agent A completes its task and passes the output as input to Agent B, which passes to Agent C, and so on.
Example: Researcher → Coder → Code Reviewer. Each agent is given a tightly scoped role and receives the structured output of the previous agent as context.
Good for: multi-stage workflows where each stage's output is the next stage's input, and you want each stage to be independently inspectable and replaceable.
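A pipeline like Researcher → Coder → Reviewer can be sketched as a list of stages, each consuming the previous stage's structured output. The stage functions and field names below are illustrative stand-ins for LLM-backed agents:

```python
# Each stage receives the previous stage's structured output as input.
def researcher(topic: str) -> dict:
    return {"topic": topic, "findings": ["finding 1", "finding 2"]}

def coder(brief: dict) -> dict:
    return {"topic": brief["topic"], "code": f"# implements {brief['topic']}"}

def reviewer(draft: dict) -> dict:
    # A trivial stand-in check; a real reviewer agent would inspect the code.
    return {**draft, "approved": draft["code"].startswith("#")}

PIPELINE = [researcher, coder, reviewer]

def run_pipeline(topic: str) -> dict:
    out = topic
    for stage in PIPELINE:
        out = stage(out)  # output of one stage is the input to the next
    return out

final = run_pipeline("retry logic")
print(final)
```

Because each stage is a plain function with a structured input and output, any stage can be inspected, tested, or swapped out independently — the property the pattern is chosen for.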
Defining Agents with Instruction Files
Most multi-agent frameworks let you define each agent's role, goals, and constraints as a plain text or Markdown file. In GitHub-hosted projects, a common convention is to store these files in a .github/agents/ directory.
The .github/agents/ Convention
Each file describes one agent — its purpose, the tools it can use, the output format it should produce, and any constraints it must respect. A minimal example:
```markdown
# Researcher Agent

## Role
You are a research assistant. You gather information from web sources and existing documentation.

## Tools
- web_search
- read_file

## Output format
Return a structured research brief with: key findings, source URLs, and a recommended document outline.

## Constraints
- Do not write or edit any files.
- Always cite your sources.
- Focus only on the topic given in the task prompt.
```
The orchestrator reads these files at startup and uses them to configure each subagent.
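Loading these files is straightforward. A minimal sketch, with one illustrative assumption: the agent's name is taken from the file's first `#` heading, and the whole file becomes that agent's system prompt. (The demo writes into a temporary directory standing in for `.github/agents/`.)

```python
import tempfile
from pathlib import Path

def load_agents(agents_dir: Path) -> dict[str, str]:
    """Read every *.md agent definition and key it by its first heading."""
    agents = {}
    for path in sorted(agents_dir.glob("*.md")):
        text = path.read_text()
        # Assumption: the first line is a "# Agent Name" heading.
        name = text.splitlines()[0].lstrip("# ").strip()
        agents[name] = text  # full file text becomes the system prompt
    return agents

# Demo: a temp dir stands in for .github/agents/.
agents_dir = Path(tempfile.mkdtemp())
(agents_dir / "researcher.md").write_text("# Researcher Agent\n\n## Role\n...")
agents = load_agents(agents_dir)
print(list(agents))
```

The orchestrator can then pass `agents["Researcher Agent"]` as the system prompt when it spins up that subagent.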
You don't need to write agent instruction files from scratch. Describe the role you need to an AI assistant and ask it to generate the .md file for you. For example: "Write an agent instruction file for a code-review agent that checks TypeScript files for type safety issues and style consistency."
Frameworks
Python
- CrewAI — role-based agent definitions with straightforward YAML configuration. Good for getting started quickly with multi-agent workflows.
- AutoGen (Microsoft) — conversational multi-agent framework with strong support for code execution and testing within the agent loop.
- LangGraph — graph-based workflow orchestration built on LangChain. Fine-grained control over agent state and transitions; best for complex, stateful workflows.
TypeScript / Node.js
- Mastra — TypeScript-first agent framework with a built-in workflow engine, memory management, and tool integrations.
If your team lives in TypeScript and Node.js — as most Aliz web teams do — prefer Mastra over the Python frameworks. Keeping your stack consistent reduces context-switching and means your agent code can share types and utilities with the rest of the project.
Practical Considerations
- Context passing — design explicit handoff schemas. What state does each agent receive? Passing too little loses important context; passing too much wastes tokens and increases cost.
- Cost — every agent invocation is an API call. Multi-agent systems multiply your token spend quickly. Set spending budgets and monitor usage before running at scale.
- Error propagation — a hallucination in Agent A becomes assumed fact for Agent B. Add lightweight validation between agents in high-stakes pipelines.
- Observability — log every agent's input, output, and tool calls. You cannot debug what you cannot observe.
- Human checkpoints — add approval steps before any non-reversible action: deploying, sending external communications, writing to a production database.
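The first three points above — explicit handoff schemas, cost control, and validation between agents — can be combined in one small pattern: define the handoff as a typed structure and validate it before the next agent sees it. The field names below mirror the Researcher Agent's output format from earlier on this page but are otherwise illustrative assumptions:

```python
from dataclasses import dataclass

# Explicit handoff schema: exactly the state the next agent receives,
# no more (wasted tokens) and no less (lost context).
@dataclass
class ResearchBrief:
    topic: str
    findings: list[str]
    source_urls: list[str]

def validate_brief(brief: ResearchBrief) -> None:
    """Lightweight checkpoint between agents: reject bad handoffs early,
    so a malformed or unsourced result never becomes assumed fact downstream."""
    if not brief.findings:
        raise ValueError("empty findings - reject handoff")
    if not brief.source_urls:
        raise ValueError("no sources cited - reject handoff")

brief = ResearchBrief(
    topic="csv parsing",
    findings=["stdlib csv module covers the common cases"],
    source_urls=["https://docs.python.org/3/library/csv.html"],
)
validate_brief(brief)  # passes; a raise here would halt the pipeline
```

Logging each validated brief also gives you the observability hook for free: every handoff is a structured object you can record and replay when debugging.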
Multi-agent frameworks are evolving rapidly. Breaking API changes between minor versions are common. Pin your dependency versions and budget time for upgrades when you update.
When It's Worth It
- Tasks that genuinely benefit from parallel specialization — a research + implementation + review pipeline is a classic example
- Long-horizon tasks that exceed a single agent's context window
- When you can break a problem into independently testable, loosely coupled concerns
- Batch processing — running many similar tasks in parallel, such as multi-file analysis or bulk content generation