Orchestration Layer
Multi-Agent Patterns for AI-Accelerated Engineering
Implementation-Grade References
Use these pages when you want the working implementation details behind the orchestration model, not just the conceptual pattern descriptions.
What Is Orchestration in the Context of AI Coding Agents?
Orchestration is the discipline of coordinating multiple AI agents so they produce coherent, auditable, production-quality software -- rather than a collection of disconnected outputs that require extensive human stitching.
When a single developer uses a single AI assistant, orchestration is trivial: one human, one agent, one conversation. But modern AI-accelerated engineering rarely stays in that configuration. Teams scale by introducing multiple agents with specialized roles, running agents in parallel across different parts of a codebase, or chaining agents into multi-step pipelines. The moment a second agent enters the picture, someone (or something) must answer:
- Who works on what? Task decomposition and assignment.
- In what order? Sequencing, dependencies, and handoff triggers.
- With what constraints? Tool permissions, file access boundaries, quality thresholds.
- How do we know it worked? Validation gates, provenance tracking, audit trails.
Orchestration answers all four of these questions. It is the connective tissue between individual agent capabilities and team-level outcomes.
Orchestration vs. Prompting
Prompting tells an agent what to do. Orchestration tells the system how agents relate to each other, when they activate, and what happens when one fails. You can have excellent prompts and still produce chaos if agents step on each other's files, bypass quality gates, or produce outputs that no downstream agent can consume.
Orchestration vs. CI/CD
CI/CD pipelines execute after code is committed. Orchestration operates during the development process itself -- before the first commit, between agent handoffs, and as part of the authoring loop. Orchestration and CI/CD are complementary: orchestration governs the agent workflow, CI/CD validates the artifacts it produces.
The Three-Layer Model
AI-assisted software engineering operates across three distinct layers. Understanding where your tooling sits helps you avoid gaps and redundancy.
┌─────────────────────────────────────────────────────────┐
│ Layer 3: GOVERNANCE │
│ Standards, compliance, audit, provenance, policies │
│ ───────────────────────────────────────────────────── │
│ AEEF Production Standards (PRD-STD-001 through 016) │
│ Quality gates, AI-usage disclosure, KPI tracking │
├─────────────────────────────────────────────────────────┤
│ Layer 2: ORCHESTRATION │
│ Agent coordination, task routing, handoffs, isolation │
│ ───────────────────────────────────────────────────── │
│ AEEF Agent SDLC, CLI wrapper, branch-per-role model │
│ OpenClaw, CrewAI, claude-flow, LangGraph, Composio │
├─────────────────────────────────────────────────────────┤
│ Layer 1: AGENTS │
│ Individual AI models executing tasks │
│ ───────────────────────────────────────────────────── │
│ Claude Code, Cursor, Copilot, Windsurf, Devin, etc. │
└─────────────────────────────────────────────────────────┘
AEEF spans Layers 2 and 3. It provides both the governance standards (Layer 3) that constrain what agents may do and the orchestration patterns (Layer 2) that coordinate how they collaborate. Most competing frameworks address only one layer. AEEF's value proposition is that governance and orchestration are designed together, eliminating the integration gap that teams typically fill with ad hoc scripts.
Layer Interactions
- Layer 1 to Layer 2: Agents receive task assignments, tool permissions, and context windows from the orchestration layer.
- Layer 2 to Layer 3: Orchestration enforces governance rules at runtime -- contract compliance, quality gates, provenance logging.
- Layer 3 to Layer 1: Governance standards constrain agent behavior even when orchestration tooling changes. Standards are tool-agnostic; orchestration is tool-specific.
Why Single-Agent Workflows Break Down at Scale
Single-agent workflows -- one developer paired with one AI assistant -- work well for individual tasks. They break down predictably as complexity increases:
1. Context Window Saturation
A single agent handling requirements, architecture, implementation, and testing must hold the entire project context simultaneously. As codebases grow, the agent's effective context degrades. It forgets early decisions, contradicts its own architecture, or produces implementations that drift from the specification.
Multi-agent solution: Each agent holds only the context relevant to its role. The product agent holds the PRD. The architect holds the design document. The developer holds the implementation spec and the files it is modifying. Context is transferred between agents as structured handoff artifacts, not as a growing conversation history.
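A structured handoff artifact can be sketched as a small serializable record. The schema below is purely illustrative -- AEEF does not define these field names -- but it shows the key idea: the next agent receives a compact document, not the full conversation history.

```python
import json
from dataclasses import dataclass, field, asdict

# Hypothetical handoff schema (illustrative, not an AEEF-defined format).
# Context moves between agents as a small structured document; everything
# outside this record is deliberately dropped.
@dataclass
class Handoff:
    from_role: str              # agent producing the artifact
    to_role: str                # agent consuming it
    artifact_path: str          # e.g. a PRD committed to the role's branch
    summary: str                # the only narrative context carried forward
    open_questions: list = field(default_factory=list)

handoff = Handoff(
    from_role="product",
    to_role="architect",
    artifact_path="docs/PRD.md",
    summary="Add CSV export to the reports page.",
    open_questions=["Max export size?"],
)

# Serialize for inclusion in the next agent's prompt.
payload = json.dumps(asdict(handoff), indent=2)
print(payload)
```

The consuming agent parses this payload instead of replaying a transcript, which keeps its effective context bounded regardless of project size.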
2. Conflicting Objectives
Requirements gathering rewards breadth and user empathy. Implementation rewards precision and edge-case handling. Quality assurance rewards skepticism and adversarial thinking. A single agent cannot optimize for all three simultaneously -- it tends to satisfice, producing mediocre results across all dimensions rather than excellent results in any.
Multi-agent solution: Specialized agents with role-specific system prompts and constraints. The product agent is instructed to think like a product manager. The QC agent is instructed to find flaws aggressively. Role specialization produces better outputs at each stage.
3. Merge Conflicts and File Contention
When a single agent works across multiple files sequentially, it produces a single coherent changeset. When multiple agents (or multiple instances of the same agent) work in parallel without coordination, they produce conflicting changes to shared files. Git merge conflicts in AI-generated code are harder to resolve than human merge conflicts because the AI's reasoning is not preserved in the diff.
Multi-agent solution: Isolation models (worktrees, branches, sandboxed directories) that give each agent exclusive ownership of its files. Orchestration determines which agent owns which files and resolves contention before it becomes a merge conflict.
4. Quality Gate Bypass
A single agent that both writes and reviews its own code is prone to confirmation bias. It will declare its own output satisfactory because it generated that output with the intent of satisfying the requirement. Self-review is structurally weaker than independent review.
Multi-agent solution: Separate agents for authoring and review, with quality gates that enforce minimum standards before handoff. The QC agent has different system prompts, different tool permissions, and different success criteria than the developer agent.
5. Audit Trail Collapse
When a single agent produces all artifacts in a single conversation, the audit trail is one long transcript. Extracting which decisions were made, when, and why requires parsing the entire conversation. Regulatory and compliance teams cannot efficiently review single-agent transcripts.
Multi-agent solution: Each agent produces discrete, structured artifacts (PRDs, ADRs, code changes, test reports) that are committed to version control with provenance metadata. The audit trail is the Git history, not a conversation log.
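Provenance metadata can ride along as Git commit trailers. The trailer keys below are hypothetical -- AEEF does not specify these names -- but they illustrate how each agent's commit can carry machine-readable who/what/why information that auditors can grep from the Git history.

```python
# Hypothetical sketch: provenance attached as Git commit-message trailers.
# The trailer keys (Agent-Role, Agent-Model, Source-Artifact) are
# illustrative, not an AEEF-specified format.
def commit_message(summary, agent_role, model, source_artifact):
    trailers = [
        f"Agent-Role: {agent_role}",
        f"Agent-Model: {model}",
        f"Source-Artifact: {source_artifact}",
    ]
    # Git trailers go in the final paragraph of the commit message.
    return summary + "\n\n" + "\n".join(trailers)

msg = commit_message(
    "Implement CSV export endpoint",
    agent_role="developer",
    model="example-model-v1",
    source_artifact="docs/DESIGN.md",
)
print(msg)
```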
The Evolution: Vibe Coding to Agentic Engineering
A widely cited progression -- associated with Andrej Karpathy, who coined the term "vibe coding" -- captures how the industry has moved through distinct phases of AI-assisted development:
Phase 1: Autocomplete (2021-2023)
AI suggests the next line or block. The developer retains full control. Tools: GitHub Copilot, TabNine, Codeium. Orchestration need: none. One agent, one developer, inline suggestions.
Phase 2: Chat-Assisted Development (2023-2024)
The developer describes intent in natural language and the AI produces code blocks. The developer copy-pastes, adapts, and integrates. Tools: ChatGPT, Claude.ai, Cursor chat. Orchestration need: minimal. The developer is the orchestrator.
Phase 3: Vibe Coding (2024-2025)
The developer describes a feature at a high level and the AI generates entire files, runs tests, and iterates. The developer reviews and accepts or rejects. The term "vibe coding" captures the feel-good flow of rapid generation, but also the risk: the developer may accept code they do not fully understand. Tools: Cursor Composer, Claude Code (single agent), Windsurf Cascade. Orchestration need: emerging. Quality gates and review processes become important.
Phase 4: Agentic Engineering (2025-present)
Multiple specialized AI agents collaborate through defined workflows with human oversight at key decision points. Agents have roles, constraints, and handoff protocols. The human sets direction, reviews critical decisions, and approves promotions to production. Tools: AEEF CLI, CrewAI, claude-flow, LangGraph, Composio. Orchestration need: critical. This is where orchestration frameworks become essential.
Phase 5: Autonomous Software Factories (emerging)
Fully automated pipelines where agents handle the entire SDLC from issue to deployment with human oversight limited to strategic decisions and exception handling. This phase is emerging but not yet production-ready for most organizations. Orchestration need: maximum. Governance, audit, and rollback capabilities must be robust enough to operate without continuous human intervention.
AEEF is designed for Phase 4 with a path to Phase 5. Its governance layer (quality gates, provenance, disclosure) provides the safety net that makes increasing autonomy possible.
Orchestration Patterns
Five core patterns recur across multi-agent orchestration systems. Most real-world deployments combine two or more patterns. AEEF's reference implementations demonstrate each one.
Pattern 1: Sequential Pipeline
┌──────────┐ ┌───────────┐ ┌───────────┐ ┌────────┐
│ Product │───>│ Architect │───>│ Developer │───>│ QC │
│ Agent │ │ Agent │ │ Agent │ │ Agent │
└──────────┘ └───────────┘ └───────────┘ └────────┘
│ │ │ │
▼ ▼ ▼ ▼
PRD.md DESIGN.md Code + Tests Test Report
→ merge to main
How it works: Agents execute in a fixed order. Each agent's output becomes the next agent's input. A quality gate between each stage validates that the output meets minimum standards before the handoff proceeds.
AEEF implementation: This is AEEF's baseline pattern, implemented in the CLI wrapper's branch-per-role model. The aeef --role=product command starts the pipeline. Each role produces a PR that the next role consumes.
Strengths:
- Simple to understand, debug, and audit
- Natural alignment with traditional SDLC phases
- Clear ownership and accountability at each stage
- Easy to add human approval gates between stages
Weaknesses:
- Slowest pattern (fully serial execution)
- Bottleneck at any single stage delays the entire pipeline
- Overkill for small changes that do not need full SDLC treatment
Best for: Feature development, regulatory-compliant workflows, teams new to multi-agent systems.
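The control flow of a sequential pipeline with inter-stage gates can be sketched in a few lines. The stage functions and gate predicates below are placeholders standing in for real agent invocations, not AEEF APIs:

```python
# Minimal sketch of a sequential pipeline: each stage's output is the next
# stage's input, and a quality gate must pass before the handoff proceeds.
def run_pipeline(task, stages):
    artifact = task
    for name, stage, gate in stages:
        artifact = stage(artifact)
        if not gate(artifact):
            # A gate failure stops the pipeline at this stage.
            raise RuntimeError(f"quality gate failed after stage: {name}")
    return artifact

# Placeholder stages standing in for the product/architect/dev/QC agents.
stages = [
    ("product",   lambda t: {"prd": t},              lambda a: "prd" in a),
    ("architect", lambda a: {**a, "design": "..."},  lambda a: "design" in a),
    ("developer", lambda a: {**a, "code": "..."},    lambda a: "code" in a),
    ("qc",        lambda a: {**a, "report": "pass"}, lambda a: a["report"] == "pass"),
]

result = run_pipeline("Add CSV export", stages)
print(result["report"])
```

A human approval gate is just another `gate` entry -- one that blocks until a reviewer signs off -- which is why this pattern makes human checkpoints so easy to add.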
Pattern 2: Parallel Swarm
┌─────────────┐
┌───>│ Agent A │───┐
│ │ (module-1/) │ │
┌──────────┐ │ └─────────────┘ │ ┌───────────┐
│ Task │───┤ ┌─────────────┐ ├───>│ Merge + │
│ Splitter │ ├───>│ Agent B │───┤ │ Validate │
└──────────┘ │ │ (module-2/) │ │ └───────────┘
│ └─────────────┘ │
│ ┌─────────────┐ │
└───>│ Agent C │───┘
│ (module-3/) │
└─────────────┘
How it works: A coordinator splits the work into independent subtasks and dispatches them to multiple agents running simultaneously. Each agent works on its own subset of files. Results are merged when all agents complete.
AEEF implementation: Not a native AEEF pattern, but AEEF constraints (file ownership boundaries, quality gates) can be applied to each agent in the swarm. Tools like claude-flow and ccswarm implement this pattern and can consume AEEF contracts as agent constraints.
Strengths:
- Fastest pattern for parallelizable work
- Scales linearly with independent subtask count
- Agents do not block each other
Weaknesses:
- Requires clean task decomposition (no shared files)
- Merge step can be complex if agents produce conflicting changes
- Harder to audit because events are interleaved across agents
- Coordination overhead increases with agent count
Best for: Large refactoring across independent modules, monorepo work, test generation across isolated components.
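The split/dispatch/merge flow can be sketched with a thread pool. The worker function and module names are placeholders; a real swarm (claude-flow, ccswarm) would dispatch each subtask to a separate agent process in its own worktree:

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for an agent working on one independent module.
def agent_worker(module):
    return {"module": module, "status": "done"}

# Clean decomposition: no shared files between these subtasks.
modules = ["module-1", "module-2", "module-3"]

# Dispatch all subtasks simultaneously; collect results when all complete.
with ThreadPoolExecutor(max_workers=len(modules)) as pool:
    results = list(pool.map(agent_worker, modules))

# Merge + validate step: every subtask must have finished before merging.
assert all(r["status"] == "done" for r in results)
print([r["module"] for r in results])
```

Note that the merge step only runs after *all* workers finish -- the coordinator, not the workers, owns conflict resolution.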
Pattern 3: Hierarchical Delegation
┌───────────────┐
│ Master Agent │
│ (Planner) │
└───────┬───────┘
│
┌───────────┼───────────┐
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Worker A │ │ Worker B │ │ Worker C │
│ (API) │ │ (UI) │ │ (Tests) │
└──────────┘ └──────────┘ └──────────┘
│ │ │
└───────────┼───────────┘
▼
┌───────────────┐
│ Master Agent │
│ (Reviewer) │
└───────────────┘
How it works: A master agent (sometimes called the "queen" or "coordinator") decomposes a high-level task into subtasks, delegates them to specialized worker agents, collects results, and synthesizes the final output. The master may also review and request revisions from workers.
AEEF implementation: The AEEF CLI's architect role naturally functions as a hierarchical delegator when combined with a swarm tool. The architect produces a design document that decomposes the feature into implementation tasks, and each task can be dispatched to a separate developer agent. AEEF's contracts constrain what each worker agent may do.
Strengths:
- Handles complex, multi-faceted tasks well
- Master agent maintains the big picture
- Workers can be specialized for different domains
- Natural fit for project management workflows
Weaknesses:
- Master agent is a single point of failure
- Master's context window must be large enough to hold the overall plan
- Communication overhead between master and workers
- Risk of "telephone game" where intent degrades through delegation layers
Best for: Feature development with multiple components, cross-cutting changes, architecture-driven implementation.
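The plan/delegate/review cycle can be sketched as three functions. The decomposition logic and review criteria here are illustrative only -- a real master agent would be an LLM call, and revision requests (omitted below) would loop back to the workers:

```python
# Sketch of hierarchical delegation: master plans, workers execute,
# master reviews and synthesizes. All logic here is a placeholder.
def master_plan(feature):
    # Decompose the feature into domain-specific subtasks.
    return [
        ("api",   f"{feature}: endpoint"),
        ("ui",    f"{feature}: form"),
        ("tests", f"{feature}: coverage"),
    ]

def worker(domain, subtask):
    # Stand-in for a specialized worker agent.
    return {"domain": domain, "output": f"done: {subtask}"}

def master_review(results):
    # Master checks each result; a rejection would trigger re-delegation.
    return all(r["output"].startswith("done") for r in results)

subtasks = master_plan("CSV export")
results = [worker(domain, subtask) for domain, subtask in subtasks]
approved = master_review(results)
print(approved)
```

The "telephone game" risk lives in `master_plan`: any intent lost in decomposition never reaches the workers, which is why the master's plan should be a reviewable artifact (a design document) rather than an ephemeral prompt.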
Pattern 4: Peer Review Loop
┌───────────┐ ┌───────────┐
│ Agent A │────────>│ Agent B │
│ (Author) │ │ (Reviewer)│
└───────────┘ └─────┬─────┘
▲ │
│ ┌─────────┐ │
└─────│ Quality │<───┘
│ Gate │
└─────────┘
│
Pass? ──> Merge
Fail? ──> Agent A revises
How it works: One agent produces an artifact and a second agent reviews it against defined criteria. If the review identifies issues, the author agent revises and resubmits. The loop continues until the quality gate passes or a maximum iteration count is reached.
AEEF implementation: AEEF's QC agent functions as the reviewer in this pattern. The developer agent authors code, the QC agent reviews against AEEF standards and quality gates. The handoff mechanism (PRs with structured review criteria) supports iterative revision. The aeef --role=qc invocation includes stop hooks that enforce quality thresholds before the PR can merge to main.
Strengths:
- Produces higher-quality output than single-pass generation
- Catches errors that the author agent's confirmation bias would miss
- Natural alignment with code review culture
- Bounded by iteration limit to prevent infinite loops
Weaknesses:
- Slower than single-pass (minimum 2x latency)
- Reviewer agent may introduce its own biases
- Convergence is not guaranteed -- agents may oscillate on subjective issues
- Requires clear, measurable review criteria to avoid subjective deadlocks
Best for: High-stakes code (security, financial logic, infrastructure), compliance-sensitive changes, teams that value code review culture.
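The bounded author/reviewer loop can be sketched as follows. The `review` and `revise` callables are toy placeholders; the structural points are the iteration cap and the explicit pass/fail result:

```python
# Sketch of a bounded peer review loop: author revises until the reviewer
# finds no issues, or the iteration cap is hit and a human is escalated to.
def peer_review_loop(draft, review, revise, max_iterations=3):
    artifact = draft
    for _ in range(max_iterations):
        issues = review(artifact)
        if not issues:
            return artifact, True     # quality gate passed
        artifact = revise(artifact, issues)
    return artifact, False            # did not converge: escalate to a human

# Toy criteria: the reviewer flags any artifact that lacks tests.
review = lambda a: [] if a.get("tests") else ["missing tests"]
revise = lambda a, issues: {**a, "tests": "added"}

artifact, passed = peer_review_loop({"code": "..."}, review, revise)
print(passed)
```

Returning an explicit `False` on non-convergence (rather than looping forever) is what makes the pattern "bounded by iteration limit" in practice -- the escalation path must exist.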
Pattern 5: Git-Branch Isolation
main ─────────────────────────────────────────────► main
│ ▲
├──► worktree/agent-1 ──► commit ──► PR ──┐ │
│ ├──► merge
├──► worktree/agent-2 ──► commit ──► PR ──┤ │
│ │ │
└──► worktree/agent-3 ──► commit ──► PR ──┘ │
│
CI validates
each PR before
merge
How it works: Each agent operates in its own Git branch or worktree. Agents cannot directly modify each other's files because they are working in separate filesystem snapshots. Coordination happens through Git (merges, PRs, rebases) rather than through shared memory or message passing.
AEEF implementation: This is AEEF's primary isolation model. The CLI wrapper creates branches (aeef/product, aeef/architect, aeef/dev, aeef/qc) and each agent operates exclusively within its branch. PRs serve as the coordination mechanism. Composio's agent-orchestrator uses the same pattern -- branch-per-agent with CI validation.
Strengths:
- Strongest isolation model -- agents cannot corrupt each other's work
- Full Git history provides complete audit trail
- CI/CD validates each agent's output independently
- Familiar model for any team that uses Git
- Easy rollback -- just discard the branch
Weaknesses:
- Merge conflicts when agents touch overlapping files
- Branch management overhead increases with agent count
- Slower than in-memory coordination for tightly coupled tasks
- Requires Git proficiency from the orchestration layer
Best for: Any production deployment. AEEF recommends Git-branch isolation as the default isolation model for all multi-agent workflows.
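Setting up branch-per-agent isolation can be sketched by shelling out to Git. The branch names follow the `aeef/<role>` convention described above; the repository here is a throwaway created in a temporary directory, and this is an illustration of the isolation mechanics, not the AEEF CLI's actual implementation:

```python
import os
import subprocess
import tempfile

def git(repo, *args):
    """Run a git command in the given repo and return its stdout."""
    result = subprocess.run(["git", "-C", repo, *args],
                            check=True, capture_output=True, text=True)
    return result.stdout

# Throwaway repository with a seed commit (branches need a commit to point at).
repo = tempfile.mkdtemp()
git(repo, "init")
git(repo, "config", "user.email", "agent@example.com")
git(repo, "config", "user.name", "agent")
with open(os.path.join(repo, "README.md"), "w") as f:
    f.write("seed\n")
git(repo, "add", ".")
git(repo, "commit", "-m", "seed")

# One branch per agent role; each agent commits only to its own branch.
for role in ("product", "architect", "dev", "qc"):
    git(repo, "branch", f"aeef/{role}")

branches = git(repo, "branch", "--list", "aeef/*")
print(branches)

# A worktree gives the dev agent its own filesystem snapshot of its branch,
# so it physically cannot modify files another agent is working on.
worktree = os.path.join(tempfile.mkdtemp(), "agent-dev")
git(repo, "worktree", "add", worktree, "aeef/dev")
print(os.path.isfile(os.path.join(worktree, "README.md")))
```

Rollback in this model really is as cheap as the text claims: `git branch -D aeef/dev` and `git worktree remove` discard an agent's entire output without touching `main`.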
Pattern Decision Matrix
Use this matrix to select the right pattern (or combination) for your situation:
| Factor | Sequential Pipeline | Parallel Swarm | Hierarchical | Peer Review | Git-Branch |
|---|---|---|---|---|---|
| Task independence | Low (serial deps) | High (required) | Medium | Low (paired) | High |
| Speed | Slowest | Fastest | Medium | Slow | Medium |
| Audit clarity | Excellent | Fair | Good | Good | Excellent |
| Isolation strength | Strong (branch) | Variable | Variable | Moderate | Strongest |
| Complexity to implement | Low | Medium | High | Low | Medium |
| Max concurrent agents | 1 | Unlimited | 1 master + N | 2 | N |
| Failure recovery | Restart from failed stage | Retry individual agent | Retry delegation | Iterate | Discard branch |
| AEEF native support | Yes (CLI baseline) | Via integrations | Via integrations | Yes (QC loop) | Yes (CLI baseline) |
Common Pattern Combinations
Real-world deployments rarely use a single pattern in isolation. Here are proven combinations:
Sequential + Peer Review (AEEF default)
The sequential pipeline handles the macro flow (product to architect to developer to QC), while peer review loops operate within stages. For example, the developer agent may run an internal review loop before handing off to QC.
Hierarchical + Parallel Swarm
The master agent decomposes the task and dispatches subtasks to a parallel swarm of workers. This is the pattern used by claude-flow's queen-led swarm. AEEF contracts can constrain each worker in the swarm.
Sequential + Git-Branch Isolation
Each stage in the sequential pipeline operates in its own Git branch. This is exactly what the AEEF CLI implements. It combines the clarity of sequential flow with the isolation strength of Git branches.
Parallel Swarm + Git-Branch Isolation
Multiple agents work simultaneously, each in their own worktree or branch. Merges happen when all agents complete. This is the pattern used by ccswarm and Composio.
How AEEF's Agent Models Implement These Patterns
The 4-Agent Baseline Model
AEEF's 4-agent model (Product, Architect, Developer, QC) implements a Sequential Pipeline with Git-Branch Isolation and Peer Review at the QC stage.
┌──────────────────────────────────────────────────────────────┐
│ AEEF 4-Agent Model │
│ │
│ Pattern: Sequential Pipeline + Git-Branch Isolation │
│ Isolation: Branch-per-role (aeef/product → aeef/qc) │
│ Handoff: Pull Requests with structured artifacts │
│ Governance: Hook-based contract enforcement │
│ Quality: Gate checks at each handoff + final QC review │
│ │
│ ┌─────────┐ ┌──────────┐ ┌─────────┐ ┌────────┐ │
│ │ Product │───>│Architect │───>│Developer│───>│ QC │ │
│ │ Agent │ │ Agent │ │ Agent │ │ Agent │ │
│ └─────────┘ └──────────┘ └─────────┘ └────────┘ │
│ │ │ │ │ │
│ PRD.md DESIGN.md Code+Tests Test Report │
│ aeef/product aeef/architect aeef/dev aeef/qc │
│ │ │ │ │ │
│ └──── PR ──────┘───── PR ──────┘──── PR ─────┘ │
│ │ │
│ PR to main │
└──────────────────────────────────────────────────────────────┘
Agent contracts restrict each role:
- Product agent: May create/edit markdown files only. No code, no tests.
- Architect agent: May create/edit design documents and configuration. No application code.
- Developer agent: May create/edit application code, tests, and configuration. Must follow architect's design.
- QC agent: May run tests, write test reports, and approve/reject. May not modify application code.
Quality gates enforce at each handoff:
- PRD completeness check (product to architect)
- Design coverage check (architect to developer)
- Test coverage threshold (developer to QC)
- Full compliance check (QC to main)
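Contract enforcement at a handoff can be sketched as a per-role file-glob allowlist checked against the changeset. The glob patterns below are illustrative simplifications of the role descriptions above -- AEEF's real contracts are hook-based, not this Python function:

```python
from fnmatch import fnmatch

# Illustrative per-role allowlists (hypothetical globs, not AEEF's real
# contract format). Note: fnmatch's "*" also matches "/" in paths.
CONTRACTS = {
    "product":   ["*.md", "docs/*.md"],
    "architect": ["docs/*.md", "*.yaml"],
    "developer": ["src/*", "tests/*", "*.yaml"],
    "qc":        ["reports/*"],
}

def contract_allows(role, path):
    return any(fnmatch(path, pattern) for pattern in CONTRACTS[role])

def gate_changeset(role, changed_files):
    """Reject the handoff if any changed file falls outside the contract."""
    violations = [p for p in changed_files if not contract_allows(role, p)]
    return (len(violations) == 0), violations

# A product agent touching application code violates its contract.
ok, violations = gate_changeset("product", ["docs/PRD.md", "src/app.py"])
print(ok, violations)
```

Running this check at the gate (before the PR, not after the merge) is what turns the role descriptions above from guidelines into enforced constraints.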
The 11-Agent Production Model
AEEF's 11-agent model extends the 4-agent baseline for enterprise and regulated environments. It adds specialized agents for security, compliance, operations, documentation, and more.
┌────────────────────────────────────────────────────────────────────┐
│ AEEF 11-Agent Production Model │
│ │
│ Patterns: Sequential + Hierarchical + Peer Review │
│ │
│ Core Pipeline: │
│ Product → Architect → Developer → QC → Release │
│ │
│ Specialist Agents (triggered by context): │
│ ┌────────────┐ ┌──────────────┐ ┌─────────────┐ │
│ │ Security │ │ Compliance │ │ DevOps │ │
│ │ Agent │ │ Agent │ │ Agent │ │
│ └────────────┘ └──────────────┘ └─────────────┘ │
│ ┌────────────┐ ┌──────────────┐ ┌─────────────┐ │
│ │ Docs │ │ Incident │ │ Monitor │ │
│ │ Agent │ │ Agent │ │ Agent │ │
│ └────────────┘ └──────────────┘ └─────────────┘ │
│ │
│ Orchestration: Hierarchical delegation from Architect │
│ to specialist agents, with peer review between Security │
│ and Developer, Compliance and Release. │
└────────────────────────────────────────────────────────────────────┘
The 11-agent model introduces hierarchical delegation at the architect stage: the architect agent can delegate specialized tasks to the security, compliance, or DevOps agents. It also introduces peer review loops between specialist agents: the security agent reviews the developer's output, and the compliance agent reviews the release agent's output.
See the Agent Orchestration Model page for the full role map and the Production Tier for the implementation.
Choosing an Orchestration Strategy
Decision Tree
START
│
├─ Single developer, small project?
│ └─ No orchestration needed. Use AEEF CLI with --role=developer.
│
├─ Small team (2-5), sequential feature work?
│ └─ Sequential Pipeline (AEEF 4-agent model).
│ Use: aeef --role=product → architect → developer → qc
│
├─ Large codebase, independent modules?
│ └─ Parallel Swarm + Git-Branch Isolation.
│ Use: claude-flow, ccswarm, or claude-squad with AEEF contracts.
│
├─ Complex feature spanning multiple domains?
│ └─ Hierarchical Delegation + Sequential Pipeline.
│ Use: AEEF 11-agent model with architect as coordinator.
│
├─ High-stakes code requiring rigorous review?
│ └─ Peer Review Loop + Sequential Pipeline.
│ Use: AEEF QC agent with iteration loops.
│
└─ Regulated environment requiring full audit?
└─ Sequential Pipeline + Git-Branch Isolation + Peer Review.
Use: AEEF Production Tier (11-agent model with all governance).
Factors to Weigh
- Team size: Small teams benefit from sequential simplicity. Large teams need parallel patterns to avoid bottlenecks.
- Regulatory requirements: Regulated industries need full audit trails. Sequential pipelines with Git-branch isolation provide the strongest audit model.
- Task independence: Highly coupled tasks require sequential or hierarchical patterns. Independent tasks benefit from parallel swarms.
- Risk tolerance: Higher risk requires more review loops and quality gates. Lower risk allows faster, more parallel patterns.
- Tool maturity: If your orchestration tool is new or experimental, start with the simplest pattern (sequential) and add complexity as you gain confidence.
Anti-Patterns to Avoid
The Free-for-All
Multiple agents with no coordination, no file ownership, no quality gates. They all modify the same files and produce merge conflicts that no one can resolve. This is not orchestration -- it is chaos.
The Single Agent Pretending to Be Many
One agent with multiple "personas" switching between roles in a single conversation. This provides the overhead of multi-agent without the benefits (no isolation, no independent review, no parallel execution).
Quality Gate Theater
Quality gates that always pass. If your QC agent approves every change, you do not have a quality gate -- you have a rubber stamp. Quality gates must have teeth: defined thresholds, real rejection criteria, and the authority to block handoffs.
Over-Orchestration
Using an 11-agent model for a 50-line bug fix. Orchestration overhead should be proportional to task complexity. AEEF supports this through role selection: aeef --role=developer for simple tasks, the full pipeline for feature work.
Ignoring the Human
Fully autonomous agent pipelines that produce "done" artifacts without human review at any stage. Humans should be in the loop for architectural decisions, security-sensitive changes, and final approval. Orchestration should make human review efficient, not eliminate it.
Key Takeaways
- Orchestration is not optional at scale. The moment you have more than one agent, you need coordination.
- AEEF spans Layers 2 and 3 (Orchestration and Governance), giving you both coordination patterns and compliance enforcement in a single framework.
- Start with the Sequential Pipeline. It is the simplest pattern and the easiest to audit. Add parallel and hierarchical patterns as your team's needs and confidence grow.
- Git-Branch Isolation is the safest default. It provides strong isolation, full audit trails, and integrates with existing CI/CD systems.
- Combine patterns as needed. Real-world deployments use 2-3 patterns together. The decision matrix and decision tree above help you select the right combination.
- Governance makes autonomy possible. Quality gates, contracts, and provenance tracking are not overhead -- they are the safety net that allows you to increase agent autonomy over time.
Next Steps
- Multi-Agent Tools Comparison -- Evaluate orchestration tools and find the right one for your stack.
- Integration Patterns -- Concrete code examples for using AEEF with CrewAI, claude-flow, LangGraph, and more.
- AEEF CLI Wrapper -- Get started with AEEF's native orchestration in 30 minutes.
- Agent Orchestration Model -- The canonical 4-agent and 11-agent model specifications.