ARTICLE  ·  18 MIN READ  ·  JANUARY 29, 2026

Chapter 7: Multi-Agent Collaboration

One agent hits walls. A team of specialized agents doesn't. Multi-agent collaboration lets you decompose complex problems into coordinated workstreams — each agent doing what it does best.


Why One Agent Isn’t Enough

Before You Start — Key Terms Explained

Specialization: Focusing on a narrow domain to become very good at it. A specialist cardiologist is better at heart problems than a general practitioner, even though the GP knows more topics overall. The same principle applies to AI agents.

Orchestrator: The agent that manages and coordinates other agents. It receives the high-level goal, decomposes it, assigns sub-tasks, and aggregates results. Think of it as the project manager in a team.

Sequential vs parallel: Sequential = one after another (like a queue at a counter). Parallel = simultaneously (like multiple checkout lanes open at once). Sequential is simpler to reason about; parallel is faster for independent tasks.

Why do more roles mean worse LLM performance? When you stuff too many instructions into one system prompt ("be a researcher AND a writer AND a fact-checker"), the model splits its "attention" across all goals and tends to do each poorly. Separate, focused system prompts produce better results for each role.

Every chapter up to now has been about making a single agent smarter — chain it, route it, run it in parallel, give it tools, make it reflect, make it plan. All of these improve a single agent’s performance.

But there’s a ceiling — and it’s a hard one.

The ceiling shows up in several distinct ways:

Knowledge breadth vs. depth. A single agent trying to be a domain expert in multiple fields simultaneously is like a company that puts one person in charge of engineering, legal, marketing, and finance all at once. Each role gets diluted attention, and the person’s expertise in any one domain never reaches the depth needed to excel.

System prompt dilution. When you write a system prompt that tries to make one agent do many things — “You are a researcher AND a writer AND a code reviewer AND a fact-checker” — you’re asking the model to activate multiple behavioral patterns simultaneously. These patterns often have conflicting heuristics. A researcher’s instinct is to gather more information; a writer’s instinct is to synthesize and move on. Held in tension within one agent, they produce mediocre results in both directions.

Context window saturation. A complex multi-domain task accumulates a lot of context: research findings, intermediate drafts, tool outputs, conversation history. A single agent accumulates all of this in one context window, which fills up fast. Multiple specialized agents each have their own context window focused only on their domain.

Error propagation. When a single agent makes a mistake early in a long task, that mistake often propagates through every subsequent step. In a multi-agent system, errors are isolated — a mistake by the Research Agent affects only the research phase, and the Writer Agent can still produce a good draft given the flawed research (though the final output will reflect the research quality).

The multi-agent solution addresses all four of these ceilings simultaneously: specialization deepens expertise, dedicated context windows stay focused, errors are isolated, and coordination mechanisms ensure coherent final outputs.

A single agent handling a complex research project needs to be: a domain expert, a search specialist, a statistical analyst, a fact-checker, and a polished writer — simultaneously. The more roles you pile onto one system prompt, the worse it performs at each of them.

The same problem that prompted humans to build teams applies to agents: specialization beats generalism at scale.

Multi-agent collaboration breaks a complex goal into sub-tasks, assigns each to an agent built specifically for that task, and coordinates their outputs into a unified result. The researcher does the research. The analyst runs the numbers. The writer assembles the draft. The reviewer catches errors. Each agent excels at its role. The system as a whole exceeds what any single agent could produce.


What Multi-Agent Collaboration Is

Multi-agent collaboration is a system design pattern where:

  1. A complex goal is decomposed into discrete sub-tasks
  2. Each sub-task is assigned to a specialized agent with the right tools, knowledge, or reasoning approach
  3. Agents coordinate through defined communication protocols — passing outputs, delegating tasks, sharing state
  4. A final output emerges from the coordinated work of the team

MULTI-AGENT COLLABORATION PATTERN

Complex Goal: too large for any single agent
    ↓
Orchestrator Agent: decomposes goal · delegates to specialists
    ↓
Specialist Agents, each owning a distinct role, with separate tools, knowledge, and system prompts per agent:
    Research Agent: search · retrieve
    Analysis Agent: process · compute
    Writer Agent: draft · format
    ↓
Synthesis Agent: merges all specialist outputs · resolves conflicts
    ↓
Final Result: exceeds what any single agent could produce
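
The four numbered steps above can be sketched without any framework. This is a minimal illustration only: the decomposer and specialists are plain functions standing in for LLM-backed agents, and every name in it is invented for the sketch.

```python
# Illustrative sketch only: plain functions stand in for LLM-backed agents.

def decompose(goal: str) -> list[str]:
    # 1. A real orchestrator would use an LLM here; we hard-code sub-tasks.
    return [f"research: {goal}", f"draft: {goal}", f"review: {goal}"]

# 2. Each role maps to a specialist built for that sub-task.
SPECIALISTS = {
    "research": lambda task: f"[findings for '{task}']",
    "draft":    lambda task: f"[draft covering '{task}']",
    "review":   lambda task: f"[review notes on '{task}']",
}

def run_team(goal: str) -> str:
    outputs = []
    for sub_task in decompose(goal):
        role = sub_task.split(":")[0]            # 3. route to the right specialist
        outputs.append(SPECIALISTS[role](sub_task))
    return "\n".join(outputs)                    # 4. aggregate into one result

print(run_team("AI trends 2025"))
```

The real systems later in this chapter replace the hard-coded decomposer with an orchestrator LLM and the lambdas with full agents; the shape stays the same.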

The critical ingredient is inter-agent communication: a standardized way for agents to exchange data, delegate work, and signal completion. Without it, you just have isolated agents running in parallel — not a team.


The Six Interaction Models

Agent teams can be structured in six fundamentally different ways: Single, Network, Supervisor, Supervisor-as-Tool, Hierarchical, and Custom. The choice of structure changes autonomy, fault tolerance, complexity, and scalability.

Single Agent
One agent handles everything autonomously. Simple to build and manage, but constrained by a single agent's scope and resources. No inter-agent communication overhead. Suitable only when the task is fully self-contained.
Best for: simple, self-contained tasks with no need for specialization

The Five Collaboration Patterns

Within multi-agent systems, agents interact through five fundamental patterns:

Sequential Handoff

Agent A completes its task and passes its output directly to Agent B. Output of A becomes input to B. Clean pipeline with clear dependencies.

Research → Write → Edit → Publish
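
A sequential handoff is just function composition, which the sketch below makes literal. The stages are stubs standing in for LLM-backed agents.

```python
# Sequential handoff as function composition: the output of each stage is the
# input to the next. Stage functions are stubs standing in for real agents.

def research(topic: str) -> str:
    return f"notes on {topic}"

def write(notes: str) -> str:
    return f"draft based on ({notes})"

def edit(draft: str) -> str:
    return f"polished: {draft}"

def run_pipeline(topic: str) -> str:
    result = topic
    for stage in (research, write, edit):   # A → B → C, in order
        result = stage(result)
    return result

print(run_pipeline("AI trends"))  # → polished: draft based on (notes on AI trends)
```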

Parallel Workstreams

Multiple agents work on independent sub-tasks simultaneously. Results are merged by a synthesizer. Reduces total latency for independent work.

News + Weather + Stocks → Report
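
One way to sketch this pattern uses Python's standard thread pool; the fetchers here are stubs standing in for real tool-calling agents.

```python
# Parallel workstreams via a thread pool: independent fetchers (stubs here)
# run concurrently; a synthesizer step merges their results afterwards.
from concurrent.futures import ThreadPoolExecutor

def fetch_news(q: str) -> str:    return f"news({q})"
def fetch_weather(q: str) -> str: return f"weather({q})"
def fetch_stocks(q: str) -> str:  return f"stocks({q})"

def gather_report(query: str) -> str:
    workers = [fetch_news, fetch_weather, fetch_stocks]
    with ThreadPoolExecutor() as pool:
        # map preserves worker order even though execution is concurrent
        results = list(pool.map(lambda w: w(query), workers))
    return " | ".join(results)   # synthesizer merges the independent results

print(gather_report("NYC"))  # → news(NYC) | weather(NYC) | stocks(NYC)
```

For stub functions the concurrency buys nothing, but with real network-bound fetchers the total latency drops to roughly that of the slowest worker.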

Debate & Consensus

Agents with different perspectives evaluate options and discuss. The group reaches a more informed decision than any single agent could.

Pro + Con agents → Moderator → Verdict
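
A toy sketch of the structure, with hard-coded scores standing in for actual pro and con arguments produced by LLM agents:

```python
# Toy debate: fixed scores stand in for LLM-generated arguments.

def pro_agent(option: str) -> float:
    return 0.8   # stub: strength of the argument in favor

def con_agent(option: str) -> float:
    return 0.3   # stub: strength of the argument against

def moderator(option: str) -> str:
    support, opposition = pro_agent(option), con_agent(option)
    return "approve" if support > opposition else "reject"

print(moderator("ship the feature"))  # → approve
```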

Hierarchical Delegation

A manager agent dynamically assigns sub-tasks to worker agents based on their tool access. Workers report back, manager synthesizes.

Coordinator → Specialist A / B / C → Result

Critic-Reviewer

A creator agent produces output. A critic agent evaluates it for quality, compliance, or correctness. Creator revises based on critique.

Generator → Critic → Revised Output
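
A minimal creator-critic loop, with a simple string check standing in for an LLM critic and a bounded number of rounds as a safety cap:

```python
# Creator-critic loop: the critic returns feedback until the draft passes a
# simple check; max_rounds caps the number of revision cycles.

def create(draft: str, feedback):
    return draft if feedback is None else f"{draft} [revised: {feedback}]"

def critique(draft: str):
    # A real critic would be an LLM; here we just demand the word "sources".
    return None if "sources" in draft else "add sources"

def creator_critic(draft: str, max_rounds: int = 3) -> str:
    feedback = None
    for _ in range(max_rounds):
        draft = create(draft, feedback)
        feedback = critique(draft)
        if feedback is None:
            return draft        # critic approved the draft
    return draft                # hit the cap; return best effort

print(creator_critic("AI trends post"))
```

The same shape appears later as ADK's LoopAgent, where an escalation event plays the role of the critic's approval.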

Watch a Two-Agent Team Work

A Researcher and a Writer collaborate on a blog post via sequential handoff.

[Interactive demo in the original article: a Research Agent (Senior Research Analyst) and a Writer Agent (Technical Content Writer) work toward the shared goal "Write a blog post on the top 3 AI trends in 2025", handing off from research to writing.]

Use Cases Across Domains

  1. Complex Research: Searcher agent finds sources, summarizer agent digests them, trend-spotter identifies patterns, synthesizer writes the final report.
  2. Software Development: Requirements analyst, code generator, test writer, and documentation agent collaborate — each handing off to the next in sequence.
  3. Financial Analysis: Agents fetch stock data, analyze news sentiment, run technical analysis, and generate investment recommendations in parallel.
  4. Customer Support: Front-line agent handles general queries. Specialist agents (billing, technical, escalation) are activated only when needed.
  5. Supply Chain: Agents representing suppliers, manufacturers, and distributors collaborate to optimize inventory and logistics in real time.
  6. Network Remediation: Multiple triage agents work concurrently to pinpoint failures. Each integrates with existing ML tooling while contributing to a coordinated remediation plan.


CrewAI: Researcher + Writer Team

CrewAI is designed for exactly this — defining a team of agents with distinct roles, assigning tasks, and orchestrating collaboration.

from crewai import Agent, Task, Crew, Process
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash")

Defining the Agents

researcher = Agent(
    role      = 'Senior Research Analyst',
    goal      = 'Find and summarize the latest trends in AI.',
    backstory  = "You are an experienced research analyst with a knack for identifying key trends and synthesizing information.",
    verbose          = True,
    allow_delegation = False,
)

writer = Agent(
    role      = 'Technical Content Writer',
    goal      = 'Write a clear and engaging blog post based on research findings.',
    backstory  = "You are a skilled writer who can translate complex technical topics into accessible content.",
    verbose          = True,
    allow_delegation = False,
)

Two agents, two roles, zero overlap. The researcher knows nothing about writing; the writer knows nothing about searching. This clean separation means each agent’s system prompt is focused — no role confusion, no competing instructions.

allow_delegation=False — prevents each specialist from spawning sub-agents on their own. In a controlled pipeline, you want the delegation logic to live in the Crew definition, not in individual agents.

Defining the Tasks (with dependency)

research_task = Task(
    description      = "Research the top 3 emerging AI trends in 2024-2025. Focus on practical applications and impact.",
    expected_output  = "A detailed summary of the top 3 AI trends with key points and sources.",
    agent            = researcher,
)

writing_task = Task(
    description     = "Write a 500-word blog post based on the research findings. Engaging, accessible to a general audience.",
    expected_output = "A complete 500-word blog post about the latest AI trends.",
    agent           = writer,
    context         = [research_task],   # ← explicit dependency
)

context=[research_task] is the sequential handoff mechanism. It tells CrewAI: “the writing task cannot begin until research_task is complete, and the research output should be available to the writer as context.”

Without this, the writer would run without access to the researcher’s findings — generating generic content rather than building on real research.

Assembling the Crew

blog_creation_crew = Crew(
    agents  = [researcher, writer],
    tasks   = [research_task, writing_task],
    process = Process.sequential,
    llm     = llm,
    verbose = 2,
)

result = blog_creation_crew.kickoff()

Process.sequential ensures tasks run in the order they’re listed — researcher first, writer second. The context dependency enforces this even if the process allows parallelism.

llm at the Crew level sets a default model for all agents. Individual agents can still override with their own llm= parameter if specialization is needed — e.g., a more powerful model for the researcher, a faster one for the writer.

CrewAI Data Flow

CREWAI DATA FLOW — research output becomes writer input via context=[task]
Topic: "Top 3 AI trends 2025"
    ↓
Researcher Agent: Senior Research Analyst · finds sources · summarizes trends
    ↓
Crew Context: research output stored · context=[research_task] passes it automatically
    ↓
Writer Agent: Technical Content Writer · sees research output · writes blog post
    ↓
500-word Blog Post

Google ADK: Five Orchestration Patterns

The ADK provides explicit primitives for each collaboration structure. Here are the five most important.

1. Hierarchical: Parent-Child Agents

from google.adk.agents import LlmAgent, BaseAgent
from google.adk.agents.invocation_context import InvocationContext
from google.adk.events import Event
from typing import AsyncGenerator

class TaskExecutor(BaseAgent):
    """A specialized agent with custom, non-LLM behavior."""
    name:        str = "TaskExecutor"
    description: str = "Executes a predefined task."

    async def _run_async_impl(
        self, context: InvocationContext
    ) -> AsyncGenerator[Event, None]:
        yield Event(author=self.name, content="Task finished successfully.")

BaseAgent is the extension point for non-LLM agents. When you need deterministic logic, API calls, or custom execution — instead of LLM reasoning — subclass BaseAgent and implement _run_async_impl. It yields Event objects as output, integrating cleanly with the ADK event stream.

greeter     = LlmAgent(name="Greeter", model="gemini-2.0-flash-exp", instruction="You are a friendly greeter.")
task_doer   = TaskExecutor()

coordinator = LlmAgent(
    name        = "Coordinator",
    model       = "gemini-2.0-flash-exp",
    instruction = "When asked to greet, delegate to the Greeter. When asked to perform a task, delegate to the TaskExecutor.",
    sub_agents  = [greeter, task_doer],
)

# ADK sets parent_agent automatically
assert greeter.parent_agent   == coordinator
assert task_doer.parent_agent == coordinator

sub_agents establishes the parent-child relationship. The coordinator’s LLM reads both agents’ descriptions and decides which to delegate to. The parent_agent attribute is set automatically by the ADK framework — you don’t need to wire it manually.

ADK HIERARCHICAL AGENTS — parent delegates to children via sub_agents
Coordinator (LlmAgent): reads user query · decides which child to delegate to based on each agent's description
    ├─ Greeter (LlmAgent): activated when greeting is needed · LLM-powered · uses Gemini
    └─ TaskExecutor (BaseAgent): custom logic · non-LLM · runs deterministic code

2. LoopAgent: Iterative Workflows

from google.adk.agents import LoopAgent, LlmAgent, BaseAgent
from google.adk.agents.invocation_context import InvocationContext
from google.adk.events import Event, EventActions
from typing import AsyncGenerator

class ConditionChecker(BaseAgent):
    """Stops the loop when session state signals completion."""
    name:        str = "ConditionChecker"
    description: str = "Checks if a process is complete and signals the loop to stop."

    async def _run_async_impl(self, context: InvocationContext) -> AsyncGenerator[Event, None]:
        status  = context.session.state.get("status", "pending")
        is_done = (status == "completed")
        if is_done:
            yield Event(author=self.name, actions=EventActions(escalate=True))  # stop loop
        else:
            yield Event(author=self.name, content="Condition not met, continuing loop.")

EventActions(escalate=True) is the loop-termination signal. When ConditionChecker yields this, the LoopAgent stops regardless of remaining iterations. Without it, the loop runs to max_iterations.

context.session.state is the shared key-value store accessible to all agents within a session. ProcessingStep writes "status": "completed" there; ConditionChecker reads it. This is how agents communicate state in ADK without direct message passing.

process_step = LlmAgent(
    name        = "ProcessingStep",
    model       = "gemini-2.0-flash-exp",
    instruction = "Perform your task. If you are the final step, set session state 'status' to 'completed'.",
)

poller = LoopAgent(
    name           = "StatusPoller",
    max_iterations = 10,
    sub_agents     = [process_step, ConditionChecker()],
)

LoopAgent runs its sub_agents sequentially on each iteration. Each iteration: ProcessingStep executes → ConditionChecker evaluates → if not done, repeat. Maximum 10 iterations as a safety cap. Use this for polling, retry workflows, and iterative refinement loops.

3. SequentialAgent: Linear Pipelines

from google.adk.agents import SequentialAgent, Agent

step1 = Agent(name="Step1_Fetch",   model="gemini-2.0-flash-exp", output_key="data")
step2 = Agent(name="Step2_Process", model="gemini-2.0-flash-exp", instruction="Analyze the information in state['data'] and provide a summary.")

pipeline = SequentialAgent(name="MyPipeline", sub_agents=[step1, step2])

output_key="data" — when Step1_Fetch completes, its output is automatically stored in session.state["data"]. Step2_Process reads it from state via its instruction. This is the ADK equivalent of CrewAI’s context=[task] — explicit state-passing between sequential steps.

4. ParallelAgent: Concurrent Workers

from google.adk.agents import Agent, ParallelAgent

weather_fetcher = Agent(
    name        = "weather_fetcher",
    model       = "gemini-2.0-flash-exp",
    instruction = "Fetch the weather for the given location and return only the report.",
    output_key  = "weather_data",   # → session.state["weather_data"]
)

news_fetcher = Agent(
    name        = "news_fetcher",
    model       = "gemini-2.0-flash-exp",
    instruction = "Fetch the top news story for the given topic and return only that story.",
    output_key  = "news_data",      # → session.state["news_data"]
)

data_gatherer = ParallelAgent(
    name       = "data_gatherer",
    sub_agents = [weather_fetcher, news_fetcher],
)

ParallelAgent fires both sub-agents concurrently. Each writes to a different output_key in session state — weather_data and news_data — with no conflict since the keys are distinct. After both complete, a downstream agent (or your application code) can read both from state.

ADK PARALLEL AGENT — two workers fire simultaneously, results in shared state
ParallelAgent: fires both sub-agents at the same time, without waiting for one before starting the other
    ├─ weather_fetcher: output_key "weather_data" → written to session state
    └─ news_fetcher: output_key "news_data" → written to session state
        ↓
Session State: shared key-value store · both results land here · no conflict since the keys are different
    ↓
Synthesizer: downstream agent combines weather_data + news_data into the final response

5. Agent as a Tool

from google.adk.agents import LlmAgent
from google.adk.tools import agent_tool

def generate_image(prompt: str) -> dict:
    """Generates an image from a textual prompt. Returns image bytes."""
    mock_bytes = b"mock_image_data_for_a_cat_wearing_a_hat"
    return {"status": "success", "image_bytes": mock_bytes, "mime_type": "image/png"}

image_generator_agent = LlmAgent(
    name        = "ImageGen",
    model       = "gemini-2.0-flash",
    description = "Generates an image based on a detailed text prompt.",
    instruction = "Take the user's request and use the generate_image tool to create the image.",
    tools       = [generate_image],
)

image_tool = agent_tool.AgentTool(
    agent       = image_generator_agent,
    description = "Use this to generate an image. Input should be a descriptive prompt.",
)

artist_agent = LlmAgent(
    name        = "Artist",
    model       = "gemini-2.0-flash",
    instruction = "Invent a creative image prompt. Then use the ImageGen tool to generate it.",
    tools       = [image_tool],
)

AgentTool wraps a sub-agent and makes it callable as a tool from a parent agent’s perspective. The artist_agent sees image_tool the same way it sees generate_image — as a tool it can call with arguments. The tool’s description is what the parent’s LLM reads to decide when to invoke it.

Why “agent as a tool” instead of sub_agents? With sub_agents, the coordinator’s LLM decides routing. With AgentTool, the parent agent explicitly invokes the sub-agent like a function call — passing specific arguments and receiving a return value. More precise, less autonomous.


Framework Comparison

Feature             | CrewAI                                    | Google ADK
--------------------|-------------------------------------------|-----------------------------------------
Team definition     | Agent + Task + Crew                       | LlmAgent / BaseAgent + orchestrators
Sequential flow     | Process.sequential + context=[]           | SequentialAgent + output_key
Parallel flow       | Multiple tasks with no context dependency | ParallelAgent
Loop flow           | Custom code / while loop                  | LoopAgent + EventActions(escalate=True)
Hierarchical        | allow_delegation=True                     | sub_agents=[] on LlmAgent
Agent as tool       | Not native (use LangChain tools)          | AgentTool wrapper
State passing       | context=[task] in task definition         | output_key → session.state
Custom agent logic  | Not supported natively                    | BaseAgent._run_async_impl()

At a Glance

WHAT

A system of specialized agents, each owning a distinct role and set of tools, coordinating to achieve a complex shared goal that no single agent could handle effectively alone.

WHY

Specialization beats generalism at scale. Decomposing a complex task into focused sub-tasks — each assigned to an agent built for it — produces higher quality, more reliable, and more scalable outcomes.

RULE OF THUMB

Use when a task requires diverse expertise, multiple distinct phases, or benefits from parallel workstreams. If a single agent's system prompt is trying to do too many jobs — split it into a team.


How Multi-Agent Communication Actually Works

In both CrewAI and ADK, agents don’t “talk to each other” directly — they communicate through shared data structures managed by the framework.

CrewAI: Task context as the communication channel. When you write context=[research_task] in a writing task definition, CrewAI stores the research task’s output after it completes. When the writing task runs, CrewAI injects that stored output into the writer agent’s context. The writer doesn’t know or care that a researcher ran before it — it just receives text in its context that says “here is the research that was done.” This is called indirect communication — agents share information through a coordinator (CrewAI) rather than directly messaging each other.

ADK: Session state as the communication channel. In ADK, agents communicate through session.state — a shared dictionary that all agents in a session can read from and write to. When weather_fetcher finishes, its output is stored as session.state["weather_data"] (because output_key="weather_data" was set). When a downstream synthesis agent runs, its instruction template can include {weather_data} — ADK automatically fills this from session state.

Why indirect communication is better than direct messaging. Direct agent-to-agent messaging introduces tight coupling — agent A needs to know about agent B’s interface, location, and availability. If B changes or fails, A breaks. Indirect communication through shared state decouples them — A writes to a named key, B reads from a named key, neither knows about the other. You can swap out A or B independently without changing the other. You can add a new agent C that also reads from the same key. You can replay any agent’s behavior by pre-populating the key manually for debugging.
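
This decoupling can be mirrored with a plain dictionary playing the role of ADK's session.state; the agent names and keys below are illustrative.

```python
# Indirect communication through shared state: agents reference named keys,
# never each other. Either function can be swapped without touching the other.

state: dict[str, str] = {}

def research_agent(state: dict) -> None:
    state["research"] = "EV market growth: 40% YoY"   # writes to a named key

def writer_agent(state: dict) -> None:
    findings = state["research"]                      # reads the same key
    state["draft"] = f"Blog post citing: {findings}"

research_agent(state)
writer_agent(state)
print(state["draft"])
```

Debugging also gets easier: you can replay writer_agent in isolation by pre-populating state["research"] by hand.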

How the orchestrator decides which agent to call. In ADK’s hierarchical mode, the coordinator’s LLM reads every sub-agent’s description field and decides which one to invoke based on the user’s request. This is the same mechanism as tool routing: the sub-agent’s description is the “docstring” that the coordinator LLM reads to make its routing decision. Write sub-agent descriptions with the same precision you’d use for tool docstrings.

Common Mistakes When Building Multi-Agent Systems

Mistake 1: Over-specialization. Creating 15 highly specialized agents when 3 would do. Each agent boundary adds communication overhead (API calls, context window usage, potential errors in handoffs). Start with fewer, broader agents and specialize only when a single agent clearly fails at a specific task.

Mistake 2: No shared vocabulary. If the Research Agent produces “EV market growth: 40% YoY” and the Writer Agent expects a structured JSON report, the handoff fails. Define the exact format each agent should produce and each agent should expect. Use expected_output in CrewAI tasks or explicit instructions in ADK agents.

Mistake 3: Parallel agents with conflicting writes. In ADK’s ParallelAgent, if two sub-agents write to the same output_key, the second one overwrites the first. Always give parallel agents unique output_key values. If you forget, you’ll lose data silently — no error is raised.

Mistake 4: Not handling agent failure gracefully. If one agent in a sequential pipeline fails (API timeout, invalid output, exception), the pipeline crashes. In production, wrap each agent call in error handling and decide: should the pipeline retry, skip the failed step, or abort entirely?
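
One possible hardening sketch: wrap each stage in bounded retries, then abort with context when the attempts are exhausted. The retry-then-abort policy here is an illustrative choice, not a feature of either framework.

```python
# Illustrative retry policy: each stage gets a bounded number of attempts;
# on exhaustion the pipeline aborts with the failing stage named.
import time

def run_stage(stage, payload, retries=2, delay=0.0):
    for attempt in range(retries + 1):
        try:
            return stage(payload)
        except Exception as exc:
            if attempt == retries:   # out of attempts: abort with context
                raise RuntimeError(
                    f"{stage.__name__} failed after {retries + 1} attempts"
                ) from exc
            time.sleep(delay)        # back off before retrying

def run_pipeline(stages, payload):
    for stage in stages:
        payload = run_stage(stage, payload)
    return payload
```

Whether a failed stage should be retried, skipped, or the whole pipeline aborted depends on the task; the important part is deciding the policy explicitly instead of letting an unhandled exception decide for you.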

Mistake 5: Building multi-agent before single-agent. Don’t jump to multi-agent complexity until a single agent provably fails at the task. Multi-agent systems are harder to debug, more expensive to run, and introduce new failure modes. Always start simple and add agents only when you’ve demonstrated a clear need.

Key Takeaways

  • Specialization wins. A focused agent with a clear role outperforms a generalist agent trying to do everything. Split roles aggressively.
  • The six interaction models — Single, Network, Supervisor, Supervisor-as-Tool, Hierarchical, Custom — each trade off autonomy vs. control vs. complexity. Most production systems use Hierarchical or Supervisor.
  • CrewAI wires tasks, not agents. The context=[task] parameter is the key to sequential handoffs — it passes one agent’s output as context to the next without any manual wiring.
  • ADK provides explicit primitives: SequentialAgent for linear pipelines, ParallelAgent for concurrent work, LoopAgent for iterative workflows, sub_agents for hierarchy, AgentTool for invoking an agent like a function.
  • output_key is ADK’s state bus. Every agent in an ADK session shares session.state. Writing to a key with output_key and reading from it in another agent’s instruction is the standard inter-agent communication mechanism.
  • EventActions(escalate=True) is how a LoopAgent stops. Always provide a clear stopping condition — and max_iterations as a safety cap.
  • Failure isolation is a feature. When one agent in a multi-agent system fails, the others can continue. Compare this to a monolithic agent where one failed step can derail the entire task.

Next up — Chapter 8: Memory, where agents stop starting from scratch on every turn and begin building persistent context across conversations and sessions.



