ARTICLE · 18 MIN READ · JANUARY 29, 2026
Chapter 7: Multi-Agent Collaboration
One agent hits walls. A team of specialized agents doesn't. Multi-agent collaboration lets you decompose complex problems into coordinated workstreams — each agent doing what it does best.
Why One Agent Isn’t Enough
Specialization: Focusing on a narrow domain to become very good at it. A specialist cardiologist is better at heart problems than a general practitioner, even though the GP knows more topics overall. The same principle applies to AI agents.
Orchestrator: The agent that manages and coordinates other agents. It receives the high-level goal, decomposes it, assigns sub-tasks, and aggregates results. Think of it as the project manager in a team.
Sequential vs parallel: Sequential = one after another (like a queue at a counter). Parallel = simultaneously (like multiple checkout lanes open at once). Sequential is simpler to reason about; parallel is faster for independent tasks.
Why do more roles mean worse LLM performance? When you stuff too many instructions into one system prompt ("be a researcher AND a writer AND a fact-checker"), the model splits its "attention" across all goals. It tends to do each poorly. Separate, focused system prompts produce better results for each role.
Every chapter up to now has been about making a single agent smarter — chain it, route it, run it in parallel, give it tools, make it reflect, make it plan. All of these improve a single agent’s performance.
But there’s a ceiling — and it’s a hard one.
The ceiling shows up in several distinct ways:
Knowledge breadth vs. depth. A single agent trying to be a domain expert in multiple fields simultaneously is like a company that puts one person in charge of engineering, legal, marketing, and finance all at once. Each role gets diluted attention, and the person’s expertise in any one domain never reaches the depth needed to excel.
System prompt dilution. When you write a system prompt that tries to make one agent do many things — “You are a researcher AND a writer AND a code reviewer AND a fact-checker” — you’re asking the model to activate multiple behavioral patterns simultaneously. These patterns often have conflicting heuristics. A researcher’s instinct is to gather more information; a writer’s instinct is to synthesize and move on. Held in tension within one agent, they produce mediocre results in both directions.
Context window saturation. A complex multi-domain task accumulates a lot of context: research findings, intermediate drafts, tool outputs, conversation history. A single agent accumulates all of this in one context window, which fills up fast. Multiple specialized agents each have their own context window focused only on their domain.
Error propagation. When a single agent makes a mistake early in a long task, that mistake often propagates through every subsequent step. In a multi-agent system, errors are isolated — a mistake by the Research Agent affects only the research phase, and the Writer Agent can still produce a good draft given the flawed research (though the final output will reflect the research quality).
The multi-agent solution addresses all four of these ceilings at once: specialization deepens expertise, focused system prompts eliminate role conflicts, dedicated context windows stay uncluttered, and errors stay isolated — while coordination mechanisms ensure a coherent final output.
A single agent handling a complex research project needs to be: a domain expert, a search specialist, a statistical analyst, a fact-checker, and a polished writer — simultaneously. The more roles you pile onto one system prompt, the worse it performs at each of them.
The same problem that prompted humans to build teams applies to agents: specialization beats generalism at scale.
Multi-agent collaboration breaks a complex goal into sub-tasks, assigns each to an agent built specifically for that task, and coordinates their outputs into a unified result. The researcher does the research. The analyst runs the numbers. The writer assembles the draft. The reviewer catches errors. Each agent excels at its role. The system as a whole exceeds what any single agent could produce.
What Multi-Agent Collaboration Is
Multi-agent collaboration is a system design pattern where:
- A complex goal is decomposed into discrete sub-tasks
- Each sub-task is assigned to a specialized agent with the right tools, knowledge, or reasoning approach
- Agents coordinate through defined communication protocols — passing outputs, delegating tasks, sharing state
- A final output emerges from the coordinated work of the team
The critical ingredient is inter-agent communication: a standardized way for agents to exchange data, delegate work, and signal completion. Without it, you just have isolated agents running in parallel — not a team.
The Six Interaction Models
Agent teams can be structured in fundamentally different ways — single-agent, network, supervisor, supervisor-as-tool, hierarchical, and custom topologies. The choice of structure changes autonomy, fault tolerance, complexity, and scalability.
The Five Collaboration Patterns
Within multi-agent systems, agents interact through five fundamental patterns:
Sequential Handoff
Agent A completes its task and passes its output directly to Agent B. Output of A becomes input to B. Clean pipeline with clear dependencies.
Research → Write → Edit → Publish
Parallel Workstreams
Multiple agents work on independent sub-tasks simultaneously. Results are merged by a synthesizer. Reduces total latency for independent work.
News + Weather + Stocks → Report
Debate & Consensus
Agents with different perspectives evaluate options and discuss. The group reaches a more informed decision than any single agent could.
Pro + Con agents → Moderator → Verdict
Hierarchical Delegation
A manager agent dynamically assigns sub-tasks to worker agents based on their tool access. Workers report back, manager synthesizes.
Coordinator → Specialist A / B / C → Result
Critic-Reviewer
A creator agent produces output. A critic agent evaluates it for quality, compliance, or correctness. Creator revises based on critique.
Generator → Critic → Revised Output
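The critic-reviewer loop is the one pattern the framework examples below don't implement directly, so here is a minimal framework-agnostic sketch. The call_llm helper is a stub standing in for any real model client, and the "APPROVED" convention and two-round cap are illustrative choices, not a prescribed API.

def call_llm(system_prompt: str, user_prompt: str) -> str:
    """Stand-in for a real model call (Gemini, OpenAI, etc.)."""
    # Replace this stub with your LLM client of choice.
    return f"[{system_prompt[:20]}...] response to: {user_prompt[:40]}"

def critic_reviewer_loop(task: str, max_rounds: int = 2) -> str:
    # Creator produces a first draft.
    draft = call_llm("You are a technical writer.", task)

    for _ in range(max_rounds):
        # Critic evaluates the draft against the original task.
        critique = call_llm(
            "You are a strict reviewer. List concrete problems, or reply APPROVED.",
            f"Task: {task}\n\nDraft:\n{draft}",
        )
        if "APPROVED" in critique:
            break
        # Creator revises based on the critique.
        draft = call_llm(
            "You are a technical writer. Revise the draft to address every critique point.",
            f"Task: {task}\n\nDraft:\n{draft}\n\nCritique:\n{critique}",
        )
    return draft

print(critic_reviewer_loop("Write a 100-word summary of multi-agent collaboration."))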
Use Cases Across Domains
Complex Research
Searcher agent finds sources, summarizer agent digests them, trend-spotter identifies patterns, synthesizer writes the final report.
Software Development
Requirements analyst, code generator, test writer, and documentation agent collaborate — each handing off to the next in sequence.
Financial Analysis
Agents fetch stock data, analyze news sentiment, run technical analysis, and generate investment recommendations in parallel.
Customer Support
Front-line agent handles general queries. Specialist agents (billing, technical, escalation) are activated only when needed.
Supply Chain
Agents representing suppliers, manufacturers, and distributors collaborate to optimize inventory and logistics in real time.
Network Remediation
Multiple triage agents work concurrently to pinpoint failures. Each integrates with existing ML tooling while contributing to a coordinated remediation plan.
CrewAI: Researcher + Writer Team
CrewAI is designed for exactly this — defining a team of agents with distinct roles, assigning tasks, and orchestrating collaboration.
from crewai import Agent, Task, Crew, Process
from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash")
Defining the Agents
researcher = Agent(
    role='Senior Research Analyst',
    goal='Find and summarize the latest trends in AI.',
    backstory="You are an experienced research analyst with a knack for identifying key trends and synthesizing information.",
    verbose=True,
    allow_delegation=False,
)

writer = Agent(
    role='Technical Content Writer',
    goal='Write a clear and engaging blog post based on research findings.',
    backstory="You are a skilled writer who can translate complex technical topics into accessible content.",
    verbose=True,
    allow_delegation=False,
)
Two agents, two roles, zero overlap. The researcher knows nothing about writing; the writer knows nothing about searching. This clean separation means each agent’s system prompt is focused — no role confusion, no competing instructions.
allow_delegation=False prevents each specialist from spawning sub-agents on their own. In a controlled pipeline, you want the delegation logic to live in the Crew definition, not in individual agents.
Defining the Tasks (with dependency)
research_task = Task(
    description="Research the top 3 emerging AI trends in 2024-2025. Focus on practical applications and impact.",
    expected_output="A detailed summary of the top 3 AI trends with key points and sources.",
    agent=researcher,
)

writing_task = Task(
    description="Write a 500-word blog post based on the research findings. Engaging, accessible to a general audience.",
    expected_output="A complete 500-word blog post about the latest AI trends.",
    agent=writer,
    context=[research_task],  # ← explicit dependency
)
context=[research_task] is the sequential handoff mechanism. It tells CrewAI: “the writing task cannot begin until research_task is complete, and the research output should be available to the writer as context.” Without this, the writer would run without access to the researcher’s findings — generating generic content rather than building on real research.
Assembling the Crew
blog_creation_crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, writing_task],
    process=Process.sequential,
    llm=llm,
    verbose=2,
)

result = blog_creation_crew.kickoff()
Process.sequential ensures tasks run in the order they’re listed — researcher first, writer second. The context dependency enforces this even if the process allows parallelism.
llm at the Crew level sets a default model for all agents. Individual agents can still override with their own llm= parameter if specialization is needed — e.g., a more powerful model for the researcher, a faster one for the writer.
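As a sketch of that per-agent override — the same researcher as above, given its own llm; the model names here are illustrative, not recommendations:

researcher = Agent(
    role='Senior Research Analyst',
    goal='Find and summarize the latest trends in AI.',
    backstory="You are an experienced research analyst.",
    llm=ChatGoogleGenerativeAI(model="gemini-1.5-pro"),  # overrides the crew-level default
    allow_delegation=False,
)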
CrewAI Data Flow
kickoff() runs research_task on the researcher → the Crew stores its output → writing_task runs on the writer with that output injected as context → the finished blog post is returned as the crew result.
Google ADK: Five Orchestration Patterns
The ADK provides explicit primitives for each collaboration structure. Here are the five most important.
1. Hierarchical: Parent-Child Agents
from google.adk.agents import LlmAgent, BaseAgent
from google.adk.agents.invocation_context import InvocationContext
from google.adk.events import Event
from typing import AsyncGenerator
class TaskExecutor(BaseAgent):
    """A specialized agent with custom, non-LLM behavior."""
    name: str = "TaskExecutor"
    description: str = "Executes a predefined task."

    async def _run_async_impl(
        self, context: InvocationContext
    ) -> AsyncGenerator[Event, None]:
        yield Event(author=self.name, content="Task finished successfully.")
BaseAgent is the extension point for non-LLM agents. When you need deterministic logic, API calls, or custom execution — instead of LLM reasoning — subclass BaseAgent and implement _run_async_impl. It yields Event objects as output, integrating cleanly with the ADK event stream.
greeter = LlmAgent(name="Greeter", model="gemini-2.0-flash-exp", instruction="You are a friendly greeter.")
task_doer = TaskExecutor()
coordinator = LlmAgent(
    name="Coordinator",
    model="gemini-2.0-flash-exp",
    instruction="When asked to greet, delegate to the Greeter. When asked to perform a task, delegate to the TaskExecutor.",
    sub_agents=[greeter, task_doer],
)
# ADK sets parent_agent automatically
assert greeter.parent_agent == coordinator
assert task_doer.parent_agent == coordinator
sub_agents establishes the parent-child relationship. The coordinator’s LLM reads both agents’ descriptions and decides which to delegate to. The parent_agent attribute is set automatically by the ADK framework — you don’t need to wire it manually.
2. LoopAgent: Iterative Workflows
from google.adk.agents import LoopAgent, LlmAgent, BaseAgent
from google.adk.agents.invocation_context import InvocationContext
from google.adk.events import Event, EventActions
from typing import AsyncGenerator

class ConditionChecker(BaseAgent):
    """Stops the loop when session state signals completion."""
    name: str = "ConditionChecker"
    description: str = "Checks if a process is complete and signals the loop to stop."

    async def _run_async_impl(
        self, context: InvocationContext
    ) -> AsyncGenerator[Event, None]:
        status = context.session.state.get("status", "pending")
        is_done = (status == "completed")
        if is_done:
            yield Event(author=self.name, actions=EventActions(escalate=True))  # stop loop
        else:
            yield Event(author=self.name, content="Condition not met, continuing loop.")
EventActions(escalate=True) is the loop-termination signal. When ConditionChecker yields this, the LoopAgent stops regardless of remaining iterations. Without it, the loop runs to max_iterations.
context.session.state is the shared key-value store accessible to all agents within a session. ProcessingStep writes "status": "completed" there; ConditionChecker reads it. This is how agents communicate state in ADK without direct message passing.
process_step = LlmAgent(
    name="ProcessingStep",
    model="gemini-2.0-flash-exp",
    instruction="Perform your task. If you are the final step, set session state 'status' to 'completed'.",
)

poller = LoopAgent(
    name="StatusPoller",
    max_iterations=10,
    sub_agents=[process_step, ConditionChecker()],
)
LoopAgent runs its sub_agents sequentially on each iteration. Each iteration: ProcessingStep executes → ConditionChecker evaluates → if not done, repeat. Maximum 10 iterations as a safety cap. Use this for polling, retry workflows, and iterative refinement loops.
3. SequentialAgent: Linear Pipelines
from google.adk.agents import SequentialAgent, Agent
step1 = Agent(name="Step1_Fetch", output_key="data")
step2 = Agent(name="Step2_Process", instruction="Analyze the information in state['data'] and provide a summary.")
pipeline = SequentialAgent(name="MyPipeline", sub_agents=[step1, step2])
output_key="data"— whenStep1_Fetchcompletes, its output is automatically stored insession.state["data"].Step2_Processreads it from state via its instruction. This is the ADK equivalent of CrewAI’scontext=[task]— explicit state-passing between sequential steps.
4. ParallelAgent: Concurrent Workers
from google.adk.agents import Agent, ParallelAgent
weather_fetcher = Agent(
    name="weather_fetcher",
    model="gemini-2.0-flash-exp",
    instruction="Fetch the weather for the given location and return only the report.",
    output_key="weather_data",  # → session.state["weather_data"]
)

news_fetcher = Agent(
    name="news_fetcher",
    model="gemini-2.0-flash-exp",
    instruction="Fetch the top news story for the given topic and return only that story.",
    output_key="news_data",  # → session.state["news_data"]
)

data_gatherer = ParallelAgent(
    name="data_gatherer",
    sub_agents=[weather_fetcher, news_fetcher],
)
ParallelAgent fires both sub-agents concurrently. Each writes to a different output_key in session state — weather_data and news_data — with no conflict since the keys are distinct. After both complete, a downstream agent (or your application code) can read both from state.
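One way to consume those results is to nest the ParallelAgent inside a SequentialAgent and add a synthesizer that reads both keys from state. The report_writer agent below is illustrative, but it uses only constructs already shown in this chapter — output_key and {state_key} templating in instructions:

from google.adk.agents import Agent, SequentialAgent

# weather_fetcher, news_fetcher, and data_gatherer as defined above.
report_writer = Agent(
    name="report_writer",
    model="gemini-2.0-flash-exp",
    # {weather_data} and {news_data} are filled in from session.state,
    # where the two parallel fetchers wrote their outputs.
    instruction=(
        "Write a short morning briefing.\n"
        "Weather: {weather_data}\n"
        "Top story: {news_data}"
    ),
    output_key="briefing",
)

# Fan out (parallel gather), then fan in (sequential synthesis).
morning_briefing = SequentialAgent(
    name="morning_briefing",
    sub_agents=[data_gatherer, report_writer],
)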
5. Agent as a Tool
from google.adk.agents import LlmAgent
from google.adk.tools import agent_tool
def generate_image(prompt: str) -> dict:
    """Generates an image from a textual prompt. Returns image bytes."""
    mock_bytes = b"mock_image_data_for_a_cat_wearing_a_hat"
    return {"status": "success", "image_bytes": mock_bytes, "mime_type": "image/png"}
image_generator_agent = LlmAgent(
    name="ImageGen",
    model="gemini-2.0-flash",
    description="Generates an image based on a detailed text prompt.",
    instruction="Take the user's request and use the generate_image tool to create the image.",
    tools=[generate_image],
)

image_tool = agent_tool.AgentTool(
    agent=image_generator_agent,
    description="Use this to generate an image. Input should be a descriptive prompt.",
)

artist_agent = LlmAgent(
    name="Artist",
    model="gemini-2.0-flash",
    instruction="Invent a creative image prompt. Then use the ImageGen tool to generate it.",
    tools=[image_tool],
)
AgentTool wraps a sub-agent and makes it callable as a tool from a parent agent’s perspective. The artist_agent sees image_tool the same way it sees generate_image — as a tool it can call with arguments. The tool’s description is what the parent’s LLM reads to decide when to invoke it.
Why “agent as a tool” instead of sub_agents? With sub_agents, the coordinator’s LLM decides routing. With AgentTool, the parent agent explicitly invokes the sub-agent like a function call — passing specific arguments and receiving a return value. More precise, less autonomous.
Framework Comparison
| Feature | CrewAI | Google ADK |
|---|---|---|
| Team definition | Agent + Task + Crew | LlmAgent / BaseAgent + orchestrators |
| Sequential flow | Process.sequential + context=[] | SequentialAgent + output_key |
| Parallel flow | Multiple tasks with no context dependency | ParallelAgent |
| Loop flow | Custom code / while loop | LoopAgent + EventActions(escalate=True) |
| Hierarchical | allow_delegation=True | sub_agents=[] on LlmAgent |
| Agent as tool | Not native (use LangChain tools) | AgentTool wrapper |
| State passing | context=[task] in task definition | output_key → session.state |
| Custom agent logic | Not supported natively | BaseAgent._run_async_impl() |
At a Glance
A system of specialized agents, each owning a distinct role and set of tools, coordinating to achieve a complex shared goal that no single agent could handle effectively alone.
Specialization beats generalism at scale. Decomposing a complex task into focused sub-tasks — each assigned to an agent built for it — produces higher quality, more reliable, and more scalable outcomes.
Use when a task requires diverse expertise, multiple distinct phases, or benefits from parallel workstreams. If a single agent's system prompt is trying to do too many jobs — split it into a team.
How Multi-Agent Communication Actually Works
In both CrewAI and ADK, agents don’t “talk to each other” directly — they communicate through shared data structures managed by the framework.
CrewAI: Task context as the communication channel. When you write context=[research_task] in a writing task definition, CrewAI stores the research task’s output after it completes. When the writing task runs, CrewAI injects that stored output into the writer agent’s context. The writer doesn’t know or care that a researcher ran before it — it just receives text in its context that says “here is the research that was done.” This is called indirect communication — agents share information through a coordinator (CrewAI) rather than directly messaging each other.
ADK: Session state as the communication channel. In ADK, agents communicate through session.state — a shared dictionary that all agents in a session can read from and write to. When weather_fetcher finishes, its output is stored as session.state["weather_data"] (because output_key="weather_data" was set). When a downstream agent runs, its instruction template can include {weather_data} — ADK automatically fills it from session state.
Why indirect communication is better than direct messaging. Direct agent-to-agent messaging introduces tight coupling — agent A needs to know about agent B’s interface, location, and availability. If B changes or fails, A breaks. Indirect communication through shared state decouples them — A writes to a named key, B reads from a named key, neither knows about the other. You can swap out A or B independently without changing the other. You can add a new agent C that also reads from the same key. You can replay any agent’s behavior by pre-populating the key manually for debugging.
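Stripped of any framework, the decoupling looks like this — two "agents" as plain functions that only know about named keys in a shared dict. The function names and keys are illustrative:

shared_state: dict = {}

def research_agent(state: dict) -> None:
    # Writes to a named key; knows nothing about who reads it.
    state["research"] = "EV market grew ~40% YoY in 2024."

def writer_agent(state: dict) -> None:
    # Reads from a named key; knows nothing about who wrote it.
    findings = state.get("research", "no research available")
    state["draft"] = f"Blog post draft based on: {findings}"

# Normal run: the coordinator calls the agents in order.
research_agent(shared_state)
writer_agent(shared_state)

# Debugging run: replay the writer by pre-populating its input key manually,
# without ever running the researcher.
debug_state = {"research": "Hand-written test findings."}
writer_agent(debug_state)
print(debug_state["draft"])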
How the orchestrator decides which agent to call. In ADK’s hierarchical mode, the coordinator’s LLM reads every sub-agent’s description field and decides which one to invoke based on the user’s request. This is the same mechanism as tool routing: the sub-agent’s description is the “docstring” that the coordinator LLM reads to make its routing decision. Write sub-agent descriptions with the same precision you’d use for tool docstrings.
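As an illustration (the billing domain and wording are hypothetical), compare a vague description with one written like a tool docstring:

from google.adk.agents import LlmAgent

# Vague — the coordinator's LLM can't tell when (or when not) to route here.
billing_vague = LlmAgent(
    name="Billing",
    model="gemini-2.0-flash-exp",
    description="Handles billing stuff.",
)

# Precise — states scope, expected input, and what it does NOT handle.
billing_precise = LlmAgent(
    name="Billing",
    model="gemini-2.0-flash-exp",
    description=(
        "Answers questions about invoices, refunds, and payment methods. "
        "Expects the customer's account ID in the request. "
        "Does not handle technical troubleshooting or account deletion."
    ),
)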
Common Mistakes When Building Multi-Agent Systems
Mistake 1: Over-specialization. Creating 15 highly specialized agents when 3 would do. Each agent boundary adds communication overhead (API calls, context window usage, potential errors in handoffs). Start with fewer, broader agents and specialize only when a single agent clearly fails at a specific task.
Mistake 2: No shared vocabulary. If the Research Agent produces “EV market growth: 40% YoY” and the Writer Agent expects a structured JSON report, the handoff fails. Define the exact format each agent should produce and each agent should expect. Use expected_output in CrewAI tasks or explicit instructions in ADK agents.
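One way to pin the handoff format down in CrewAI is to spell out the exact structure in expected_output. A sketch — the task and field names are illustrative:

market_research_task = Task(
    description="Research the EV market and report year-over-year growth.",
    # Spell out the exact shape the next agent should expect to receive.
    expected_output=(
        "A JSON object with exactly these keys: "
        "'metric' (string), 'value' (string, e.g. '40% YoY'), "
        "'period' (string), 'sources' (list of URLs)."
    ),
    agent=researcher,  # any researcher-style Agent, as defined earlier
)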
Mistake 3: Parallel agents with conflicting writes. In ADK’s ParallelAgent, if two sub-agents write to the same output_key, the second one overwrites the first. Always give parallel agents unique output_key values. If you forget, you’ll lose data silently — no error is raised.
Mistake 4: Not handling agent failure gracefully. If one agent in a sequential pipeline fails (API timeout, invalid output, exception), the pipeline crashes. In production, wrap each agent call in error handling and decide: should the pipeline retry, skip the failed step, or abort entirely?
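A minimal retry wrapper around a crew run might look like the sketch below — the retry count, backoff, and broad exception handling are illustrative choices you'd tune for your pipeline:

import time

def run_with_retries(crew, max_attempts: int = 3, backoff_seconds: float = 2.0):
    """Retry the whole crew run on failure, then abort with the last error."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return crew.kickoff()
        except Exception as exc:  # in practice, catch narrower exceptions
            last_error = exc
            print(f"Attempt {attempt} failed: {exc}")
            time.sleep(backoff_seconds * attempt)  # simple linear backoff
    raise RuntimeError(f"Crew failed after {max_attempts} attempts") from last_error

# result = run_with_retries(blog_creation_crew)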
Mistake 5: Building multi-agent before single-agent. Don’t jump to multi-agent complexity until a single agent provably fails at the task. Multi-agent systems are harder to debug, more expensive to run, and introduce new failure modes. Always start simple and add agents only when you’ve demonstrated a clear need.
Key Takeaways
- Specialization wins. A focused agent with a clear role outperforms a generalist agent trying to do everything. Split roles aggressively.
- The six interaction models — Single, Network, Supervisor, Supervisor-as-Tool, Hierarchical, Custom — each trade off autonomy vs. control vs. complexity. Most production systems use Hierarchical or Supervisor.
- CrewAI wires tasks, not agents. The context=[task] parameter is the key to sequential handoffs — it passes one agent’s output as context to the next without any manual wiring.
- ADK provides explicit primitives: SequentialAgent for linear pipelines, ParallelAgent for concurrent work, LoopAgent for iterative workflows, sub_agents for hierarchy, AgentTool for invoking an agent like a function.
- output_key is ADK’s state bus. Every agent in an ADK session shares session.state. Writing to a key with output_key and reading from it in another agent’s instruction is the standard inter-agent communication mechanism.
- EventActions(escalate=True) is how a LoopAgent stops. Always provide a clear stopping condition — and max_iterations as a safety cap.
- Failure isolation is a feature. When one agent in a multi-agent system fails, the others can continue. Compare this to a monolithic agent where one failed step can derail the entire task.
Next up — Chapter 8: Memory, where agents stop starting from scratch on every turn and begin building persistent context across conversations and sessions.