8 Google ADK Multi-Agent Patterns That Overcome the Limitations of a Single LLM Agent
Everyone has likely experienced feeling lost while trying to solve everything with a single AI agent. I also initially crammed a massive number of prompts into a single LLM agent, only to hit a context limit and experience inconsistent response quality. Eventually, realizing "this is a structural issue," I began to delve deeply into multi-agent architecture.
Google has systematized 8 Multi-Agent Patterns for designing complex AI workflows through the Agent Development Kit (ADK). It is not simply about "using multiple agents," but a blueprint for an AI microservices architecture that clearly divides and combines roles. Since ADK Python code examples are provided for each pattern, you can establish criteria for deciding which pattern to choose in different situations and take away code to experiment with yourself.
The code is entirely Python-based, since Python is the primary language the ADK ships in; even with a Java or Go background, you can still get full value from the patterns and their trade-offs by reading them from an architectural perspective. The depth is pitched so that developers who already have hands-on LLM experience can follow along immediately.
Key Concepts
Three Basic Execution Models
Before diving into the 8 patterns, go over ADK's three basic execution models first; the rest will come much more naturally.
| Execution Model | ADK Class | Features |
|---|---|---|
| Sequential execution | SequentialAgent | A → B → C; each agent inherits the previous result |
| Loop execution | LoopAgent | Repeats child agents until a termination condition is met |
| Parallel execution | ParallelAgent | Runs multiple agents simultaneously |
If these three are Lego blocks, the eight patterns are structures that can be made with those blocks.
```python
from google.adk.agents import SequentialAgent, ParallelAgent, LoopAgent, LlmAgent

data_collector = LlmAgent(name="DataCollector", model="gemini-2.0-flash")
analyzer = LlmAgent(name="Analyzer", model="gemini-2.0-flash")
reporter = LlmAgent(name="Reporter", model="gemini-2.0-flash")

pipeline = SequentialAgent(
    name="DataPipeline",
    sub_agents=[data_collector, analyzer, reporter]
)
```

Full Map of 8 Patterns
| # | Pattern Name | Core Structure | When to Use |
|---|---|---|---|
| 1 | Sequential Pipeline | A → B → C Sequential Execution | Deterministic Data Transformation Flow |
| 2 | Coordinator / Dispatcher | Central LLM routes to specialist agents | Processing various input types |
| 3 | Parallel Fan-Out / Gather | Synthesizer aggregates after concurrent execution | Independent parallel tasks |
| 4 | Hierarchical Decomposition | Breaking down a higher-level goal into lower-level tasks | Very large, complex tasks |
| 5 | Generator-Critic | Generation → Review → Iteration | Deliverables where quality assurance is key |
| 6 | Human-in-the-Loop | Waiting for Human Approval for High-Risk Tasks | Irreversible Tasks, Compliance |
| 7 | Loop Agent | Repeat until termination condition is met | State-based iterative task |
| 8 | Custom / Agentic Workflow | LLM makes dynamic routing decisions at runtime | Unpredictable and complex flow |
AI Microservices Architecture: A structure where each agent has a single responsibility and can be independently tested and replaced. It applies the existing MSA philosophy to the AI agent layer.
Reading with the relationships between patterns in mind builds pattern-selection instincts much faster. In particular, examine the differences between similar patterns, such as numbers 5 and 7.
Pattern 1: Sequential Pipeline
This is the most intuitive pattern. It is a pipeline structure where the output of the previous agent becomes the input for the next, and honestly, it is the pattern I reach for first. It shines when the sequence is clear, such as data collection → analysis → report generation.
```python
from google.adk.agents import SequentialAgent, LlmAgent

research_agent = LlmAgent(
    name="Researcher",
    model="gemini-2.0-flash",
    instruction="Collect key data on the given topic."
)
summary_agent = LlmAgent(
    name="Summarizer",
    model="gemini-2.0-flash",
    instruction="Summarize the previous agent's research findings in three paragraphs."
)
fact_check_agent = LlmAgent(
    name="FactChecker",
    model="gemini-2.0-flash",
    instruction="Review the summary for errors and uncertain claims."
)

pipeline = SequentialAgent(
    name="ResearchPipeline",
    sub_agents=[research_agent, summary_agent, fact_check_agent]
)
```

One pitfall worth flagging: if an upstream agent fails, every downstream agent dutifully processes the meaningless result. Without validation logic between stages, incorrect results accumulate silently as they pass through the pipeline.
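A lightweight guard at each stage boundary turns that silent accumulation into an immediate, attributable error. In ADK, a check like this would typically run from an after_agent_callback; the function below is a plain-Python sketch (the name and the length threshold are illustrative choices, not ADK API):

```python
# Illustrative guard for pipeline stage outputs. In ADK you might invoke
# a check like this from an after_agent_callback; the 40-character
# threshold is an arbitrary heuristic for the sketch.
def validate_stage_output(agent_name: str, output: str) -> str:
    """Fail loudly instead of passing bad output downstream."""
    if not output or not output.strip():
        raise ValueError(f"{agent_name} produced empty output; halting pipeline")
    if len(output.strip()) < 40:
        raise ValueError(f"{agent_name} output suspiciously short: {output!r}")
    return output
```

Failing fast at the boundary is what makes the error traceable to a specific agent instead of surfacing three stages later.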
Pattern 2: Coordinator / Dispatcher
This is a pattern where a central LLM analyzes user intent and routes it to a specialized agent. You can think of it as a situation where a customer support system determines whether the request is for a payment inquiry or technical support.
```python
from google.adk.agents import LlmAgent

billing_agent = LlmAgent(
    name="BillingSpecialist",
    model="gemini-2.0-flash",
    instruction="Handles billing, refund, and subscription inquiries."
)
tech_agent = LlmAgent(
    name="TechSupport",
    model="gemini-2.0-flash",
    instruction="Handles technical errors, installation, and integration inquiries."
)

coordinator = LlmAgent(
    name="Coordinator",
    model="gemini-2.0-flash",
    instruction="""Analyze the user's inquiry and route it to the appropriate specialist.
- Billing/refunds/subscriptions → BillingSpecialist
- Technical errors/installation/integration → TechSupport""",
    sub_agents=[billing_agent, tech_agent]
)
```

The Coordinator's routing mechanism: when an LlmAgent receives sub_agents, the ADK automatically registers "agent transfer" tools that let it hand control to each sub-agent. The LLM analyzes the user input and makes the routing decision by invoking one of these tools as a tool call. Because the branching condition is the LLM's judgment rather than code, it responds flexibly to input types that were never predefined.

At first I also wondered, "the LLM handles routing automatically?", but in practice it is surprisingly accurate. However, if the instructions do not clearly spell out each specialist agent's role boundaries, requests do occasionally get routed to the wrong place.
Pattern 3: Parallel Fan-Out / Gather
It is a pattern that processes independent tasks simultaneously and combines the results into one. It can significantly reduce latency when querying multiple data sources simultaneously or translating into multiple languages at the same time.
```python
from google.adk.agents import SequentialAgent, ParallelAgent, LlmAgent

doc_search = LlmAgent(
    name="DocSearch",
    model="gemini-2.0-flash",
    instruction="Searches the official documentation for relevant material."
)
history_search = LlmAgent(
    name="HistorySearch",
    model="gemini-2.0-flash",
    instruction="Searches user history for similar cases."
)

parallel_search = ParallelAgent(
    name="ParallelSearch",
    sub_agents=[doc_search, history_search]
)

synthesizer = LlmAgent(
    name="Synthesizer",
    model="gemini-2.0-flash",
    instruction="Synthesizes the parallel search results into the best possible answer."
)

# A sequential wrapper is required to feed the ParallelAgent results into the Synthesizer
search_and_synthesize = SequentialAgent(
    name="SearchAndSynthesize",
    sub_agents=[parallel_search, synthesizer]
)
```

One caveat: defining only the ParallelAgent does not aggregate the parallel search results. You must wrap it in a SequentialAgent so the results flow on to the next agent; run it without the wrapper and the synthesizer operates with no input at all.
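The mechanism behind the "results flow to the next agent" step is session state: each parallel agent can write its final text under an output_key, and a later agent's instruction can reference those keys as {placeholders}. The snippet below simulates that data movement in plain Python, with no ADK dependency; the key names and strings are illustrative:

```python
# Plain-Python simulation of ADK's output_key + {placeholder} templating.
# Nothing here is an ADK API call; it only shows how the data moves.
session_state: dict[str, str] = {}

def record_output(output_key: str, final_text: str) -> None:
    # ADK stores each agent's final response text under its output_key
    session_state[output_key] = final_text

# Fan-out: both searches write their results into shared state
record_output("doc_results", "3 matching documentation pages")
record_output("history_results", "1 similar historical case")

# Gather: the Synthesizer's instruction pulls both keys back in
template = "Combine {doc_results} and {history_results} into one answer."
prompt = template.format(**session_state)
```

Because the state keys are explicit, each branch of the fan-out stays independently testable: you can seed the state by hand and run only the gather step.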
Pattern 4: Hierarchical Decomposition
This is a pattern where a parent agent breaks down complex tasks that cannot fit into a single context window into sub-tasks and delegates them. It is useful when handling large-scale goals, such as designing an entire software project or analyzing legal documents spanning dozens of pages.
```python
from google.adk.agents import LlmAgent

frontend_agent = LlmAgent(
    name="FrontendArchitect",
    model="gemini-2.0-flash",
    instruction="Owns frontend architecture and UI component design."
)
backend_agent = LlmAgent(
    name="BackendArchitect",
    model="gemini-2.0-flash",
    instruction="Owns backend API design and the database schema."
)
infra_agent = LlmAgent(
    name="InfraEngineer",
    model="gemini-2.0-flash",
    instruction="Owns deployment infrastructure and CI/CD pipeline design."
)

project_manager = LlmAgent(
    name="ProjectManager",
    model="gemini-2.0-flash",
    instruction="""Analyze the project requirements and distribute the architecture design work.
Delegate appropriate sub-tasks to each specialist and integrate the results.
- UI/UX → FrontendArchitect
- API/DB → BackendArchitect
- Infrastructure/deployment → InfraEngineer""",
    sub_agents=[frontend_agent, backend_agent, infra_agent]
)
```

If the parent agent's decomposition criteria are not clearly established when you first adopt this pattern, it easily degenerates into a "manager agent makes every decision itself" anti-pattern, with child agents merely echoing messages instead of doing substantive work. The key is to explicitly define each child agent's scope of authority and output format in its instructions.
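One low-tech way to enforce that scope-and-format contract is to generate every child instruction from the same template, so no agent ships without an explicit in-scope list, out-of-scope list, and output structure. The helper and section names below are purely illustrative, not an ADK feature:

```python
# Illustrative instruction builder; section names and scopes are examples.
def build_child_instruction(role: str, in_scope: str, out_of_scope: str,
                            sections: list[str]) -> str:
    """Assemble a scoped instruction so every child follows the same contract."""
    header = "\n".join(f"## {s}" for s in sections)
    return (f"You own {role}.\nIn scope: {in_scope}.\n"
            f"Out of scope: {out_of_scope}.\n"
            f"Return your answer in exactly these sections:\n{header}\n")

BACKEND_INSTRUCTION = build_child_instruction(
    role="backend API design and the database schema",
    in_scope="endpoints, request/response shapes, tables, indexes",
    out_of_scope="UI components, deployment infrastructure",
    sections=["Endpoints", "Schema", "Open questions"],
)
```

The explicit "out of scope" line is what stops the parent from quietly re-delegating everything to one agent.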
Pattern 5: Generator-Critic (Iterative Refinement)
It is used for deliverables where quality is critical. The structure involves a Generator creating a draft, a Critic reviewing it, and repeating the process until it passes the criteria; this works exceptionally well for code reviews and documentation quality assurance in real-world scenarios.
```python
from google.adk.agents import LoopAgent, LlmAgent

generator = LlmAgent(
    name="Generator",
    model="gemini-2.0-flash",
    instruction="Draft a technical document. Incorporate any previous feedback."
)
critic = LlmAgent(
    name="Critic",
    model="gemini-2.0-flash",
    instruction="""Review the document and provide feedback.
If all criteria are met, include 'APPROVED' in your response and set the escalate flag.
Criteria: accuracy, clarity, completeness"""
)

refinement_loop = LoopAgent(
    name="RefinementLoop",
    sub_agents=[generator, critic],
    max_iterations=5
)
```

Loop termination mechanism: ADK's LoopAgent exits when a sub-agent raises the escalate signal. In practice, the instructions are set up so the Critic raises the flag alongside an 'APPROVED' output, or the output is parsed in an after_agent_callback that sets the flag directly. max_iterations is a safety net for the extreme case where the Critic never approves.
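The escalate handshake is easy to sketch in plain Python. In ADK itself the flag is raised through the tool or callback context (for example a small exit_loop tool the Critic calls); the stub classes below only mirror that control flow and are not ADK types:

```python
# Framework-free mock of the escalate-based loop exit. `ToolContext` and
# `Actions` here are stand-in stubs, not ADK classes.
class Actions:
    escalate: bool = False

class ToolContext:
    def __init__(self) -> None:
        self.actions = Actions()

def exit_loop(tool_context: ToolContext) -> dict:
    """Tool the Critic calls once the draft is APPROVED."""
    tool_context.actions.escalate = True
    return {"status": "loop terminated"}

def run_loop(reviews: list[str], max_iterations: int = 5) -> int:
    """Return the iteration at which the loop stopped."""
    ctx = ToolContext()
    for i, review in enumerate(reviews, start=1):
        if "APPROVED" in review:
            exit_loop(ctx)                 # Critic signals termination
        if ctx.actions.escalate or i >= max_iterations:
            return i                       # LoopAgent stops here
    return max_iterations
```

Note that max_iterations fires even when escalate never does, which is exactly the safety-net role it plays in the real LoopAgent.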
Pattern 6: Human-in-the-Loop
It is a pattern of temporarily pausing automation for irreversible or high-risk tasks to obtain human approval. It acts as a safety net in situations such as executing financial transactions, deleting large volumes of data, and sending messages to external systems.
```python
from google.adk.agents import LlmAgent
from google.adk.tools import FunctionTool

def request_human_approval(trade_details: dict) -> dict:
    """
    Simplified version for illustration.
    A real implementation goes through an external channel, e.g.:

    [production-pattern pseudocode]
    approval_id = generate_uuid()
    slack_client.chat_postMessage(
        channel="#trade-approvals",
        blocks=[
            {"type": "section", "text": f"Trade approval request: {trade_details}"},
            {"type": "actions", "elements": [
                {"type": "button", "text": "Approve", "value": f"approve:{approval_id}"},
                {"type": "button", "text": "Reject", "value": f"reject:{approval_id}"}
            ]}
        ]
    )
    result = approval_store.wait_for_response(approval_id, timeout_sec=3600)
    return {"approved": result.approved, "approver": result.user}
    """
    print(f"[Approval request] Trade details: {trade_details}")
    approval = input("Approve? (yes/no): ")
    return {"approved": approval.lower() == "yes", "details": trade_details}

approval_tool = FunctionTool(func=request_human_approval)

approval_agent = LlmAgent(
    name="ApprovalGateway",
    model="gemini-2.0-flash",
    instruction="Present the trade recommendation to the user and request approval. Abort processing if not approved.",
    tools=[approval_tool]
)
```

If this pattern is introduced without clear approval criteria, it eventually degrades into a button nobody actually reads. It is important to define explicit trigger thresholds from the start, such as "when the amount exceeds a certain level" or "when a specific account is affected."
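Those trigger thresholds stay honest when they live in a small, testable predicate that runs before the approval tool is ever invoked. The amounts and account names below are hypothetical placeholders:

```python
# Hypothetical risk policy; tune the threshold and account list to your domain.
HIGH_RISK_AMOUNT = 10_000
PROTECTED_ACCOUNTS = {"treasury", "escrow"}

def needs_human_approval(trade: dict) -> bool:
    """Gate only genuinely high-risk trades so approvals stay meaningful."""
    if trade.get("amount", 0) > HIGH_RISK_AMOUNT:
        return True
    if trade.get("account") in PROTECTED_ACCOUNTS:
        return True
    return False
```

Low-risk trades skip the human entirely, which is precisely what keeps reviewers actually reading the high-risk ones.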
Pattern 7: Loop Agent
It is a pattern that repeatedly executes a set of sub-agents until a termination condition is met. While Pattern 5 (Generator-Critic) is a special application case of LoopAgent, the Loop Agent pattern itself is a more general-purpose structure.
```python
from google.adk.agents import LoopAgent, LlmAgent

# Web-crawling scenario: repeat until collection is complete
data_fetcher = LlmAgent(
    name="DataFetcher",
    model="gemini-2.0-flash",
    instruction="""Fetch the data from the next page.
If there is no more data to collect, set the escalate flag."""
)
data_processor = LlmAgent(
    name="DataProcessor",
    model="gemini-2.0-flash",
    instruction="Cleans the collected data and converts it to the storage format."
)

crawl_loop = LoopAgent(
    name="CrawlLoop",
    sub_agents=[data_fetcher, data_processor],
    max_iterations=100
)
```

To draw a clear line against Generator-Critic: Generator-Critic is a narrow pattern that iterates toward a single goal, passing a quality bar, while Loop Agent is the broader concept, applicable to any iterative task with a state-based termination condition. Crawling, large-scale batch processing, and retry logic all fall within Loop Agent's scope. The two patterns look redundant, but they differ in their range of application.
Pattern 8: Custom / Agentic Workflow
This is the most flexible pattern where the LLM directly determines routing and tool combinations at runtime. While the previous seven are predefined structures, this pattern allows the LLM to configure the workflow itself.
```python
from google.adk.agents import LlmAgent
from google.adk.tools import FunctionTool

def search_documentation(query: str) -> str:
    return f"Documentation search results for '{query}'..."

def run_code_analysis(code: str) -> str:
    return "Code analysis results..."

def create_ticket(title: str, description: str) -> str:
    return f"Ticket created: {title}"

# The LLM decides the tool combination and ordering itself at runtime
agentic_agent = LlmAgent(
    name="AgenticWorkflow",
    model="gemini-2.0-flash",
    instruction="""Analyze the user's request and combine the tools needed to handle it.
Diagnose the problem, find a solution, and create a ticket if needed;
use your judgment as the situation demands.""",
    tools=[
        FunctionTool(func=search_documentation),
        FunctionTool(func=run_code_analysis),
        FunctionTool(func=create_ticket)
    ]
)
```

To be honest, I initially assumed the LLM would handle everything with this pattern, and promptly landed in debugging hell. Compared with the other patterns, it is far harder to trace which tools were called, why, and in what order. I recommend reserving this pattern for genuinely unpredictable, complex flows, and reviewing the structured patterns above first whenever possible.
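A cheap first step toward that traceability is logging every tool call with its arguments and latency. In ADK this would naturally hook into the tool callbacks or an external tracer such as OpenTelemetry; the decorator below is a framework-free sketch with a stub tool:

```python
# Framework-free tool-call logger; in ADK you would attach equivalent
# logic via tool callbacks or distributed tracing instead.
import functools
import time

call_log: list[dict] = []

def logged(fn):
    """Record every tool call with its arguments and wall-clock latency."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        call_log.append({
            "tool": fn.__name__,
            "args": args,
            "kwargs": kwargs,
            "ms": round((time.perf_counter() - start) * 1000, 2),
        })
        return result
    return wrapper

@logged
def search_documentation(query: str) -> str:
    # Stub standing in for a real documentation-search tool
    return f"docs hit for '{query}'"
```

After a confusing run, call_log gives you the exact sequence of tool invocations, which is usually enough to reconstruct why the LLM went where it did.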
Practical Application
Example 1: Customer Support System — Pattern Combination
In reality, patterns are used in combination much more often than individually. If you were to build an actual customer support system, you could combine these four patterns in this way.
```python
from google.adk.agents import SequentialAgent, ParallelAgent, LoopAgent, LlmAgent

# Parallel search (Pattern 3: Parallel Fan-Out)
doc_searcher = LlmAgent(name="DocSearcher", model="gemini-2.0-flash",
                        instruction="Searches the official docs for solutions.")
history_searcher = LlmAgent(name="HistorySearcher", model="gemini-2.0-flash",
                            instruction="Searches the history of similar cases.")
parallel_research = ParallelAgent(
    name="ParallelResearch",
    sub_agents=[doc_searcher, history_searcher]
)

# Quality assurance loop (Pattern 5: Generator-Critic)
answer_generator = LlmAgent(name="AnswerGenerator", model="gemini-2.0-flash",
                            instruction="Generates a customer answer from the search results.")
tone_critic = LlmAgent(name="ToneCritic", model="gemini-2.0-flash",
                       instruction="""Reviews the answer's tone and accuracy.
Respond 'APPROVED' if the criteria are met; otherwise give specific feedback.""")
quality_loop = LoopAgent(
    name="QualityLoop",
    sub_agents=[answer_generator, tone_critic],
    max_iterations=3
)

# Full technical-support pipeline (Pattern 1: Sequential)
tech_support_pipeline = SequentialAgent(
    name="TechSupportPipeline",
    sub_agents=[parallel_research, quality_loop]
)

# Top-level router (Pattern 2: Coordinator)
billing_specialist = LlmAgent(name="BillingSpecialist", model="gemini-2.0-flash",
                              instruction="Dedicated to billing and subscription inquiries.")
support_coordinator = LlmAgent(
    name="SupportCoordinator",
    model="gemini-2.0-flash",
    instruction="""Analyze the customer's inquiry type and route it to the right team.
- Billing/refunds/invoicing → BillingSpecialist
- Technical issues/errors/usage → TechSupportPipeline""",
    sub_agents=[billing_specialist, tech_support_pipeline]
)
```

| Component | Pattern Used | Role |
|---|---|---|
| SupportCoordinator | Coordinator | Routes after intent analysis |
| ParallelResearch | Parallel Fan-Out | Simultaneous document and history search |
| QualityLoop | Generator-Critic | Answer quality assurance |
| TechSupportPipeline | Sequential | Coordinates the technical support flow |
Example 2: Financial Transaction Processing — Including Human-in-the-Loop
Human-in-the-Loop is essential for irreversible tasks, such as actual trade execution. In ADK, you can insert the human approval step into the middle of the pipeline as a function tool.
```python
from google.adk.agents import SequentialAgent, LlmAgent
from google.adk.tools import FunctionTool

def request_human_approval(trade_details: dict) -> dict:
    # Real implementation: send the approval request via Slack webhook or
    # email, then poll asynchronously for the response
    print(f"[Approval request] Trade details: {trade_details}")
    approval = input("Approve? (yes/no): ")
    return {"approved": approval.lower() == "yes", "details": trade_details}

approval_tool = FunctionTool(func=request_human_approval)

data_agent = LlmAgent(name="MarketDataAgent", model="gemini-2.0-flash",
                      instruction="Collects real-time and historical market data.")
analysis_agent = LlmAgent(name="AnalysisAgent", model="gemini-2.0-flash",
                          instruction="Analyzes the portfolio and market data.")
recommendation_agent = LlmAgent(name="RecommendationAgent", model="gemini-2.0-flash",
                                instruction="Generates a trade recommendation based on the analysis.")
approval_agent = LlmAgent(
    name="ApprovalGateway",
    model="gemini-2.0-flash",
    instruction="Present the trade recommendation to the user and request approval.",
    tools=[approval_tool]
)
execution_agent = LlmAgent(name="TradeExecutor", model="gemini-2.0-flash",
                           instruction="Executes approved trades only. Abort processing if not approved.")

trading_pipeline = SequentialAgent(
    name="TradingPipeline",
    sub_agents=[data_agent, analysis_agent, recommendation_agent,
                approval_agent, execution_agent]
)
```

Pros and Cons Analysis
Advantages
| Item | Content |
|---|---|
| Specialization and Accuracy | Agents with separated roles focus on each domain, delivering higher accuracy than a single agent |
| Modularity | Independent testing is possible at the agent level, and fault isolation is easy. |
| Parallel Scalability | You can dramatically reduce the latency of independent tasks with the Parallel pattern |
| Adaptability | Coordinator patterns flexibly respond to unexpected input types |
| Handling Extra-Large Tasks | Decomposes and processes tasks that exceed a single context window using the Hierarchical pattern |
Disadvantages and Precautions
| Item | Content | Response Plan |
|---|---|---|
| Increased LLM Call Costs | As the number of agents increases, the number of API calls and costs also increase | Deploy lightweight models (Haiku, Flash) to sub-agents |
| Difficulty attributing the cause | Difficult to track which agent the error occurred in | Configure distributed tracing with OpenTelemetry, Langfuse, etc. |
| Sequential Dependency Pitfalls | Introducing multi-agents to tasks with strong sequential dependencies actually degrades performance | Review parallelization potential first, then select a pattern |
| Orchestration Complexity | As the number of agents increases, state management and error handling become more complex | Keep the number of agents to a minimum, and scale incrementally as necessary |
| Human-in-the-Loop Bottleneck | Steps requiring human approval partially sacrifice the speed benefits of automation | Minimize unnecessary intervention by clearly setting thresholds requiring approval |
A2A Protocol: Agent2Agent (A2A) is Google's open standard that enables agents from different vendors to communicate with each other. While Model Context Protocol (MCP) handles agent-tool communication, A2A is the agent-agent communication layer. It is supported by over 150 organizations, including Google, Microsoft, AWS, and Salesforce.
The Most Common Mistakes in Practice
- Attaching a Coordinator to every task: adding a Coordinator to a simple sequential flow increases LLM calls and actually slows things down. Check first whether branching is needed at all.
- Introducing Parallel to tasks with strong sequential dependencies: Google research evaluating 180 agent configurations found that multi-agent collaboration actually degraded performance on tasks with high sequential dependency. Ask first whether the tasks are truly independent.
- Not setting a termination condition in the Generator-Critic loop: if the Critic never returns APPROVED, it becomes an infinite loop. Always set max_iterations, and consider graceful fallback logic for when the maximum iteration count is reached.
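That graceful fallback can be as simple as inspecting the loop's final state and routing unapproved drafts to a human queue instead of shipping them. The function and status labels below are illustrative, not ADK API:

```python
# Illustrative fallback for when the Critic never returns APPROVED.
def finalize_draft(final_draft: str, approved: bool) -> dict:
    """Never ship an unapproved draft silently; flag it for human review."""
    if approved:
        return {"status": "published", "draft": final_draft}
    return {
        "status": "needs_human_review",
        "draft": final_draft,
        "note": "max_iterations reached without APPROVED",
    }
```

The point is that hitting max_iterations becomes a visible, routable outcome rather than an accidental publish.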
In Conclusion
Multi-agent patterns are not tools to eliminate complexity, but tools to structure it. The key is not to stack more patterns, but to select the right pattern that fits the nature of the task. In reality, an incremental approach—starting with the simplest and adding patterns only when bottlenecks arise or quality is lacking—works best.
Things you can start right now:
- After installing ADK with `pip install google-adk`, you can run the `SequentialAgent` example directly from the ADK official multi-agent documentation. Running the three basic execution models by hand gives a completely different feel.
- Identify a flow in your current project that is hard to handle with a single LLM agent, and compare it against the 8-pattern table above to pick the pattern that fits.
- Attach OpenTelemetry or Langfuse to the agent chain and measure per-agent call counts, latency, and cost. Seeing the actual trade-offs created by design decisions is the best basis for choosing the next pattern.
Reference Materials
- Developer's guide to multi-agent patterns in ADK | Google Developers Blog
- Google's Eight Essential Multi-Agent Design Patterns | InfoQ
- Multi-agent systems | ADK Official Documentation
- Choose a design pattern for your agentic AI system | Google Cloud Architecture Center
- Announcing the Agent2Agent Protocol (A2A) | Google Developers Blog
- Google Publishes Scaling Principles for Agentic Architectures | InfoQ
- Towards a science of scaling agent systems | Google Research Blog
- Google ADK vs LangGraph | ZenML Blog
- Four Design Patterns for Event-Driven, Multi-Agent Systems | Confluent