Building Multi-Agent Systems with MCP and A2A — A Practical Integration Guide to Model Context Protocol and Agent-to-Agent Protocol
When you first build an AI agent, you hit a wall: "I've attached the tools, but how do I get agents talking to each other?" I started out writing custom HTTP clients for each agent, defining message formats ad hoc, and inevitably ended up with spaghetti code. The thought I kept having was, "Is this really how everyone does it?" — fortunately, it isn't.
MCP (Model Context Protocol) is an open protocol released by Anthropic in November 2024 that standardizes how agents access external tools and data. A2A (Agent-to-Agent Protocol) is a protocol announced by Google in April 2025 that standardizes how agents built with different frameworks collaborate with each other. As of 2026, the "dual-stack" approach of using these two protocols together has become the de facto standard for production multi-agent architectures.
This article walks through what problems MCP and A2A each solve, and how they combine in layers in real systems — with code. By following the customer support automation and newsroom fact-checking scenarios, you'll be able to spin up an MCP server quickly with FastMCP and design a structure where two agents collaborate in real time via A2A.
Core Concepts
MCP: The USB-C for Agents and Tools
In one sentence, MCP is "a USB-C port for AI agents." Whether it's a file system, a database, or an external API — once you wrap it as an MCP server, any agent can access it through the same interface. At first I thought, "How is this different from a regular REST API?", but the key difference is that agents dynamically query and select the tool list at runtime. You don't need to hardcode tools.
The architecture is a simple client–server model.
Agent (MCP Client)
│
├─ MCP Server A (File System)
├─ MCP Server B (Database)
└─ MCP Server C (External API)

There are three main things an MCP server provides:
- Tools: Functions the agent can invoke (read files, query a DB, etc.)
- Resources: Data the agent can reference (documents, config files, etc.)
- Prompts: Reusable prompt templates
With FastMCP, you can build an MCP server incredibly fast. A single @mcp.tool() decorator turns a Python function directly into an MCP tool.
# pip install fastmcp httpx
from fastmcp import FastMCP
import httpx
mcp = FastMCP("weather-server")
@mcp.tool()
async def get_weather(city: str) -> dict:
"""Returns current weather information for a given city name."""
async with httpx.AsyncClient() as client:
response = await client.get(
"https://api.weather.example.com/current",
params={"q": city, "units": "metric"}
)
data = response.json()
return {
"city": city,
"temperature": data["main"]["temp"],
"description": data["weather"][0]["description"]
}
@mcp.tool()
async def get_forecast(city: str, days: int = 5) -> list[dict]:
"""Returns the weather forecast for a city."""
async with httpx.AsyncClient() as client:
response = await client.get(
"https://api.weather.example.com/forecast",
params={"q": city, "cnt": days, "units": "metric"}
)
return response.json()["list"]
if __name__ == "__main__":
    mcp.run()

FastMCP: A lightweight Python framework for building MCP servers as simply as Flask. Install it with pip install fastmcp and use it immediately via from fastmcp import FastMCP. It serves the same role as from mcp.server.fastmcp import FastMCP in the official Python SDK (pip install mcp), but is better suited to rapid prototyping.
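The same decorator pattern extends to the other two primitives. Here's a minimal sketch, assuming an illustrative resource URI and prompt wording (neither is part of the weather example above):

```python
# A minimal sketch of Resources and Prompts with FastMCP.
# The resource URI and prompt text are illustrative assumptions.
from fastmcp import FastMCP

mcp = FastMCP("weather-server")

@mcp.resource("docs://weather/units")
def unit_conventions() -> str:
    """A reference document the agent can read (a Resource)."""
    return "All temperatures are reported in Celsius; wind speed in m/s."

@mcp.prompt()
def weather_report(city: str) -> str:
    """A reusable prompt template (a Prompt)."""
    return f"Write a short, friendly weather report for {city} using get_weather."
```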
The client side is implemented with the TypeScript SDK like this:
// npm install @modelcontextprotocol/sdk
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";
const transport = new StdioClientTransport({
command: "python",
args: ["weather_server.py"]
});
const client = new Client({ name: "weather-agent", version: "1.0.0" });
await client.connect(transport);
// Dynamically query the available tool list — no hardcoding needed
const tools = await client.listTools();
console.log("Available tools:", tools.tools.map(t => t.name));
// Call a tool
const result = await client.callTool({
name: "get_weather",
arguments: { city: "Seoul" }
});
console.log("Weather info:", result.content);A2A: The Language Between Agents
If MCP defines the relationship between "agents and tools," A2A defines the relationship between "agents and agents." Delegating a task from a Python-built agent to a Java-built agent and receiving results — that's the problem A2A solves.
I also initially thought, "Can't I just call a matching REST API endpoint?" but once you have multiple agents, figuring out "can I delegate this task to that agent?" itself becomes the problem. A2A's Agent Card solves this discovery problem. Each agent advertises what it can do as a JSON document, and other agents read this card to find collaboration targets.
{
"name": "ResearchAgent",
"version": "1.0.0",
"description": "An agent responsible for web search and fact-checking",
"url": "https://research-agent.example.com",
"capabilities": {
"streaming": true,
"pushNotifications": true
},
"skills": [
{
"id": "fact-check",
"name": "Fact Check",
"description": "Verifies the factual accuracy of given information",
"inputModes": ["text"],
"outputModes": ["text", "data"]
},
{
"id": "web-search",
"name": "Web Search",
"description": "Searches the web for relevant information",
"inputModes": ["text"],
"outputModes": ["text", "data"]
}
],
"authentication": {
"schemes": ["bearer"]
}
}

Each A2A agent publishes this card at the /.well-known/agent.json endpoint. Actual communication between agents is done via JSON-RPC 2.0 over HTTPS, and long-running tasks stream progress via SSE.
JSON-RPC 2.0: A lightweight protocol for calling remote functions over HTTP. You send a method name ("method": "tasks/send") and parameters as JSON in the request, and the server responds with the result as JSON. If you've used REST APIs, you can pick this up without much difficulty.
The core A2A method is tasks/send (send a task); in practice, tasks/get (check status) and tasks/cancel (cancel) are also frequently used. For long-running tasks that involve external API calls in particular, periodically polling status with tasks/get is an essential pattern.
# pip install httpx
import httpx
import json
import uuid
from typing import AsyncIterator
async def delegate_task_to_agent(
agent_url: str,
task: str,
context_id: str | None = None
) -> AsyncIterator[dict]:
"""Delegates a task to another agent using the A2A protocol."""
# uuid4() is safer than hash() — hash() can return different values across runs
task_id = str(uuid.uuid4())
payload = {
"jsonrpc": "2.0",
"method": "tasks/send",
"params": {
"id": task_id,
"contextId": context_id,
"message": {
"role": "user",
"parts": [{"kind": "text", "text": task}]
}
},
"id": 1
}
# In production, try-except and timeout configuration are essential
async with httpx.AsyncClient(timeout=60.0) as client:
async with client.stream(
"POST",
f"{agent_url}/a2a",
json=payload,
headers={
"Accept": "text/event-stream",
"Content-Type": "application/json"
}
) as response:
async for line in response.aiter_lines():
if line.startswith("data: "):
event = json.loads(line[6:])
                    yield event

SSE (Server-Sent Events): An HTTP-based technology that enables one-way streaming from server to client. In A2A, it's used to deliver intermediate results of long-running tasks in real time. Unlike WebSockets, it's unidirectional, making implementation simpler, and it works directly on standard HTTP infrastructure.
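For agents that don't support streaming, the tasks/get polling pattern mentioned above looks roughly like this. This is a sketch that assumes the same /a2a endpoint and JSON-RPC conventions as the code above; the exact shape of the result object is an assumption:

```python
# A hedged sketch of status polling with tasks/get, assuming the same
# /a2a endpoint and payload conventions as the tasks/send example above.
import asyncio
import httpx

async def poll_task(agent_url: str, task_id: str, interval: float = 2.0) -> dict:
    """Polls tasks/get until the task reaches a terminal state."""
    payload = {
        "jsonrpc": "2.0",
        "method": "tasks/get",
        "params": {"id": task_id},
        "id": 1
    }
    async with httpx.AsyncClient(timeout=30.0) as client:
        while True:
            response = await client.post(f"{agent_url}/a2a", json=payload)
            task = response.json()["result"]
            # Terminal states of the A2A task lifecycle
            if task["status"]["state"] in ("completed", "failed", "canceled"):
                return task
            await asyncio.sleep(interval)
```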
The Relationship Between the Two Protocols: Vertical vs. Horizontal
Honestly, when I first saw them I thought "How are these different? Aren't they both for connecting agents?" — but it becomes clear when you think of it this way:
| Category | MCP | A2A |
|---|---|---|
| Connection target | Agent ↔ Tools/Data | Agent ↔ Agent |
| Communication direction | Vertical (tool access) | Horizontal (peer-to-peer) |
| Communication model | Client–Server | Peer-to-peer |
| Transport | stdio, HTTP+SSE | JSON-RPC 2.0 over HTTPS, SSE |
| Core purpose | Standardize access to external capabilities | Standardize collaboration between heterogeneous agents |
In real systems, these two protocols combine in layers. Each agent accesses its tools via MCP while delegating tasks to other agents via A2A.
Practical Application
Example 1: Customer Support Agent — Connecting Tools with MCP
Let's say you're building a customer support system. The agent needs to access a CRM and a knowledge base. By wrapping these two systems as MCP servers, you can reuse the servers as-is even if you swap out the agent later.
# crm_mcp_server.py
# pip install fastmcp
from fastmcp import FastMCP
mcp = FastMCP("crm-server")
# In a real project, replace the following with a CRM DB client
# (SQLite, PostgreSQL, or a CRM SaaS API, etc.)
class _MockDB:
async def find_customer(self, customer_id):
return None # Replace with a real implementation
async def vector_search(self, query, limit):
return []
async def create_ticket(self, **kwargs):
return type("Ticket", (), {"id": "TICKET-001"})()
db = _MockDB()
@mcp.tool()
async def get_customer_info(customer_id: str) -> dict:
"""Looks up customer information by customer ID."""
customer = await db.find_customer(customer_id)
if not customer:
return {"error": f"Customer {customer_id} not found"}
return {
"id": customer.id,
"name": customer.name,
"email": customer.email,
"subscription_tier": customer.tier,
"open_tickets": customer.open_ticket_count
}
@mcp.tool()
async def search_knowledge_base(query: str, top_k: int = 3) -> list[dict]:
"""Searches the knowledge base for relevant documents."""
results = await db.vector_search(query, limit=top_k)
return [
{
"title": doc.title,
"content": doc.content[:500],
"relevance_score": doc.score
}
for doc in results
]
@mcp.tool()
async def create_ticket(
customer_id: str,
subject: str,
description: str,
priority: str = "medium"
) -> dict:
"""Creates a new support ticket."""
ticket = await db.create_ticket(
customer_id=customer_id,
subject=subject,
description=description,
priority=priority
)
return {"ticket_id": ticket.id, "status": "created"}
if __name__ == "__main__":
    mcp.run()

Now let's build a customer support agent that uses this MCP server. The core of the agent loop is repeating tool calls until end_turn arrives. When I first implemented this without the loop — just making one-off calls — no tools executed at all, and I spent a long time debugging.
# support_agent.py
# pip install anthropic mcp
import anthropic
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
async def run_support_agent(customer_id: str, user_message: str):
server_params = StdioServerParameters(
command="python",
args=["crm_mcp_server.py"]
)
async with stdio_client(server_params) as (read, write):
async with ClientSession(read, write) as session:
await session.initialize()
# Dynamically query the tool list from the MCP server — no hardcoding needed
tools_result = await session.list_tools()
tools = [
{
"name": tool.name,
"description": tool.description,
"input_schema": tool.inputSchema
}
for tool in tools_result.tools
]
client = anthropic.Anthropic()
messages = [{"role": "user", "content": user_message}]
# Agent loop: repeat tool calls until end_turn
while True:
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
tools=tools,
messages=messages
)
if response.stop_reason == "end_turn":
text_content = next(
block for block in response.content
if block.type == "text"
)
return text_content.text
tool_uses = [
block for block in response.content
if block.type == "tool_use"
]
messages.append({
"role": "assistant",
"content": response.content
})
tool_results = []
for tool_use in tool_uses:
result = await session.call_tool(
tool_use.name,
arguments=tool_use.input
)
tool_results.append({
"type": "tool_result",
"tool_use_id": tool_use.id,
"content": str(result.content)
})
messages.append({
"role": "user",
"content": tool_results
                })

| Code point | Role |
|---|---|
| stdio_client | Launches the MCP server as a subprocess and connects to it |
| session.list_tools() | Dynamically queries the list of tools provided by the server |
| Agent loop | Repeats tool calls and responses until end_turn |
| session.call_tool() | Executes the actual MCP tool |
Example 2: Newsroom Multi-Agent — Agent Collaboration with A2A
Now we use both protocols together. The complexity goes up a level, but understanding this structure reveals the fundamental pattern for production multi-agent systems.
Scenario: while writing an article, the Reporter agent delegates any needed fact-checking to the Researcher agent via A2A. The Researcher queries a news database via MCP and returns the results to the Reporter.
The Researcher agent plays two roles simultaneously: an A2A server (receiving tasks from the Reporter) and an MCP client (accessing the news DB). This dual role is the key to this pattern.
Here is the basic skeleton of the A2A server:
# researcher_agent.py — A2A Server + MCP Client
# pip install fastapi uvicorn anthropic mcp
from fastapi import FastAPI, Request
from fastapi.responses import StreamingResponse
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
import anthropic
import json
app = FastAPI()
AGENT_CARD = {
"name": "ResearchAgent",
"version": "1.0.0",
"description": "Handles news article fact-checking and in-depth research",
"url": "http://localhost:8001",
"capabilities": {"streaming": True, "pushNotifications": False},
"skills": [
{
"id": "fact-check",
"name": "Fact Check",
"description": "Verifies claims against a database",
"inputModes": ["text"],
"outputModes": ["text"]
}
]
}
@app.get("/.well-known/agent.json")
async def get_agent_card():
"""Agent Card endpoint — the entry point for other agents to discover capabilities"""
    return AGENT_CARD

Publishing the Agent Card at /.well-known/agent.json lets other agents automatically discover this agent's capabilities. Here is the core logic that handles actual tasks. When it receives an A2A request, it connects to the MCP server, calls the tools, and streams results via SSE.
@app.post("/a2a")
async def handle_a2a_request(request: Request):
body = await request.json()
task_text = body["params"]["message"]["parts"][0]["text"]
async def generate_response():
# — MCP client role: connect to the news DB MCP server —
server_params = StdioServerParameters(
command="python",
args=["news_db_mcp_server.py"]
)
async with stdio_client(server_params) as (read, write):
async with ClientSession(read, write) as session:
await session.initialize()
tools_result = await session.list_tools()
tools = [
{
"name": t.name,
"description": t.description,
"input_schema": t.inputSchema
}
for t in tools_result.tools
]
client = anthropic.Anthropic()
messages = [{"role": "user", "content": task_text}]
# Stream task start status to the Reporter
yield f"data: {json.dumps({'status': 'working', 'message': 'Fact-check started'})}\n\n"
while True:
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
tools=tools,
messages=messages
)
if response.stop_reason == "end_turn":
result_text = next(
b.text for b in response.content
if b.type == "text"
)
# Stream the final result to the Reporter
yield f"data: {json.dumps({'status': 'completed', 'result': result_text})}\n\n"
break
# — Handle MCP tool calls —
tool_uses = [b for b in response.content if b.type == "tool_use"]
messages.append({"role": "assistant", "content": response.content})
tool_results = []
for tool_use in tool_uses:
yield f"data: {json.dumps({'status': 'working', 'message': f'Querying {tool_use.name}'})}\n\n"
result = await session.call_tool(tool_use.name, arguments=tool_use.input)
tool_results.append({
"type": "tool_result",
"tool_use_id": tool_use.id,
"content": str(result.content)
})
messages.append({"role": "user", "content": tool_results})
    return StreamingResponse(generate_response(), media_type="text/event-stream")

Here is the code for the Reporter agent delegating work to the Researcher via A2A:
# reporter_agent.py
# pip install httpx
import httpx
import json
import uuid
async def write_article_with_fact_check(topic: str) -> str:
"""Delegates fact-checking needed during article writing to the Researcher."""
# 1. Read the Researcher's Agent Card to confirm capabilities
async with httpx.AsyncClient() as client:
card_response = await client.get(
"http://localhost:8001/.well-known/agent.json"
)
agent_card = card_response.json()
print(f"Collaborating agent: {agent_card['name']} — {agent_card['description']}")
# 2. Delegate the fact-check task via A2A
fact_check_request = {
"jsonrpc": "2.0",
"method": "tasks/send",
"params": {
"id": str(uuid.uuid4()), # uuid4 is safe without collisions
"message": {
"role": "user",
"parts": [{
"kind": "text",
"text": f"Please verify the key facts on the following topic: {topic}"
}]
}
},
"id": 1
}
verified_facts = "" # The result received from SSE is a string
async with httpx.AsyncClient(timeout=60.0) as client:
async with client.stream(
"POST",
"http://localhost:8001/a2a",
json=fact_check_request,
headers={"Accept": "text/event-stream"}
) as response:
async for line in response.aiter_lines():
if line.startswith("data: "):
event = json.loads(line[6:])
if event["status"] == "completed":
verified_facts = event["result"]
elif event["status"] == "working":
print(f" In progress: {event['message']}")
# 3. Write the article based on the verified facts
return f"[Article Based on Verified Information]\n\n{verified_facts}"| Code point | Role |
|---|---|
/.well-known/agent.json |
Agent Card endpoint — used for automatic capability discovery |
| SSE streaming | Delivers intermediate status of long-running tasks in real time |
tasks/send JSON-RPC |
Sends tasks using the standard A2A method |
| Researcher's internal MCP | Acts as both an A2A server and an MCP client simultaneously |
Example 3: Full Dual-Stack Architecture Diagram
User request
│
▼
Reporter Agent (MCP Client + A2A Client)
│
├── MCP ──► CRM MCP Server
│ └── CRM Database
│
├── MCP ──► KnowledgeBase MCP Server
│ └── Vector DB
│
└── A2A ──► Researcher Agent (A2A Server + MCP Client)
│
└── MCP ──► NewsDB MCP Server
                                └── Elasticsearch

In this structure, each agent communicates in two directions:
- Vertical direction (MCP): Accesses its own tools and data via a standard interface
- Horizontal direction (A2A): Delegates tasks to other specialized agents
Pros and Cons Analysis
Advantages
| Item | Description | Notes |
|---|---|---|
| Standardization | OpenAI, Microsoft, and Google have all adopted MCP, resolving ecosystem fragmentation | Broad support across major LLM platforms post-2025 |
| Reusability | Once you build an MCP server, any agent can use it in the same way | No duplicate tool implementations |
| Heterogeneous collaboration | Thanks to A2A, Python and Java agents can collaborate regardless of framework | Vendor-lock-free collaboration |
| Async support | A2A's SSE streaming handles long-running tasks naturally | Status polling also supported via tasks/get |
| Open governance | MCP was donated to the Linux Foundation's AAIF in December 2025, and A2A is also operated by the open-source community | Long-term neutrality guaranteed |
| Mature SDKs | TypeScript/Python MCP SDKs already have production-level completeness | Enterprise requirements met with the 2025-11-25 spec |
Disadvantages and Caveats
| Item | Description | Mitigation |
|---|---|---|
| MCP security | Prompt injection attack vectors are being researched where malicious MCP servers inject hidden instructions into system prompts to manipulate agent behavior | Apply principle of least privilege, strengthen input validation, connect only trusted MCP servers |
| Immature A2A ecosystem | Released in April 2025, so production references are still limited | Leverage supporting frameworks such as CrewAI v1.10 and Microsoft Agent Framework 1.0 |
| Debugging difficulty | Systems with overlapping MCP and A2A are extremely difficult to debug without distributed tracing | Mandatory adoption of OpenTelemetry-based distributed tracing |
| Server implementation variance | MCP server implementation quality varies widely | Use the official SDK, choose community-vetted servers |
| Vendor lock-in risk | Relying on a specific cloud SDK (Google ADK, Vertex AI, etc.) can create lock-in | Design with the protocol layer and cloud layer separated |
Principle of Least Privilege: A security principle that designs each component of a system to have only the minimum permissions necessary to perform its task. MCP servers should expose only the tools that agents actually need.
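On the client side, that principle can be as simple as an explicit allowlist. Here's a minimal sketch against the support-agent code above; the allowlist contents are an assumption for illustration:

```python
# A minimal least-privilege sketch: only expose allowlisted MCP tools
# to the model. The allowlist contents are an illustrative assumption.
ALLOWED_TOOLS = {"get_customer_info", "search_knowledge_base"}  # deliberately no create_ticket

async def list_allowed_tools(session) -> list[dict]:
    """Filters the dynamically discovered tool list down to the allowlist."""
    tools_result = await session.list_tools()
    return [
        {
            "name": t.name,
            "description": t.description,
            "input_schema": t.inputSchema
        }
        for t in tools_result.tools
        if t.name in ALLOWED_TOOLS  # drop anything not explicitly granted
    ]
```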
Distributed Tracing: A technique for tracking the flow of requests across multiple components. In an MCP+A2A system, you need to be able to see the full flow of which agent called which tool and when. Attaching OpenTelemetry from the start can save a lot of pain later.
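As a concrete starting point, wrapping each MCP tool call in a span is already enough to see per-tool latency. Here's a sketch using the OpenTelemetry Python API; the span and attribute names are my own convention, not a standard:

```python
# A sketch of tracing MCP tool calls with OpenTelemetry. Exporter setup
# is omitted; span and attribute names are an illustrative convention.
from opentelemetry import trace

tracer = trace.get_tracer("support-agent")

async def traced_call_tool(session, name: str, arguments: dict):
    """Wraps session.call_tool() in a span so each tool call shows up in traces."""
    with tracer.start_as_current_span("mcp.tool_call") as span:
        span.set_attribute("mcp.tool.name", name)
        result = await session.call_tool(name, arguments=arguments)
        span.set_attribute("mcp.tool.is_error", bool(getattr(result, "isError", False)))
        return result
```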
The Most Common Mistakes in Practice
- Introducing MCP and A2A simultaneously from the start. When I tried to apply both at once, I couldn't track where things were breaking. It's far smoother to first stabilize tool integration for a single agent with MCP, then expand to A2A.
- Managing Agent Cards statically. When an agent's capabilities change, the Agent Card must be updated along with it. Generating it dynamically at runtime instead of hardcoding is much more maintainable (see the first sketch after this list).
- Not accounting for error propagation. If you don't design upfront how the calling agent handles a failed A2A-delegated task, errors may be silently swallowed or the entire system may become blocked. Define timeout and fallback strategies in advance (see the second sketch after this list).
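For the Agent Card problem, one option is deriving the card from whatever skills the agent actually registers at startup. Here's a minimal sketch building on the FastAPI researcher above; the SKILLS registry is an assumption for illustration:

```python
# A hedged sketch of a runtime-generated Agent Card. The SKILLS registry
# is an illustrative assumption; in researcher_agent.py this would reuse
# the existing FastAPI app instead of creating a new one.
from fastapi import FastAPI

app = FastAPI()

SKILLS = {
    "fact-check": {
        "name": "Fact Check",
        "description": "Verifies claims against a database",
        "inputModes": ["text"],
        "outputModes": ["text"]
    }
}

def build_agent_card() -> dict:
    """Derives the card from registered skills instead of hardcoding the JSON."""
    return {
        "name": "ResearchAgent",
        "version": "1.0.0",
        "url": "http://localhost:8001",
        "capabilities": {"streaming": True, "pushNotifications": False},
        "skills": [{"id": skill_id, **spec} for skill_id, spec in SKILLS.items()]
    }

@app.get("/.well-known/agent.json")
async def get_agent_card():
    return build_agent_card()
```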
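And for error propagation, a timeout plus an explicit fallback keeps failures visible to the caller. Here's a sketch reusing delegate_task_to_agent from earlier; the fallback message is an assumption, not part of the A2A spec:

```python
# A hedged sketch of timeout + fallback around A2A delegation, reusing
# delegate_task_to_agent from earlier. The fallback text is an assumption.
import asyncio
import httpx

async def delegate_with_fallback(agent_url: str, task: str, timeout: float = 30.0) -> str:
    """Returns the delegated result, or an explicit marker instead of failing silently."""
    try:
        async with asyncio.timeout(timeout):  # Python 3.11+
            async for event in delegate_task_to_agent(agent_url, task):
                if event.get("status") == "completed":
                    return event["result"]
    except (TimeoutError, httpx.HTTPError):
        pass  # fall through to the explicit fallback below
    # Surface the failure to the caller instead of swallowing it
    return "[Fact-check unavailable: flag for human review]"
```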
Closing Thoughts
MCP standardizes the agent's arms, and A2A standardizes the agent's language — only when these two protocols are combined does a truly multi-agent system come together.
Here are 3 steps you can take right now:
- Start by building an MCP server. Run pip install fastmcp, then wrap an API or database you use frequently with a single @mcp.tool() decorator. The example servers on the official GitHub are a great starting point.
- Connect that MCP server from Claude or another LLM agent. Configure a client with @modelcontextprotocol/sdk (TypeScript) or the mcp PyPI package, and implement the loop where the agent actually calls tools yourself — this gives you a hands-on feel for how MCP works.
- Run the sample from the official A2A repository. GitHub a2aproject/A2A has Python and TypeScript reference implementations, and the community-contributed tutorial a2a-mcp-tutorial has hands-on examples using both protocols together.
Next article: We'll be exploring architecture patterns for applying MCP+A2A multi-agent systems to large-scale real-time processing using Apache Kafka as an event broker.
References
Official Specifications
- Model Context Protocol Official Specification (2025-11-25)
- Agent2Agent Protocol Official Specification
- The Relationship Between A2A and MCP (Official Docs)
Official Repositories
- modelcontextprotocol/python-sdk and modelcontextprotocol/typescript-sdk — Official MCP SDKs
- jlowin/fastmcp — FastMCP framework
- a2aproject/A2A — A2A specification and reference implementations
Blogs & Case Studies
- Google Developers Blog — A2A Announcement
- GitHub — A2A+MCP Hands-on Tutorial (Community Contribution)
- Auth0 — MCP vs A2A Comparison Guide
- Elastic — A2A+MCP Newsroom Agent Case Study
- Kai Waehner — Apache Kafka + A2A + MCP Architecture
- Infinitus — MCP and A2A in Healthcare
- IBM — Agent2Agent Protocol Overview
- DEV Community — Complete Guide to MCP 2026