In 2024, AI agent tool usage was a jungle. Every framework had its own way of defining, invoking, and securing tools. Interoperability was non-existent. Your LangChain agent couldn’t use your AutoGen tools — and why would it? There was no standard.
Then came the Model Context Protocol (MCP) — Anthropic’s open proposal for a universal way to expose tools and context to AI agents. In 2025, it gained adoption. In 2026, it’s becoming the default layer between AI agents and their world.
We run AI infra at scale. We’ve built and deployed over 200 agent pipelines using LangGraph, CrewAI, and custom orchestrators. For the past six months, every new agent we deploy uses MCP to access tools. Not because it’s trendy — because it solves real production problems.
This post breaks down MCP: what it is, why it matters, how to implement it, and where it fits in a production architecture. No fluff. Just what works.
What is the Model Context Protocol (MCP)?
MCP is a JSON-RPC 2.0-based protocol that defines how AI agents discover, request, and use external tools and context. It standardizes the contract between an AI model (or orchestration framework) and backend services — databases, APIs, internal tools, files — anything an agent needs.
Think of it as gRPC for AI agents, but focused solely on context delivery. Instead of building API wrappers for every tool in every framework, you expose an MCP server. Any MCP-compliant client can then access those tools.
The core components:
- MCP Server: A long-running service that exposes tools via the MCP interface. It implements endpoints like
list_resources,get_resource_data, andcall_tool. - MCP Client: A library (e.g., in LangChain) that speaks MCP to query servers and invoke tools.
- MCP Host: The runtime environment (e.g., a LangChain Executor) that manages client connections and routes requests.
This decoupling is powerful. Your agent framework no longer needs to know how a tool works — only how to speak MCP.
Why MCP Beats Ad-Hoc Tool Calling
Before MCP, every agent framework handled tools differently:
- LangChain: Python functions with docstrings
- AutoGen:
@register_functiondecorators - CrewAI: Tools as separate Python classes
This created tool silos. You couldn’t reuse a CrewAI tool in a LangChain agent. Every framework reinvented auth, rate limiting, logging.
MCP changes that by providing a unified interface. Here’s what it gives you out of the box:
-
Cross-Framework Compatibility Your MCP server works with LangChain, LiteLLM, and any future framework that supports MCP. No rewrites.
-
Built-in Discovery Clients can query
list_resourcesto see what’s available. This enables dynamic agent behavior — e.g., an agent can discover a new CRM tool was added and start using it immediately. -
Secure Context Delivery MCP supports OAuth2, API keys, and JWT-based authentication. You control precisely which agents access which tools and under what conditions.
-
Structured Logging & Observability Every MCP request is a standardized JSON-RPC call. This makes logging, monitoring, and auditing trivial compared to scattered function calls.
-
Performance & Latency Control MCP clients support streaming, batched requests, and caching. You can tune data delivery for latency or cost.
We’ve seen up to a 40% reduction in integration time when adopting MCP for new tools. The initial setup cost is real, but the long-term payoff in maintainability is undeniable.
Building Your First MCP Server (Python)
Let’s build a simple MCP server in Python using the official mcp SDK. This server will expose three tools:
fetch_weather— Get weather data by city (REST API call)get_user_profile— Retrieve a user’s data (mocked)send_notification— Trigger a push notification
First, install the SDK:
pip install model-context-protocolNow, the server code:
from mcp import MCPServer, Tool, Parameter
import requests
import json
# In-memory mock data
USERS = {
"u123": {"name": "Alice", "email": "[email protected]", "city": "Stockholm"}
}
# Define tools
fetch_weather = Tool(
name="fetch_weather",
description="Get current weather data for a city",
parameters=[
Parameter(
name="city",
type="string",
description="City name (e.g., Stockholm, London)",
required=True
)
],
function=lambda city: {
"temperature": 18,
"condition": "Partly cloudy",
"humidity": 65
}
)
get_user_profile = Tool(
name="get_user_profile",
description="Retrieve a user's profile information",
parameters=[
Parameter(
name="user_id",
type="string",
description="Unique identifier for the user",
required=True
)
],
function=lambda user_id: USERS.get(user_id, {"error": "User not found"})
)
send_notification = Tool(
name="send_notification",
description="Send a push notification to a user",
parameters=[
Parameter(
name="user_id",
type="string",
required=True
),
Parameter(
name="title",
type="string",
required=True
),
Parameter(
name="body",
type="string",
required=True
)
],
function=lambda user_id, title, body: {
"status": "sent",
"message_id": "notif_987"
}
)
# Create and start server
server = MCPServer(
server_name="The Agent Fabric MCP Hub",
version="1.0.0"
)
server.add_tool(fetch_weather)
server.add_tool(get_user_profile)
server.add_tool(send_notification)
if __name__ == "__main__":
server.run(host="0.0.0.0", port=8080)Start it with:
python mcp_server.pyYour server is now live at http://localhost:8080. It automatically exposes the MCP specification at /spec.
Connecting from a LangChain Agent
Now, let’s connect a LangChain agent to this MCP server. Install required packages:
pip install langchain langchain-communityHere’s the client code:
from langchain_community.tools import MCPClientWrapper
from langchain_core.agents import AgentAction
from langchain_core.language_models import LanguageModelInput
from langchain_core.messages import HumanMessage
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableLambda
from langchain_openai import ChatOpenAI
# Wrap the MCP server
mcp_tool = MCPClientWrapper(
url="http://localhost:8080/spec", # MCP spec endpoint
headers={"Authorization": "Bearer sk-abc123"} # Optional auth
)
# Define a simple agent loop
llm = ChatOpenAI(model="gpt-4o", temperature=0)
prompt = PromptTemplate.from_template(
"You are a helpful assistant. Use the provided tools only when necessary.\n\nQuestion: {input}\n\nAvailable tools: {tool_names}\n\nUse the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, must be one of: {tool_names}\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can repeat)
Thought: you should think about the final answer\nFinal Answer: your final answer to the original input question"
)
def agent_step(state: dict) -> dict:
messages = state["messages"]
# Format prompt with available tools
formatted = prompt.format(
input=messages[-1].content,
tool_names=", ".join([t.name for t in mcp_tool.get_available_tools()])
)
response = llm.invoke(formatted)
# Parse LLM output to detect tool use
if "Action:" in response.content:
action_line = response.content.split("Action:")[1].split("\n")[0].strip()
if action_line in [t.name for t in mcp_tool.get_available_tools()]:
# Extract action input
input_line = response.content.split("Action Input:")[1].split("\n")[0].strip()
tool_response = mcp_tool._run(action_line, input_line)
messages.append(response)
messages.append(HumanMessage(content=f"Observation: {tool_response}"))
else:
messages.append(response)
else:
messages.append(response)
return {"messages": messages}
# Example usage
initial_state = {"messages": [HumanMessage(content="What's the weather like in Stockholm?")]}
result = agent_step(initial_state)
# The agent should now use fetch_weather
print(result["messages"][-1].content)The agent will invoke fetch_weather through your MCP server. No direct integration needed — the MCP layer handles everything.
Production Considerations for MCP
MCP is powerful, but running it in production requires care. Here’s what we do:
1. Auth and Scope Management
We use OAuth2 with short-lived tokens. Each agent service account gets scoped access — e.g., a customer support agent can read user profiles but not send payments. We’ve seen security breaches from over-permissive tools; MCP makes it easier to apply least privilege.
2. Rate Limiting and Circuit Breaking
Our MCP servers implement Redis-based rate limiting. We also use Resilience4j patterns: if an external API fails repeatedly, the MCP server breaks the circuit and returns a cached response or error — preventing cascading failures in the agent fleet.
3. Logging and Observability
Every MCP request logs:
client_idtool_nameduration_msinput_tokens,output_tokens(when applicable)status(success/error)
This flows into our Langfuse setup. We can trace an entire agent decision back to specific tool calls, down to latency and cost.
4. Versioning and Backward Compatibility
We version our MCP servers (e.g., v1, v2). Breaking changes go to a new URL. Clients can specify which version they support. We keep old versions running for 60 days to allow for rollout.
MCP and the Broader Ecosystem
MCP isn’t just for tool access. It’s becoming the universal context layer for AI systems:
- LiteLLM: Now supports MCP clients — you can route MCP calls through your existing LiteLLM gateway, applying rate limits, model fallbacks, and cost tracking.
- LangChain: Deep MCP integration allows LangChain apps to connect to any MCP server without code changes.
- Custom Orchestrators: Teams building bespoke agent frameworks are adopting MCP as the internal wire protocol.
We run our own AI stack on Hetzner Cloud, and our live agents are built with OpenClaw. We’ve integrated MCP at the gateway level — all external tool access flows through our MCP server. This gives us a single point for security, logging, and policy enforcement.
FAQ: Model Context Protocol
What’s the difference between MCP and OpenAI’s function calling?
OpenAI’s function calling is model-specific and JSON Schema-based. It’s great for simple, model-driven tool use. MCP is framework- and model-agnostic, designed for complex, multi-agent systems with shared tool ecosystems.
Can I use MCP with non-Python agents?
Yes. MCP is language-agnostic. There are SDKs for JavaScript, Go, and Rust. The protocol is JSON-RPC over HTTP, so any language can implement a client or server.
Is MCP production-ready?
Yes, but with caveats. Major platforms like LangChain and LiteLLM support it. Production readiness depends on your implementation — robust auth, observability, and error handling are essential. Don’t ship an MCP server without rate limiting.
How does MCP handle large context or file uploads?
MCP supports file streaming via multipart form data or signed URLs. For large payloads, we recommend using references — e.g., "file_id": "doc_123" — and letting the client fetch the actual content separately.
Where can I see real-world MCP examples?
The official MCP GitHub repo has examples. We’ll be publishing our production MCP server templates on The Agent Fabric soon.
Building AI agents in production? I cover real architectures, mistakes, and wins in the newsletter — subscribe here.