Model Context Protocol (MCP) standardizes how AI agents discover and call external tools. Unlike OpenAI function calling or LangChain tool wrappers, MCP defines a wire protocol (JSON-RPC 2.0 messages) with explicit capability negotiation, transport abstraction, and a three-primitive model: resources (read-only data), tools (callable functions), and prompts (reusable templates).
A 30-minute TypeScript implementation reveals where the protocol draws boundaries. The server owns schema validation and execution. The client owns orchestration, retry logic, and conversation state. The transport layer is pluggable but defaults to stdio, which shapes deployment patterns in ways that matter for production agents.
Protocol Primitives and Capability Discovery
MCP servers expose three types of capabilities:
- Resources: Read-only data the model can fetch (files, database rows, API responses). The server advertises URIs and the client requests them by reference.
- Tools: Functions the model can invoke with structured arguments. The server defines input schemas using JSON Schema and returns structured output.
- Prompts: Reusable prompt templates the client surfaces to users. Optional for most servers.
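Capability discovery happens in two steps. During the handshake, the client sends an initialize request and the server responds with its protocol version and the capability categories it supports (tools, resources, prompts). The client then calls tools/list, resources/list, and prompts/list to fetch the actual definitions, caches them, and uses them to guide the model's tool selection.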
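The handshake can be sketched as the JSON-RPC messages a client would build, plus a small cache for what the server advertises. This is a dependency-free illustration, not the SDK's client: `makeRequest`, `ToolCatalog`, the client name, and the chosen protocol revision string are assumptions made for the example.

```typescript
// JSON-RPC 2.0 request envelope, the message shape MCP uses on the wire.
type JsonRpcRequest = {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
};

type ToolInfo = { name: string; description?: string; inputSchema: object };

let nextId = 1;

// Hypothetical helper: builds a request with a fresh id.
function makeRequest(method: string, params?: Record<string, unknown>): JsonRpcRequest {
  const req: JsonRpcRequest = { jsonrpc: "2.0", id: nextId++, method };
  if (params !== undefined) req.params = params;
  return req;
}

// Step 1: the client announces itself and the protocol revision it speaks.
const initialize = makeRequest("initialize", {
  protocolVersion: "2024-11-05", // example revision; use the one your client targets
  capabilities: {},
  clientInfo: { name: "example-client", version: "0.1.0" },
});

// Step 2: once the handshake completes, the client asks for tool definitions.
const listTools = makeRequest("tools/list");

// Hypothetical client-side cache of what the server advertised.
class ToolCatalog {
  private tools = new Map<string, ToolInfo>();

  update(list: ToolInfo[]): void {
    this.tools = new Map(list.map((t) => [t.name, t]));
  }

  has(name: string): boolean {
    return this.tools.has(name);
  }

  names(): string[] {
    return [...this.tools.keys()];
  }
}
```

A client would call `update` with the `tools` array from the tools/list response and consult `has` before forwarding a model-proposed call to the server.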
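This differs from OpenAI function calling, where you pass the full function schema with every API request. MCP separates discovery from invocation: the client fetches schemas once per session instead of exchanging them with every call, but it has to maintain state about what the server can do. The client still decides which tool definitions to place in the model's context on each turn, so any token savings depend on the client rather than on the protocol.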
Transport Layer and Deployment Shape
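MCP deployments typically use one of three transports. The specification itself defines stdio and an HTTP-based transport (HTTP with SSE in the original revision, Streamable HTTP in later ones); WebSocket is not a standard transport and has to be added as a custom one:
| Transport | Use Case | Connection Model | Deployment Pattern |
|---|---|---|---|
| stdio | Local tools, CLI agents | Parent process spawns server | Bundled with client, no network |
| HTTP/SSE | Remote tools, web agents | Client POSTs requests, server streams over SSE | Separate service, auth required |
| WebSocket (custom) | Bidirectional, low-latency | Persistent connection | Stateful server, connection pooling |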
The default is stdio. The client spawns the server as a subprocess, communicates over stdin/stdout, and kills the process when done. This works for Claude Desktop and local Cursor integrations but breaks down for multi-tenant agents or serverless deployments.
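On the stdio transport, messages travel as newline-delimited JSON, one message per line. A minimal sketch of that framing, independent of the SDK, might look like the following; `LineFramer` and `frame` are hypothetical names used only for illustration.

```typescript
// Splits an incoming character stream into newline-delimited JSON messages.
class LineFramer {
  private buffer = "";

  // Feed a chunk as read from stdin; returns every complete message it closes.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const messages: unknown[] = [];
    let newline = this.buffer.indexOf("\n");
    while (newline !== -1) {
      const line = this.buffer.slice(0, newline).trim();
      this.buffer = this.buffer.slice(newline + 1);
      if (line.length > 0) messages.push(JSON.parse(line));
      newline = this.buffer.indexOf("\n");
    }
    return messages;
  }
}

// Serializes one message for stdout. JSON.stringify without an indent argument
// escapes newlines inside strings, so each message stays on a single line.
function frame(message: unknown): string {
  return JSON.stringify(message) + "\n";
}
```

Because stdout carries the protocol stream, a stdio server should send diagnostics to stderr; a stray `console.log` would corrupt the framing.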
HTTP with Server-Sent Events (SSE) is the next step. The server runs as a persistent service. The client opens an SSE stream for server-initiated messages (like progress updates) and sends tool calls via POST. You need to handle authentication, rate limiting, and connection lifecycle yourself.
WebSocket gives you bidirectional streaming but requires the server to track connection state. If the client disconnects mid-tool-call, you need a strategy for cleanup or resumption.
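One possible cleanup strategy for disconnects is to track in-flight calls per connection and abort them when the socket closes. The sketch below assumes tool implementations accept an `AbortSignal`; `InFlightCalls` is a hypothetical helper, not part of any MCP SDK.

```typescript
// Tracks tool calls that are still running on one connection.
class InFlightCalls {
  private calls = new Map<number, AbortController>();

  // Register a call; the tool implementation should observe the returned signal.
  start(requestId: number): AbortSignal {
    const controller = new AbortController();
    this.calls.set(requestId, controller);
    return controller.signal;
  }

  // Call when the tool returns normally.
  finish(requestId: number): void {
    this.calls.delete(requestId);
  }

  // Call from the connection's close handler; returns how many calls were aborted.
  abortAll(): number {
    const count = this.calls.size;
    for (const controller of this.calls.values()) controller.abort();
    this.calls.clear();
    return count;
  }
}
```

Aborting is the simplest policy. Resumption would additionally require persisting results under the request id so that a reconnecting client can collect them.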
Tool Registration and Schema Enforcement
Here is a minimal tool definition in TypeScript:
    import { Server } from "@modelcontextprotocol/sdk/server/index.js";
    import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
    // setRequestHandler expects a request schema object, not a method-name string.
    import {
      CallToolRequestSchema,
      ListToolsRequestSchema,
    } from "@modelcontextprotocol/sdk/types.js";
    import { z } from "zod";

    const server = new Server(
      { name: "example-server", version: "1.0.0" },
      { capabilities: { tools: {} } }
    );

    // Runtime validator; keep it in sync with the JSON Schema advertised below.
    const QuerySchema = z.object({
      sql: z.string().describe("SQL query to execute"),
      limit: z.number().optional().default(100),
    });

    // Hypothetical placeholder: replace with a real read-only database call.
    async function runQuery(sql: string, limit: number): Promise<unknown[]> {
      return [];
    }

    // Advertise the tool and its input schema to clients.
    server.setRequestHandler(ListToolsRequestSchema, async () => ({
      tools: [
        {
          name: "query_db",
          description: "Execute a read-only SQL query",
          inputSchema: {
            type: "object",
            properties: {
              sql: { type: "string", description: "SQL query to execute" },
              limit: { type: "number", description: "Max rows to return" },
            },
            required: ["sql"],
          },
        },
      ],
    }));

    // Validate arguments at runtime, execute, and return the result as text.
    server.setRequestHandler(CallToolRequestSchema, async (request) => {
      if (request.params.name === "query_db") {
        const args = QuerySchema.parse(request.params.arguments);
        const rows = await runQuery(args.sql, args.limit);
        return {
          content: [{ type: "text", text: JSON.stringify(rows) }],
        };
      }
      throw new Error(`Unknown tool: ${request.params.name}`);
    });

    const transport = new StdioServerTransport();
    await server.connect(transport);
The server defines the schema twice: once in the tools/list response (JSON Schema for the client) and once in the Zod validator (runtime enforcement). Nothing keeps the two in sync, so a change to one must be mirrored in the other. The protocol does not enforce schema validation. If you skip the Zod parse, the model can send malformed arguments and the handler will fail in less predictable ways.
Do not rely on the client to validate inputs before sending them; a client may simply forward whatever the model produced. Your server must treat every tool call as untrusted input.
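Treating arguments as untrusted also means deciding what happens when validation fails. One option, sketched here without Zod so it runs standalone, is to catch the failure and return a tool result flagged with `isError` so the model can read the message and correct itself. `parseQueryArgs` and `handleQueryCall` are hypothetical names, and the synchronous `run` callback stands in for a real query.

```typescript
type ToolResult = {
  content: { type: "text"; text: string }[];
  isError?: boolean;
};

type QueryArgs = { sql: string; limit: number };

// Hand-written check mirroring the query_db schema; throws on bad input.
function parseQueryArgs(raw: unknown): QueryArgs {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("arguments must be an object");
  }
  const { sql, limit = 100 } = raw as { sql?: unknown; limit?: unknown };
  if (typeof sql !== "string" || sql.trim() === "") {
    throw new Error("sql must be a non-empty string");
  }
  if (typeof limit !== "number" || !Number.isInteger(limit) || limit < 1) {
    throw new Error("limit must be a positive integer");
  }
  return { sql, limit };
}

// Turns any thrown error into a result the model can read instead of a crash.
function handleQueryCall(raw: unknown, run: (args: QueryArgs) => unknown[]): ToolResult {
  try {
    const args = parseQueryArgs(raw);
    return { content: [{ type: "text", text: JSON.stringify(run(args)) }] };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return {
      content: [{ type: "text", text: `Tool call failed: ${message}` }],
      isError: true,
    };
  }
}
```

Returning an error result keeps the server process alive and gives the client a well-formed response to log or retry against.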
Resource Handling and State Management
Resources are read-only but not necessarily static. A resource URI like db://users/active can return different data on each fetch. The server decides whether to cache results or recompute them.
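The protocol does not define HTTP-style cache headers. A server that supports resource subscriptions can notify clients when a resource changes (notifications/resources/updated), but that capability is optional. Without it, a client that wants to cache resource data has to implement its own TTL or versioning strategy. The server can include metadata in the response (like a timestamp or ETag) but the client is not required to use it.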
This is a deliberate trade-off. MCP keeps the protocol simple and pushes caching logic to the client or an intermediate proxy. For high-frequency resource access, you will want a caching layer between the agent and the MCP server.
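A client-side TTL cache of the kind described above could look roughly like this. It is a sketch under simplifying assumptions: `ResourceCache` is a hypothetical helper, the fetch callback is synchronous (with `T` set to a Promise type the same class would cache in-flight reads), and the clock is injectable so the behavior can be checked deterministically.

```typescript
type Entry<T> = { value: T; expiresAt: number };

// Caches resource reads by URI for a fixed time-to-live.
class ResourceCache<T> {
  private entries = new Map<string, Entry<T>>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or calls fetch() when the entry is missing or stale.
  get(uri: string, fetch: () => T): T {
    const hit = this.entries.get(uri);
    if (hit !== undefined && hit.expiresAt > this.now()) return hit.value;
    const value = fetch();
    this.entries.set(uri, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }

  // Drop an entry early, for example after a change notification.
  invalidate(uri: string): void {
    this.entries.delete(uri);
  }
}
```

For a URI such as db://users/active, the right TTL depends on how stale the agent's view is allowed to be; there is no protocol-level default.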
Security Boundaries and Rate Limiting
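MCP does not include rate limiting in the protocol, and it leaves most of the security model to the deployment. If you run an MCP server over HTTP, you need to add:
- Authentication: API keys, OAuth tokens, or mTLS. Early revisions of the specification did not specify a mechanism; later revisions describe an optional OAuth-based authorization flow for HTTP transports.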
- Authorization: Tool-level or resource-level access control. The server must check permissions before executing.
- Rate limiting: Per-client or per-tool quotas. The protocol does not define error codes for rate limit violations, so you return a generic error and the client decides whether to retry.
For stdio transports, security is simpler. The server runs in the same security context as the client. If the client can read a file, the server can too. This works for local tools but not for multi-user systems.
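The per-client quota mentioned in the list above is often implemented as a token bucket. The following is a minimal in-memory sketch, not a production limiter: the capacity, the refill rate, the `allowCall` helper, and the choice of error code -32000 (taken from the range JSON-RPC 2.0 reserves for implementation-defined server errors) are all assumptions for illustration.

```typescript
// Classic token bucket: holds up to `capacity` tokens, refilled continuously.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;
  private capacity: number;
  private refillPerSec: number;

  constructor(capacity: number, refillPerSec: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity;
    this.lastRefill = nowMs;
  }

  // Consumes one token and returns true if a call is allowed at time nowMs.
  tryTake(nowMs: number): boolean {
    const elapsedSec = (nowMs - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefill = nowMs;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

const buckets = new Map<string, TokenBucket>();

// Hypothetical gate: 5-call burst, then 1 call per second, per client id.
function allowCall(clientId: string, nowMs: number = Date.now()): boolean {
  let bucket = buckets.get(clientId);
  if (bucket === undefined) {
    bucket = new TokenBucket(5, 1, nowMs);
    buckets.set(clientId, bucket);
  }
  return bucket.tryTake(nowMs);
}

// Generic JSON-RPC error a server might return when the gate rejects a call.
function rateLimitError(id: number) {
  return { jsonrpc: "2.0", id, error: { code: -32000, message: "Rate limit exceeded" } };
}
```

In a multi-instance deployment the bucket state would need to live in shared storage; the in-process `Map` only works for a single server process.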
Observability and Failure Modes
MCP servers are largely black boxes to the client. The protocol defines error responses and an optional channel for server log messages, but no telemetry. If a tool call fails, the server returns an error message and the client logs it. You do not get stack traces, execution time, or resource usage unless you instrument the server yourself.
Common failure modes:
- Schema mismatch: The model sends arguments that do not match the schema. The server crashes or returns a validation error. The client retries or gives up.
- Timeout: The tool takes too long. The client kills the stdio process or closes the HTTP connection. The protocol has a cancellation notification, but unless the server honors it, the tool may continue running and waste resources.
- Partial failure: The tool succeeds but returns incomplete data. The protocol does not define a way to signal partial success, so the server returns what it has and the client decides if it is enough.
For production agents, you need external observability. Log tool calls to a structured log sink. Emit metrics for latency, error rate, and response size on the server, and track token usage on the client, since the server never sees the model's context. Use distributed tracing if the server calls other services.
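Server-side instrumentation can start as a wrapper around each tool handler that emits one structured record per call. This is a sketch under assumptions: `instrument`, `CallLog`, and the sink callback are hypothetical, and a real deployment would forward records to whatever log pipeline it already uses.

```typescript
type CallLog = {
  tool: string;
  durationMs: number;
  ok: boolean;
  error?: string;
};

// Wraps a tool handler so every call reports its latency and outcome to `sink`.
function instrument<A, R>(
  tool: string,
  handler: (args: A) => Promise<R>,
  sink: (entry: CallLog) => void,
  now: () => number = Date.now,
): (args: A) => Promise<R> {
  return async (args: A) => {
    const start = now();
    let result: R;
    try {
      result = await handler(args);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      sink({ tool, durationMs: now() - start, ok: false, error: message });
      throw err; // the caller still sees the original failure
    }
    sink({ tool, durationMs: now() - start, ok: true });
    return result;
  };
}

// Example sink: one JSON line per call on stderr, which keeps stdout free
// for protocol traffic on the stdio transport.
const stderrSink = (entry: CallLog): void => {
  console.error(JSON.stringify(entry));
};
```

A handler would then be registered as `instrument("query_db", handler, stderrSink)` in place of the bare handler.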
Comparison with Existing Tool Patterns
| Aspect | MCP | OpenAI Function Calling | LangChain Tools |
|---|---|---|---|
| Schema definition | JSON Schema, advertised once | JSON Schema, sent per request | Python type hints or Pydantic |
| Transport | stdio, HTTP; WebSocket as a custom transport | HTTP only | In-process or HTTP |
| Discovery | Explicit handshake | Implicit in request | Registry or decorator |
| Validation | Server-side, left to the implementer | Application code validates the model's output | Framework-enforced |
| State management | Client owns conversation state | Stateless per request | Framework tracks state |
MCP is closer to a microservice protocol than a library. You write a standalone server, deploy it separately, and connect clients via a standard interface. LangChain tools are in-process and tightly coupled to the orchestration framework. OpenAI function calling is stateless and requires the client to manage conversation history.
Technical Verdict
Use MCP when:
- You need to expose tools to multiple agent frameworks (Claude Desktop, Cursor, custom clients).
- You want to deploy tools as independent services with separate scaling and security boundaries.
- You are building a tool marketplace or plugin ecosystem where third parties contribute servers.
Avoid MCP when:
- You are building a single-agent system with tight coupling between orchestration and tools. In-process function calls are simpler.
- You need sub-100ms tool latency and cannot tolerate the overhead of a separate process or network hop.
- Your tools require complex stateful interactions that do not map cleanly to request-response patterns.
The 30-minute setup time is real but misleading. You can scaffold a server quickly, but production-ready MCP infrastructure requires authentication, rate limiting, observability, and a deployment strategy that matches your transport choice. The protocol gives you standardization at the cost of operational complexity.