How to Make LLMs Directly Call Your Internal REST APIs: TypeScript MCP Server Implementation and the Gateway Pattern
Have you ever tried to introduce an AI agent to your team, only to get stuck on the question "so how do we connect our internal APIs?" I started out trying to paste entire internal API docs into the LLM, or putting raw fetch code directly into the prompt. It works—exactly once. From the second request onward, the LLM starts inventing endpoint URLs, forgetting authentication headers, or sending parameters that don't exist.
Wrapping with an MCP (Model Context Protocol) server is an architectural pattern that lets LLMs call internal systems in a structured way without touching existing API logic. Instead of cramming API specs into a prompt, the LLM calls functions directly through a standardized Tool interface. By the end of this article, you'll be able to write TypeScript code that exposes existing REST APIs as MCP Tools. This guide assumes a basic Node.js + TypeScript environment is already set up.
With Claude Code, Cursor, and GitHub Copilot all operating as MCP clients today, exposing internal systems via MCP is also an exercise in building AI productivity infrastructure for your entire team.
Core Concepts
What Is MCP Wrapping?
The idea itself is simple. You add a thin JSON-RPC-based interface layer in front of an existing REST API. Through this layer, the LLM can clearly identify "which functions it can call" and "what each parameter is," and invoke them in a type-safe manner.
Terminology: JSON-RPC is a remote procedure call protocol that uses JSON. MCP builds on this to define a standard communication method between LLMs and external tools.
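As a rough illustration of what travels over that layer, the sketch below builds a JSON-RPC 2.0 request for MCP's `tools/call` method and a matching success response. The helper name `buildToolCall` and the `create_ticket` arguments are made up for illustration, and the types are simplified; in practice, clients and SDKs build these messages for you.

```typescript
// Simplified shapes for a JSON-RPC 2.0 exchange carrying an MCP tool call.
// These are a sketch, not the SDK's full type definitions.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

interface ToolCallResponse {
  jsonrpc: "2.0";
  id: number;
  result: { content: { type: "text"; text: string }[]; isError?: boolean };
}

// Hypothetical helper: wraps a tool name and its arguments in a request envelope.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const callRequest = buildToolCall(1, "create_ticket", { title: "Login fails", priority: "P1" });

// A response reuses the request id so the client can match the two.
const callResponse: ToolCallResponse = {
  jsonrpc: "2.0",
  id: callRequest.id,
  result: { content: [{ type: "text", text: "Ticket created: T-123" }] },
};
```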
MCP provides three basic abstractions:
| Abstraction | Role | Internal API Mapping Example |
|---|---|---|
| Tool | A function through which the LLM triggers side effects or requests computation | createTicket, triggerDeploy |
| Resource | Read-only data exposure | Internal wiki, API docs, codebase |
| Prompt | Reusable prompt templates | Domain-specific instructions |
When wrapping internal APIs, Tool is what matters most in practice. The work mostly involves mapping internal API endpoints to Tools one-to-one, or grouping multiple endpoints into a single meaningful Tool.
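To make the grouping option concrete, here is a minimal sketch in which one Tool handler combines two endpoints into a single answer. The endpoint paths (`/deploys/{id}` and `/deploys/{id}/logs`), the `HttpGet` type, and the function name are hypothetical; the HTTP call is injected so the logic can be exercised without a real API.

```typescript
// Hypothetical HTTP abstraction: resolves with the parsed JSON body of a GET request.
type HttpGet = (path: string) => Promise<unknown>;

interface DeploySummary {
  status: string;
  lastLogLine: string;
}

// One Tool (say, "get_deploy_summary") backed by two internal endpoints.
// The LLM sees a single call; the fan-out stays inside the server.
async function getDeploySummary(get: HttpGet, deployId: string): Promise<DeploySummary> {
  const id = encodeURIComponent(deployId);
  const [deploy, logs] = await Promise.all([
    get(`/deploys/${id}`) as Promise<{ status: string }>,
    get(`/deploys/${id}/logs`) as Promise<{ lines: string[] }>,
  ]);
  return {
    status: deploy.status,
    // Fall back to a placeholder when the deploy has produced no log lines yet.
    lastLogLine: logs.lines[logs.lines.length - 1] ?? "(no logs)",
  };
}
```

Grouping like this tends to pay off when the LLM would otherwise have to chain several calls in a fixed order to answer one question.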
Transport: stdio vs Streamable HTTP
Transport choice determines how the server is deployed. As of March 2025, the legacy HTTP+SSE approach has been officially deprecated, leaving two options:
| Transport | Use Case | Characteristics |
|---|---|---|
| stdio | Local development, single-user CLI tools | Standard I/O between processes, simple setup |
| Streamable HTTP | Team- or org-scale remote servers | HTTP-based, scalable; OAuth 2.1 authorization is optional in the spec but advisable for any shared server |
For exposing internal APIs to an entire team, the recommended approach is Streamable HTTP + OAuth 2.1. A practical sequence is to start with stdio for fast local testing, then switch to Streamable HTTP when promoting to a shared team server.
The Key to Schema Design
I initially exposed the nested object structures of my internal APIs directly as Tool parameters, and ran into a situation where the LLM was sending metadata.labels[0].value as a nonexistent field called labelsValue. After that, I adopted parameter flattening as a principle. In practice, schema ambiguity = LLM Tool miscalls is an equation that holds up pretty often.
Before: exposing nested objects directly (the LLM frequently miscalls):

```typescript
import { z } from "zod";

const nestedShape = {
  metadata: z.object({
    labels: z.array(z.object({
      key: z.string(),
      value: z.string()
    }))
  })
};
```

After: flattened (the LLM fills it in accurately):

```typescript
const flatShape = {
  label_key: z.string().describe("Label key (e.g., env, team)"),
  label_value: z.string().describe("Label value (e.g., production, backend)")
};
```

The principles for a good Tool schema, in summary:
- Flatten parameters as much as possible: simple fields over nested objects
- Use `describe()` to specify the meaning and allowed values of each parameter
- Make active use of enum types to constrain the LLM's choices
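Flattening moves the nesting out of the schema, not out of the internal API, so the Tool handler still has to rebuild the original payload before sending it. A minimal sketch, assuming the internal API expects the `metadata.labels` shape from the nested example; the function name `toApiPayload` is hypothetical.

```typescript
// Flat arguments as the LLM supplies them (matches the flattened schema).
interface FlatLabelArgs {
  label_key: string;
  label_value: string;
}

// Nested body the internal API expects (matches the original nested schema).
interface LabelPayload {
  metadata: { labels: { key: string; value: string }[] };
}

// Hypothetical mapper that runs inside the Tool handler, after validation.
function toApiPayload(args: FlatLabelArgs): LabelPayload {
  return {
    metadata: {
      labels: [{ key: args.label_key, value: args.label_value }],
    },
  };
}

const payload = toApiPayload({ label_key: "env", label_value: "production" });
// payload.metadata.labels holds one entry: { key: "env", value: "production" }
```

The trade-off is that the flat form carries a single label per call; if the API genuinely needs several, a separate Tool or a repeated call is one way to keep the schema simple.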
Pros and Cons Analysis
It's worth examining the tradeoffs before deciding to adopt MCP wrapping.
Advantages
| Item | Description |
|---|---|
| Fast integration | Add only an MCP interface layer without rewriting existing API logic |
| Multi-client support | Callable identically from any MCP-compatible client: Claude, GPT, Copilot, Cursor, etc. |
| OpenAPI automation | Minimize manual Tool writing if you already have a spec |
| Standard security layer | OAuth 2.1-based authorization defined at the protocol level for HTTP transports |
| Agent productivity | Automate complex internal tasks with natural-language instructions |
Disadvantages and Caveats
| Item | Description | Mitigation |
|---|---|---|
| Prompt injection | Malicious instructions in Tool descriptions or return values can cause the LLM to behave unintentionally | Design with the principle of not trusting returned data; output sanitization |
| Context pollution | Too many Tools fill the LLM context window with Tool descriptions, degrading performance | Separate servers by domain; consider dynamic Tool loading |
| Schema complexity | Complex internal API objects require redesign to simplify | Flattened parameter structures, active use of enums |
| Missing audit logs | Without records of AI agent API calls, compliance auditing is impossible | Log all calls at the MCP server or gateway level |
| Over-privileged access | A single MCP server holding broader API access than necessary | Apply the principle of least privilege per Tool |
Terminology: A Tool Poisoning Attack is an attack method that embeds malicious LLM instructions into an MCP Tool's description or return value to cause the agent to perform unintended actions. Because data coming from outside (user input, external API responses) can appear in Tool return values, this is a threat that can't be ignored even in internal systems.
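One way to act on "do not trust returned data" is to treat every upstream string as untrusted before it reaches the model. The sketch below is a minimal example under stated assumptions: it strips control characters, removes text that imitates the wrapper tag, caps the length, and wraps the result in explicit markers. The function name, the marker format, and the 2,000-character limit are illustrative choices, and this narrows the attack surface rather than eliminating prompt injection.

```typescript
const MAX_TOOL_OUTPUT_CHARS = 2000; // arbitrary cap chosen for this sketch

// Hypothetical helper: prepares an upstream string for inclusion in a Tool result.
// `source` is set by the server, never by upstream data.
function wrapUntrusted(source: string, raw: string): string {
  const cleaned = raw
    // Drop ASCII control characters except tab and newline.
    .replace(/[\u0000-\u0008\u000B-\u001F\u007F]/g, "")
    // Remove anything that imitates the wrapper tag itself.
    .replace(/<\/?untrusted-data[^>]*>/gi, "");
  const truncated =
    cleaned.length > MAX_TOOL_OUTPUT_CHARS
      ? cleaned.slice(0, MAX_TOOL_OUTPUT_CHARS) + " [truncated]"
      : cleaned;
  // Label the text as data so it can be told apart from instructions.
  return `<untrusted-data source="${source}">\n${truncated}\n</untrusted-data>`;
}
```

A handler would call something like `wrapUntrusted("issue-tracker", res.data.description)` before placing the text in its `content` array.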
The Most Common Mistakes in Practice
- Writing lazy Tool descriptions: A one-liner like `"creates a ticket"` makes it hard for the LLM to understand when and how to use the Tool. Using `describe()` to specify concrete examples and allowed values for each parameter directly determines Tool call accuracy.
- Accepting auth tokens as Tool parameters: Having the LLM handle tokens directly risks exposing them in logs or having them stolen via prompt injection. It's recommended to inject credentials from server environment variables and never expose them in the Tool interface.
- Deploying `stdio` servers to the whole team as-is: Using a local `stdio` server as-is for team-wide use exposes internal APIs with no authentication, rate limiting, or audit logs. Once you reach team scale, switching to Streamable HTTP + OAuth 2.1 is recommended.
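For the credential point in particular, it helps to fail at startup when a variable is missing rather than send a header built from an undefined value on the first call. A minimal sketch: the helper `readConfig` is hypothetical, the variable names `INTERNAL_API_URL` and `API_TOKEN` are assumed, and the environment is passed in as a plain object so the function stays easy to test.

```typescript
interface ServerConfig {
  baseUrl: string;
  apiToken: string;
}

// Hypothetical startup check: reads credentials from the environment only,
// and throws if either variable is missing or empty.
function readConfig(env: Record<string, string | undefined>): ServerConfig {
  const baseUrl = env["INTERNAL_API_URL"];
  const apiToken = env["API_TOKEN"];
  if (!baseUrl || !apiToken) {
    throw new Error("INTERNAL_API_URL and API_TOKEN must be set");
  }
  return { baseUrl, apiToken };
}

// In a Node.js server this would be called once at startup: readConfig(process.env)
```

Because the token never appears in a Tool schema, the LLM has no parameter through which it could read or echo it.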
Practical Application
Direct TypeScript SDK Implementation: Wrapping an Internal Issue Tracker as a Tool
Wrap the REST API of Jira, Linear, or your own ticket system as an MCP Tool, and just saying "create a P1 ticket for this bug" to your editor AI will actually create the issue.
First, install the dependencies:
```bash
pnpm add @modelcontextprotocol/sdk zod axios
```

Next, the server code. Where `internalClient` comes from can be confusing at first glance: it's an axios instance initialized from environment variables:
```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import axios from "axios";

// Auth token is injected from environment variables; it is not accepted as a Tool parameter
const internalClient = axios.create({
  baseURL: process.env.INTERNAL_API_URL,
  headers: {
    Authorization: `Bearer ${process.env.API_TOKEN}`
  }
});

const server = new McpServer({ name: "internal-issue-tracker", version: "1.0.0" });

server.tool(
  "create_ticket",
  "Creates a new ticket in the internal issue tracker",
  {
    title: z.string().describe("Ticket title (clear and concise)"),
    priority: z.enum(["P1", "P2", "P3"]).describe("Priority: P1=urgent, P2=high, P3=normal"),
    assignee: z.string().email().optional().describe("Assignee's email address")
  },
  async ({ title, priority, assignee }) => {
    const res = await internalClient.post("/issues", { title, priority, assignee });
    return {
      content: [{ type: "text", text: `Ticket created: ${res.data.id} (${res.data.url})` }]
    };
  }
);

// Start server; uses stdio transport for local development
const transport = new StdioServerTransport();
await server.connect(transport);
```

| Point | Description |
|---|---|
| `z.enum(["P1", "P2", "P3"])` | Locks the choices so the LLM cannot insert arbitrary values |
| `.describe()` | Communicates the meaning of each parameter directly to the LLM |
| `axios.create(...)` | Handles auth headers at the client level; not exposed in the Tool interface |
| `server.connect(transport)` | This line is required for the server to actually run |
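The handler in this example covers only the success case. How an uncaught exception surfaces to the model can depend on the SDK version, so one option is to shape failures explicitly. The sketch below builds a Tool result with `isError: true`, a field MCP defines on tool results. The helper names `describeError` and `safely` are hypothetical, and the `response.status` lookup assumes an axios-style error object.

```typescript
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Turns an unknown thrown value into a short message. The `response.status`
// path matches the shape of axios errors; other values fall back to `message`.
function describeError(error: unknown): string {
  if (typeof error === "object" && error !== null) {
    const status = (error as { response?: { status?: unknown } }).response?.status;
    if (typeof status === "number") return `upstream returned HTTP ${status}`;
    const message = (error as { message?: unknown }).message;
    if (typeof message === "string") return message;
  }
  return "unknown error";
}

// Hypothetical wrapper: runs a handler and converts any failure into an error result.
async function safely(toolName: string, run: () => Promise<ToolResult>): Promise<ToolResult> {
  try {
    return await run();
  } catch (error) {
    return {
      content: [{ type: "text", text: `${toolName} failed: ${describeError(error)}` }],
      isError: true,
    };
  }
}
```

Inside a Tool callback, the body would be wrapped as `safely("create_ticket", async () => { ... })`, which gives the LLM a readable reason it can relay or act on.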
Automatic OpenAPI Spec Conversion: Building an MCP Server Without Manual Work
If your internal API already has an OpenAPI (Swagger) spec, you don't need to write Tools by hand. In a TypeScript environment, you can use a CLI tool:
```bash
npx openapi-mcp-generator --input ./api-spec.yaml --output ./mcp-server
```

In the Python ecosystem, FastMCP's OpenAPI integration is the fastest option:

```python
from fastmcp import FastMCP
from your_internal_api import app  # existing FastAPI app

# Convert FastAPI app to an MCP server
mcp = FastMCP.from_fastapi(app=app)

if __name__ == "__main__":
    mcp.run()
```

One thing worth being honest about: writing `from your_internal_api import app` on one line looks simple, but in reality it brings along all the dependencies, environment variables, DB connections, etc. of that FastAPI app. If you can run it in the same Python environment as the existing app, this is the fastest approach; if the environments are separated, it's more practical to export the OpenAPI spec file and process it with `openapi-mcp-generator`.
The operationId, summary, and description from your OpenAPI spec map directly to Tool names and descriptions. Since spec quality becomes MCP Tool quality, this is especially effective for teams that already maintain a well-managed OpenAPI spec.
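To show why spec quality carries over, here is a simplified sketch of that mapping. It is not how `openapi-mcp-generator` or FastMCP work internally; the `toToolDescriptor` function and the fallback naming rule are assumptions made for illustration.

```typescript
// The subset of an OpenAPI operation object this sketch looks at.
interface OpenApiOperation {
  operationId?: string;
  summary?: string;
  description?: string;
}

interface ToolDescriptor {
  name: string;
  description: string;
}

// Hypothetical mapper: operationId becomes the Tool name, while summary and
// description are joined into the Tool description.
function toToolDescriptor(method: string, path: string, op: OpenApiOperation): ToolDescriptor {
  // Without an operationId, fall back to a name derived from method and path.
  const fallback = `${method}_${path}`
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");
  const description = [op.summary, op.description]
    .filter((part): part is string => typeof part === "string" && part.trim() !== "")
    .join(" ");
  return {
    name: op.operationId ?? fallback,
    description: description || "(no description in spec)",
  };
}
```

An operation with no `operationId` and no text ends up with a name like `post_issues_id_comments` and an empty description, which is exactly the kind of Tool an LLM tends to misuse.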
Scaling to Team Size: The MCP Gateway Pattern
When you have multiple teams and each domain starts running its own MCP server, authentication and audit logs become fragmented. Honestly, at this point the thought crosses your mind — "can't we just hardcode tokens in each server?" — but that choice comes back to bite you during a security audit.
The MCP Gateway is a pattern that solves this problem. The Gateway acts as a reverse proxy at the MCP protocol level. When an AI agent connects to the single endpoint (the Gateway), the Gateway handles OAuth 2.1 authentication and then internally routes Tool calls to each domain MCP server. The domain servers don't need to be exposed to the internet — they only need to exist behind the Gateway.
```
[Claude / AI Agent]
        |
        ↓ (single connection point)
[MCP Gateway]
  - OAuth 2.1 authentication handling
  - Rate limiting
  - Audit logs (records all Tool calls)
  - Routes to domain-specific servers
        |
   _____|_____
  |     |     |
  ↓     ↓     ↓
[HR    [CI/CD [Analytics
 API]   API]   API]
 MCP    MCP    MCP
 Srv    Srv    Srv
(internal network only)
```

Tools validated by real-world use cases:
| Tool | Characteristics |
|---|---|
| Kong AI MCP Proxy | Bridges existing HTTP APIs to MCP; integrated rate limiting and authentication |
| Azure API Management + Entra ID | MCP + AD federation in Microsoft stack environments |
| mcp-gateway-registry | Open-source gateway registry with Keycloak/Entra integration |
Behind the roughly 970x growth in MCP SDK monthly downloads over 18 months is this kind of enterprise proliferation of the Gateway pattern. As of 2026, this pattern is becoming the de facto standard for in-house AI infrastructure at team scale and above.
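The routing and audit-logging duties described in this section can be sketched in a few lines. This is a toy model, not the behavior of any product listed above: it assumes Tool names carry a domain prefix such as `hr.` or `cicd.`, that authentication has already produced a caller identity, and that the audit log lives in memory. All names are hypothetical.

```typescript
// A domain backend receives the full tool name and its arguments.
type ToolHandler = (tool: string, args: Record<string, unknown>) => Promise<string>;

interface AuditEntry {
  caller: string;
  tool: string;
  outcome: "ok" | "denied" | "unknown_tool" | "error";
  at: string; // ISO timestamp
}

class ToyGateway {
  readonly auditLog: AuditEntry[] = [];
  private readonly backends = new Map<string, ToolHandler>();
  private readonly allowed = new Map<string, Set<string>>();

  // Registers the handler for one domain prefix, e.g. "hr" or "cicd".
  register(domain: string, handler: ToolHandler): void {
    this.backends.set(domain, handler);
  }

  // Grants a caller access to one domain (least privilege: nothing by default).
  allow(caller: string, domain: string): void {
    const domains = this.allowed.get(caller) ?? new Set<string>();
    domains.add(domain);
    this.allowed.set(caller, domains);
  }

  async call(caller: string, tool: string, args: Record<string, unknown>): Promise<string> {
    const domain = tool.split(".")[0] ?? "";
    const backend = this.backends.get(domain);
    if (!backend) return this.finish(caller, tool, "unknown_tool", `No backend for ${tool}`);
    if (!this.allowed.get(caller)?.has(domain)) {
      return this.finish(caller, tool, "denied", `Caller may not use ${domain} tools`);
    }
    try {
      return this.finish(caller, tool, "ok", await backend(tool, args));
    } catch {
      return this.finish(caller, tool, "error", `${tool} failed`);
    }
  }

  // Every path through call() ends here, so every call is recorded.
  private finish(caller: string, tool: string, outcome: AuditEntry["outcome"], text: string): string {
    this.auditLog.push({ caller, tool, outcome, at: new Date().toISOString() });
    return text;
  }
}
```

Keeping the allow-list and the audit log in one place is the point of the pattern: domain servers stay free of auth code, and a security review has a single log to read.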
Closing Thoughts
Wrapping internal APIs with MCP creates infrastructure that lets your entire team interact with internal systems through natural language via AI agents, while leaving existing code untouched.
Here are 3 steps you can start with right now. Choose based on your situation:
1. If your internal API has an OpenAPI spec, you can generate skeleton code first with `npx openapi-mcp-generator --input ./your-api-spec.yaml --output ./mcp-server`. In this case, you can skip step 2 (manual Tool writing).
2. If you don't have a spec, install the dependencies with `pnpm add @modelcontextprotocol/sdk zod axios` and use the TypeScript example above as a reference to convert one API your team uses daily (ticket creation, deployment status check, etc.) into a single Tool. Even one Tool is enough to experience an agent turning a natural-language instruction into an actual API call.
3. Connect a local server to Claude Code or Cursor's MCP settings and use it directly. Watching which parameters the agent fills when it calls a Tool will immediately reveal what needs to be improved in your schema. After going through this step, you'll feel firsthand why `describe()` and enums matter.
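For the step of connecting a local server to a client, both Claude Code and Cursor read stdio servers from a JSON file with an `mcpServers` object (a project-level `.mcp.json` for Claude Code and `.cursor/mcp.json` for Cursor, at the time of writing; check your client's documentation for the current location). The sketch below builds such an entry and serializes it. The script path `./dist/server.js` and the URL are placeholders, and the token is deliberately left out of the file on the assumption that it comes from your shell environment.

```typescript
// Shape of one stdio server entry under the "mcpServers" key.
interface StdioServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

// Placeholder values: adjust the script path and URL to your project.
const clientConfig: { mcpServers: Record<string, StdioServerEntry> } = {
  mcpServers: {
    "internal-issue-tracker": {
      command: "node",
      args: ["./dist/server.js"],
      env: { INTERNAL_API_URL: "https://issues.internal.example" },
    },
  },
};

// Serialize with two-space indentation; the resulting text is what goes in the config file.
const clientConfigJson = JSON.stringify(clientConfig, null, 2);
```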