How to Make LLMs Directly Call Your Internal REST APIs: TypeScript MCP Server Implementation and the Gateway Pattern

Have you ever tried to introduce an AI agent to your team, only to get stuck on the question "so how do we connect our internal APIs?" I started out trying to paste entire internal API docs into the LLM, or putting raw fetch code directly into the prompt. It works—exactly once. From the second request onward, the LLM starts inventing endpoint URLs, forgetting authentication headers, or sending parameters that don't exist.

Wrapping with an MCP (Model Context Protocol) server is an architectural pattern that lets LLMs call internal systems in a structured way without touching existing API logic. Instead of cramming API specs into a prompt, the LLM calls functions directly through a standardized Tool interface. By the end of this article, you'll be able to write TypeScript code that exposes existing REST APIs as MCP Tools. This guide assumes a basic Node.js + TypeScript environment is already set up.

With Claude Code, Cursor, and GitHub Copilot all operating as MCP clients today, exposing internal systems via MCP is also an exercise in building AI productivity infrastructure for your entire team.

Core Concepts

What Is MCP Wrapping?

The idea itself is simple. You add a thin JSON-RPC-based interface layer in front of an existing REST API. Through this layer, the LLM can clearly identify "which functions it can call" and "what each parameter is," and invoke them in a type-safe manner.

Terminology: JSON-RPC is a remote procedure call protocol that uses JSON. MCP builds on this to define a standard communication method between LLMs and external tools.

MCP provides three basic abstractions:

Abstraction	Role	Internal API Mapping Example
Tool	A function through which the LLM triggers side effects or requests computation	`createTicket`, `triggerDeploy`
Resource	Read-only data exposure	Internal wiki, API docs, codebase
Prompt	Reusable prompt templates	Domain-specific instructions

When wrapping internal APIs, Tool is what matters most in practice. The work mostly involves mapping internal API endpoints to Tools one-to-one, or grouping multiple endpoints into a single meaningful Tool.

Transport: `stdio` vs Streamable HTTP

Transport choice determines how the server is deployed. As of March 2025, the legacy HTTP+SSE approach has been officially deprecated, leaving two options:

Transport	Use Case	Characteristics
`stdio`	Local development, single-user CLI tools	Standard I/O between processes, simple setup
Streamable HTTP	Team- or org-scale remote servers	HTTP-based, OAuth 2.1 required, scalable

For exposing internal APIs to an entire team, the recommended approach is Streamable HTTP + OAuth 2.1. A practical sequence is to start with stdio for fast local testing, then switch to Streamable HTTP when promoting to a shared team server.

The Key to Schema Design

I initially exposed the nested object structures of my internal APIs directly as Tool parameters, and ran into a situation where the LLM was sending metadata.labels[0].value as a nonexistent field called labelsValue. After that, I adopted parameter flattening as a principle. In practice, schema ambiguity = LLM Tool miscalls is an equation that holds up pretty often.

Before — exposing nested objects directly (LLM frequently miscalls):

typescript

{
  metadata: z.object({
    labels: z.array(z.object({
      key: z.string(),
      value: z.string()
    }))
  })
}

After — flattened (LLM fills in accurately):

typescript

{
  label_key: z.string().describe("Label key (e.g., env, team)"),
  label_value: z.string().describe("Label value (e.g., production, backend)")
}

The principles for a good Tool schema in summary:

Flatten parameters as much as possible — simple fields over nested objects
Use describe() to specify the meaning and allowed values of each parameter
Make active use of enum types to constrain the LLM's choices

Pros and Cons Analysis

It's worth examining the tradeoffs before deciding to adopt MCP wrapping.

Advantages

Item	Description
Fast integration	Add only an MCP interface layer without rewriting existing API logic
Multi-client support	Callable identically from any MCP-compatible client: Claude, GPT, Copilot, Cursor, etc.
OpenAPI automation	Minimize manual Tool writing if you already have a spec
Standard security layer	OAuth 2.1-based authentication enforced at the protocol level
Agent productivity	Automate complex internal tasks with natural-language instructions

Disadvantages and Caveats

Item	Description	Mitigation
Prompt injection	Malicious instructions in Tool descriptions or return values can cause the LLM to behave unintentionally	Design with the principle of not trusting returned data; output sanitization
Context pollution	Too many Tools fill the LLM context window with Tool descriptions, degrading performance	Separate servers by domain; consider dynamic Tool loading
Schema complexity	Complex internal API objects require redesign to simplify	Flattened parameter structures, active use of enums
Missing audit logs	Without records of AI agent API calls, compliance auditing is impossible	Log all calls at the MCP server or gateway level
Over-privileged access	A single MCP server holding broader API access than necessary	Apply the principle of least privilege per Tool

Terminology: A Tool Poisoning Attack is an attack method that embeds malicious LLM instructions into an MCP Tool's description or return value to cause the agent to perform unintended actions. Because data coming from outside (user input, external API responses) can appear in Tool return values, this is a threat that can't be ignored even in internal systems.

The Most Common Mistakes in Practice

Writing lazy Tool descriptions — A one-liner like "creates a ticket" makes it hard for the LLM to understand when and how to use the Tool. Using describe() to specify concrete examples and allowed values for each parameter directly determines Tool call accuracy.
Accepting auth tokens as Tool parameters — Having the LLM handle tokens directly risks exposing them in logs or having them stolen via prompt injection. It's recommended to inject credentials from server environment variables and never expose them in the Tool interface.
Deploying stdio servers to the whole team as-is — Using a local stdio server as-is for team-wide use exposes internal APIs with no authentication, rate limiting, or audit logs. Once you reach team scale, switching to Streamable HTTP + OAuth 2.1 is recommended.

Practical Application

Direct TypeScript SDK Implementation: Wrapping an Internal Issue Tracker as a Tool

Wrap the REST API of Jira, Linear, or your own ticket system as an MCP Tool, and just saying "create a P1 ticket for this bug" to your editor AI will actually create the issue.

First, install the dependencies:

bash

pnpm add @modelcontextprotocol/sdk zod axios

Next, the server code. Where internalClient comes from can be confusing at first glance — it's an axios instance initialized from environment variables:

typescript

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import axios from "axios";
 
// Auth token is injected from environment variables — not accepted as a Tool parameter
const internalClient = axios.create({
  baseURL: process.env.INTERNAL_API_URL,
  headers: {
    Authorization: `Bearer ${process.env.API_TOKEN}`
  }
});
 
const server = new McpServer({ name: "internal-issue-tracker", version: "1.0.0" });
 
server.tool(
  "create_ticket",
  "Creates a new ticket in the internal issue tracker",
  {
    title: z.string().describe("Ticket title (clear and concise)"),
    priority: z.enum(["P1", "P2", "P3"]).describe("Priority: P1=urgent, P2=high, P3=normal"),
    assignee: z.string().email().optional().describe("Assignee's email address")
  },
  async ({ title, priority, assignee }) => {
    const res = await internalClient.post("/issues", { title, priority, assignee });
    return {
      content: [{ type: "text", text: `Ticket created: ${res.data.id} — ${res.data.url}` }]
    };
  }
);
 
// Start server — uses stdio transport for local development
const transport = new StdioServerTransport();
await server.connect(transport);

Point	Description
`z.enum(["P1", "P2", "P3"])`	Locks the choices so the LLM cannot insert arbitrary values
`.describe()`	Communicates the meaning of each parameter directly to the LLM
`axios.create(...)`	Handles auth headers at the client level — not exposed in the Tool interface
`server.connect(transport)`	This line is required for the server to actually run

Automatic OpenAPI Spec Conversion: Building an MCP Server Without Manual Work

If your internal API already has an OpenAPI (Swagger) spec, you don't need to write Tools by hand. In a TypeScript environment, you can use a CLI tool:

bash

npx openapi-mcp-generator --input ./api-spec.yaml --output ./mcp-server

In the Python ecosystem, FastMCP's OpenAPI integration is the fastest option:

python

from fastmcp import FastMCP
from your_internal_api import app  # existing FastAPI app
 
# Convert FastAPI app to an MCP server
mcp = FastMCP.from_fastapi(app=app)
 
if __name__ == "__main__":
    mcp.run()

One thing worth being honest about: writing from your_internal_api import app on one line looks simple, but in reality it brings along all the dependencies, environment variables, DB connections, etc. of that FastAPI app. If you can run it in the same Python environment as the existing app, this is the fastest approach; if the environments are separated, it's more practical to export the OpenAPI spec file and process it with openapi-mcp-generator.

The operationId, summary, and description from your OpenAPI spec map directly to Tool names and descriptions. Since spec quality becomes MCP Tool quality, this is especially effective for teams that already maintain a well-managed OpenAPI spec.

Scaling to Team Size: The MCP Gateway Pattern

When you have multiple teams and each domain starts running its own MCP server, authentication and audit logs become fragmented. Honestly, at this point the thought crosses your mind — "can't we just hardcode tokens in each server?" — but that choice comes back to bite you during a security audit.

The MCP Gateway is a pattern that solves this problem. The Gateway acts as a reverse proxy at the MCP protocol level. When an AI agent connects to the single endpoint (the Gateway), the Gateway handles OAuth 2.1 authentication and then internally routes Tool calls to each domain MCP server. The domain servers don't need to be exposed to the internet — they only need to exist behind the Gateway.

[Claude / AI Agent]
        |
        ↓ (single connection point)
[MCP Gateway]
  - OAuth 2.1 authentication handling
  - Rate limiting
  - Audit logs (records all Tool calls)
  - Routes to domain-specific servers
        |
   _____|_____
  |     |     |
  ↓     ↓     ↓
[HR   [CI/CD [Analytics
 API]  API]   API]
 MCP   MCP    MCP
 Srv   Srv    Srv
(internal network only)

Tools validated by real-world use cases:

Tool	Characteristics
Kong AI MCP Proxy	Bridges existing HTTP APIs to MCP; integrated rate limiting and authentication
Azure API Management + Entra ID	MCP + AD federation in Microsoft stack environments
mcp-gateway-registry	Open-source gateway registry with Keycloak/Entra integration

Behind the roughly 970x growth in MCP SDK monthly downloads over 18 months is this kind of enterprise proliferation of the Gateway pattern. As of 2026, this pattern is becoming the de facto standard for in-house AI infrastructure at team scale and above.

Closing Thoughts

Wrapping internal APIs with MCP creates infrastructure that lets your entire team interact with internal systems through natural language via AI agents, while leaving existing code untouched.

Here are 3 steps you can start with right now. Choose based on your situation:

If your internal API has an OpenAPI spec, you can generate skeleton code first with npx openapi-mcp-generator --input ./your-api-spec.yaml --output ./mcp-server. In this case, you can skip step 2 (manual Tool writing).
If you don't have a spec, install pnpm add @modelcontextprotocol/sdk zod axios and use the TypeScript example above as a reference to convert one API your team uses daily (ticket creation, deployment status check, etc.) into a single Tool. Even one Tool is enough to experience an agent turning a natural-language instruction into an actual API call.
Connect a local server to Claude Code or Cursor's MCP settings and use it directly. Watching which parameters the agent fills when it calls a Tool will immediately reveal what needs to be improved in your schema. After going through this step, you'll feel firsthand why describe() and enums matter.

References

Recommended starting points:

For deeper learning:

#MCP#TypeScript#REST-API#LLM#Gateway패턴#OpenAPI#OAuth2#Zod#AI에이전트#JSON-RPC

How to Make LLMs Directly Call Your Internal REST APIs: TypeScript MCP Server Implementation and the Gateway Pattern

With Claude Code, Cursor, and GitHub Copilot all operating as MCP clients today, exposing internal systems via MCP is also an exercise in building AI productivity infrastructure for your entire team.

Core Concepts

What Is MCP Wrapping?

Terminology: JSON-RPC is a remote procedure call protocol that uses JSON. MCP builds on this to define a standard communication method between LLMs and external tools.

MCP provides three basic abstractions:

Abstraction	Role	Internal API Mapping Example
Tool	A function through which the LLM triggers side effects or requests computation	`createTicket`, `triggerDeploy`
Resource	Read-only data exposure	Internal wiki, API docs, codebase
Prompt	Reusable prompt templates	Domain-specific instructions

Transport: `stdio` vs Streamable HTTP

Transport choice determines how the server is deployed. As of March 2025, the legacy HTTP+SSE approach has been officially deprecated, leaving two options:

Transport	Use Case	Characteristics
`stdio`	Local development, single-user CLI tools	Standard I/O between processes, simple setup
Streamable HTTP	Team- or org-scale remote servers	HTTP-based, OAuth 2.1 required, scalable

The Key to Schema Design

Before — exposing nested objects directly (LLM frequently miscalls):

typescript

{
  metadata: z.object({
    labels: z.array(z.object({
      key: z.string(),
      value: z.string()
    }))
  })
}

After — flattened (LLM fills in accurately):

typescript

{
  label_key: z.string().describe("Label key (e.g., env, team)"),
  label_value: z.string().describe("Label value (e.g., production, backend)")
}

The principles for a good Tool schema in summary:

Flatten parameters as much as possible — simple fields over nested objects
Use describe() to specify the meaning and allowed values of each parameter
Make active use of enum types to constrain the LLM's choices

Pros and Cons Analysis

It's worth examining the tradeoffs before deciding to adopt MCP wrapping.

Advantages

Item	Description
Fast integration	Add only an MCP interface layer without rewriting existing API logic
Multi-client support	Callable identically from any MCP-compatible client: Claude, GPT, Copilot, Cursor, etc.
OpenAPI automation	Minimize manual Tool writing if you already have a spec
Standard security layer	OAuth 2.1-based authentication enforced at the protocol level
Agent productivity	Automate complex internal tasks with natural-language instructions

Disadvantages and Caveats

Item	Description	Mitigation
Prompt injection	Malicious instructions in Tool descriptions or return values can cause the LLM to behave unintentionally	Design with the principle of not trusting returned data; output sanitization
Context pollution	Too many Tools fill the LLM context window with Tool descriptions, degrading performance	Separate servers by domain; consider dynamic Tool loading
Schema complexity	Complex internal API objects require redesign to simplify	Flattened parameter structures, active use of enums
Missing audit logs	Without records of AI agent API calls, compliance auditing is impossible	Log all calls at the MCP server or gateway level
Over-privileged access	A single MCP server holding broader API access than necessary	Apply the principle of least privilege per Tool

Terminology: A Tool Poisoning Attack is an attack method that embeds malicious LLM instructions into an MCP Tool's description or return value to cause the agent to perform unintended actions. Because data coming from outside (user input, external API responses) can appear in Tool return values, this is a threat that can't be ignored even in internal systems.

The Most Common Mistakes in Practice

Writing lazy Tool descriptions — A one-liner like "creates a ticket" makes it hard for the LLM to understand when and how to use the Tool. Using describe() to specify concrete examples and allowed values for each parameter directly determines Tool call accuracy.
Accepting auth tokens as Tool parameters — Having the LLM handle tokens directly risks exposing them in logs or having them stolen via prompt injection. It's recommended to inject credentials from server environment variables and never expose them in the Tool interface.
Deploying stdio servers to the whole team as-is — Using a local stdio server as-is for team-wide use exposes internal APIs with no authentication, rate limiting, or audit logs. Once you reach team scale, switching to Streamable HTTP + OAuth 2.1 is recommended.

Practical Application

Direct TypeScript SDK Implementation: Wrapping an Internal Issue Tracker as a Tool

Wrap the REST API of Jira, Linear, or your own ticket system as an MCP Tool, and just saying "create a P1 ticket for this bug" to your editor AI will actually create the issue.

First, install the dependencies:

bash

pnpm add @modelcontextprotocol/sdk zod axios

Next, the server code. Where internalClient comes from can be confusing at first glance — it's an axios instance initialized from environment variables:

typescript

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import axios from "axios";
 
// Auth token is injected from environment variables — not accepted as a Tool parameter
const internalClient = axios.create({
  baseURL: process.env.INTERNAL_API_URL,
  headers: {
    Authorization: `Bearer ${process.env.API_TOKEN}`
  }
});
 
const server = new McpServer({ name: "internal-issue-tracker", version: "1.0.0" });
 
server.tool(
  "create_ticket",
  "Creates a new ticket in the internal issue tracker",
  {
    title: z.string().describe("Ticket title (clear and concise)"),
    priority: z.enum(["P1", "P2", "P3"]).describe("Priority: P1=urgent, P2=high, P3=normal"),
    assignee: z.string().email().optional().describe("Assignee's email address")
  },
  async ({ title, priority, assignee }) => {
    const res = await internalClient.post("/issues", { title, priority, assignee });
    return {
      content: [{ type: "text", text: `Ticket created: ${res.data.id} — ${res.data.url}` }]
    };
  }
);
 
// Start server — uses stdio transport for local development
const transport = new StdioServerTransport();
await server.connect(transport);

Point	Description
`z.enum(["P1", "P2", "P3"])`	Locks the choices so the LLM cannot insert arbitrary values
`.describe()`	Communicates the meaning of each parameter directly to the LLM
`axios.create(...)`	Handles auth headers at the client level — not exposed in the Tool interface
`server.connect(transport)`	This line is required for the server to actually run

Automatic OpenAPI Spec Conversion: Building an MCP Server Without Manual Work

If your internal API already has an OpenAPI (Swagger) spec, you don't need to write Tools by hand. In a TypeScript environment, you can use a CLI tool:

bash

npx openapi-mcp-generator --input ./api-spec.yaml --output ./mcp-server

In the Python ecosystem, FastMCP's OpenAPI integration is the fastest option:

python

from fastmcp import FastMCP
from your_internal_api import app  # existing FastAPI app
 
# Convert FastAPI app to an MCP server
mcp = FastMCP.from_fastapi(app=app)
 
if __name__ == "__main__":
    mcp.run()

Scaling to Team Size: The MCP Gateway Pattern

[Claude / AI Agent]
        |
        ↓ (single connection point)
[MCP Gateway]
  - OAuth 2.1 authentication handling
  - Rate limiting
  - Audit logs (records all Tool calls)
  - Routes to domain-specific servers
        |
   _____|_____
  |     |     |
  ↓     ↓     ↓
[HR   [CI/CD [Analytics
 API]  API]   API]
 MCP   MCP    MCP
 Srv   Srv    Srv
(internal network only)

Tools validated by real-world use cases:

Tool	Characteristics
Kong AI MCP Proxy	Bridges existing HTTP APIs to MCP; integrated rate limiting and authentication
Azure API Management + Entra ID	MCP + AD federation in Microsoft stack environments
mcp-gateway-registry	Open-source gateway registry with Keycloak/Entra integration

Closing Thoughts

Wrapping internal APIs with MCP creates infrastructure that lets your entire team interact with internal systems through natural language via AI agents, while leaving existing code untouched.

Here are 3 steps you can start with right now. Choose based on your situation:

If your internal API has an OpenAPI spec, you can generate skeleton code first with npx openapi-mcp-generator --input ./your-api-spec.yaml --output ./mcp-server. In this case, you can skip step 2 (manual Tool writing).
If you don't have a spec, install pnpm add @modelcontextprotocol/sdk zod axios and use the TypeScript example above as a reference to convert one API your team uses daily (ticket creation, deployment status check, etc.) into a single Tool. Even one Tool is enough to experience an agent turning a natural-language instruction into an actual API call.
Connect a local server to Claude Code or Cursor's MCP settings and use it directly. Watching which parameters the agent fills when it calls a Tool will immediately reveal what needs to be improved in your schema. After going through this step, you'll feel firsthand why describe() and enums matter.

References

Recommended starting points:

For deeper learning:

#MCP#TypeScript#REST-API#LLM#Gateway패턴#OpenAPI#OAuth2#Zod#AI에이전트#JSON-RPC

Core Concepts

What Is MCP Wrapping?

Transport: stdio vs Streamable HTTP

The Key to Schema Design

Pros and Cons Analysis

Advantages

Disadvantages and Caveats

The Most Common Mistakes in Practice

Practical Application

Direct TypeScript SDK Implementation: Wrapping an Internal Issue Tracker as a Tool

Automatic OpenAPI Spec Conversion: Building an MCP Server Without Manual Work

Scaling to Team Size: The MCP Gateway Pattern

Closing Thoughts

References

Core Concepts

What Is MCP Wrapping?

Transport: stdio vs Streamable HTTP

The Key to Schema Design

Pros and Cons Analysis

Advantages

Disadvantages and Caveats

The Most Common Mistakes in Practice

Practical Application

Direct TypeScript SDK Implementation: Wrapping an Internal Issue Tracker as a Tool

Automatic OpenAPI Spec Conversion: Building an MCP Server Without Manual Work

Scaling to Team Size: The MCP Gateway Pattern

Closing Thoughts

References

Recommended Posts

Type-Safe LLM Response Validation with Pydantic AI

Cutting Long-Horizon Agent Costs by 60–90%: Caching, Compression, and Routing Strategies

AI Writes It, AI Reviews It: Building a `/code-review ultra` Multi-Agent Pipeline

7 Major Patterns of Agentic AI Design

Open-Weight vs Closed AI 2026: Now That the Benchmark Gap Has Narrowed, the Criteria for Choosing Has Changed

Running Qwen3-Coder Locally: Setting Up an SWE-bench 70% AI Coding Agent with a Single RTX 3090

Transport: `stdio` vs Streamable HTTP

Transport: `stdio` vs Streamable HTTP