Mastra: TypeScript AI Agent Framework — Type-Safe Agent Design and Production Deployment
Honestly, when I first tried to build an AI agent in TypeScript, I was pretty lost. The Python ecosystem has well-established options like LangChain and LlamaIndex, but on the JavaScript/TypeScript side there was always that nagging feeling of "is it really okay to use this?" The options were either to abandon type safety and litter the code with any, or awkwardly wrap Python code. Then I found Mastra.
Mastra is an open-source TypeScript AI agent framework founded by members of the team that co-founded Gatsby.js. It came out of Y Combinator (W25) in 2025, shipped v1.0 in January 2026, and has since passed 22,000 GitHub stars, quickly establishing itself as a LangChain.js alternative in the TypeScript developer community. PayPal has built a customer support automation agent with Mastra, and Elastic has published a case study on an agentic RAG pipeline integrated with Elasticsearch.
In this article, we'll explore the 6 core abstractions of Mastra and walk through concrete TypeScript code showing how to build real, type-safe agents. This is aimed at intermediate-to-advanced TypeScript developers; if you have a basic understanding of tools like Zod and pgvector, you can follow the code examples directly. If AI agents are completely new to you, start with the "What is an AI Agent?" section below to get your bearings.
Core Concepts
The 6 Primitives Mastra Provides
Mastra's design philosophy is "batteries-included." Rather than assembling AI app components piecemeal, it's designed to solve everything within one coherent framework. The backbone of that philosophy is the 6 abstractions below.
| Primitive | Role | Simple Analogy |
|---|---|---|
| Agent | An autonomous execution unit combining an LLM with tools | An employee who thinks and acts |
| Workflow | A multi-step, deterministic pipeline | A business process manual |
| Tool | An external function the agent calls | Tools and APIs an employee uses |
| Memory | Maintaining context across conversations | A notepad / long-term memory |
| RAG | Document retrieval and embedding pipeline | An internal knowledge search system |
| Evals | Agent quality evaluation | QA test scenarios |
These six work together as an interlocking system. An Agent calls Tools, uses Memory to recall previous conversations, references internal documents via RAG, and multiple Agents are coordinated through Workflows.
What Is an AI Agent? How It Differs from a Simple API Call
When you first encounter agents, it's natural to wonder "how is this different from just calling an LLM API?" I thought the same thing at first.
The key difference is the ReAct (Reasoning + Acting) loop. A regular LLM call goes input → output and that's it, but an agent, given a goal, makes its own decisions. It repeats a cycle of reasoning and acting: "What Tool do I need to achieve this goal? Based on the result of using the Tool, what should I do next?" For example, given the request "create a GitHub issue for me," the agent decides on its own what the issue title should be and what labels to apply, then calls createIssueTool. You don't need to specify the control flow in code.
Workflow vs. Agent: A Workflow is a deterministic pipeline with a predefined execution order, while an Agent is an autonomous structure where the LLM itself decides which Tools to use and in what order. Workflows suit business logic where predictability matters; Agents suit interactive tasks that require flexible responses.
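To make the loop concrete, here is a minimal sketch in plain TypeScript. It is not Mastra's implementation: scriptedModel is a hypothetical stand-in for the LLM, and the tools registry and runAgent helper are names invented for this illustration. The only point is the shape of the cycle: the model decides, the runtime executes the chosen Tool, and the result is fed back as an observation.

```typescript
// A toy ReAct-style loop. Everything here is hypothetical: scriptedModel
// stands in for a real LLM, and the tool registry is a plain object.
type ToolCall = { type: 'tool'; name: string; args: Record<string, string> };
type FinalAnswer = { type: 'final'; text: string };
type Decision = ToolCall | FinalAnswer;
type Tool = (args: Record<string, string>) => string;

const tools: Record<string, Tool> = {
  // Pretend weather lookup; a real tool would call an API.
  getWeather: ({ city }) => `22C and sunny in ${city}`,
};

// Stand-in for the LLM: asks for a tool first, then answers once it has
// seen at least one tool result.
function scriptedModel(goal: string, observations: string[]): Decision {
  if (observations.length === 0) {
    return { type: 'tool', name: 'getWeather', args: { city: 'Seoul' } };
  }
  const latest = observations[observations.length - 1];
  return { type: 'final', text: `Answer to "${goal}": ${latest}` };
}

function runAgent(goal: string, maxSteps = 5): string {
  const observations: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const decision = scriptedModel(goal, observations); // reason
    if (decision.type === 'final') return decision.text;
    const tool = tools[decision.name];
    // act: failures are fed back as observations so the model can react
    observations.push(tool ? tool(decision.args) : `Unknown tool: ${decision.name}`);
  }
  return 'Stopped: step limit reached';
}

console.log(runAgent('What is the weather in Seoul?'));
```

Notice that nothing in runAgent hard-codes which Tool runs or when. That decision lives entirely in the model, which is exactly what separates an Agent from a Workflow. The step limit is there because a real model can loop forever if nothing stops it.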
Vercel AI SDK + Zod = Type-Safe Agents
Mastra internally uses the Vercel AI SDK for LLM calls and streaming, and Zod to define input/output schemas.
I also initially wondered "what does type safety matter for an LLM call?" — but my thinking changed after defining Tool parameter schemas with Zod. IDE autocomplete works when the agent calls a Tool, and malformed responses are caught immediately. If you've ever experienced that awful moment when using any blows up at runtime, you'll understand.
import { Agent } from '@mastra/core/agent';
import { openai } from '@ai-sdk/openai';
import { createTool } from '@mastra/core/tools';
import { z } from 'zod';

// Tool definition: input and output schemas are specified with Zod
const getWeatherTool = createTool({
  id: 'get-weather',
  description: 'Retrieves current weather (temperature, condition) by city name. Use only for weather-related questions.',
  inputSchema: z.object({
    city: z.string().describe('Name of the city to look up'),
  }),
  outputSchema: z.object({
    temperature: z.number(),
    condition: z.string(),
  }),
  execute: async ({ context }) => {
    // Placeholder: call a real weather API for context.city here
    return { temperature: 22, condition: 'Sunny' };
  },
});

// Agent definition
const weatherAgent = new Agent({
  name: 'weatherAgent',
  instructions: 'You are an agent that answers weather questions from users.',
  model: openai('gpt-4o'),
  tools: { getWeatherTool },
});
What is Zod? It's a library for runtime type validation in TypeScript. Declare a schema like z.object({ city: z.string() }) and you get an automatically generated TypeScript type along with runtime validation.
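If "one schema, two guarantees" sounds abstract, the sketch below shows the idea with a tiny hand-rolled schema helper. This is not how Zod is implemented, and Schema, str, num, and object are names made up for the illustration; it only demonstrates how a single declaration can yield both a static type and a runtime check.

```typescript
// Hand-rolled stand-in for the Zod idea; not Zod's actual implementation.
type Schema<T> = { parse: (input: unknown) => T };

const str: Schema<string> = {
  parse: (input) => {
    if (typeof input !== 'string') throw new Error('expected string');
    return input;
  },
};

const num: Schema<number> = {
  parse: (input) => {
    if (typeof input !== 'number') throw new Error('expected number');
    return input;
  },
};

// Maps a shape of schemas to the object type those schemas describe.
type Infer<S> = { [K in keyof S]: S[K] extends Schema<infer T> ? T : never };

function object<S extends Record<string, Schema<unknown>>>(shape: S): Schema<Infer<S>> {
  return {
    parse: (input) => {
      if (typeof input !== 'object' || input === null) throw new Error('expected object');
      const record = input as Record<string, unknown>;
      const out: Record<string, unknown> = {};
      for (const key in shape) {
        out[key] = shape[key].parse(record[key]); // validate field by field
      }
      return out as unknown as Infer<S>;
    },
  };
}

const weatherInput = object({ city: str });
const weatherOutput = object({ temperature: num, condition: str });

const input = weatherInput.parse({ city: 'Seoul' }); // typed as { city: string }
console.log(input.city.toUpperCase());
console.log(weatherOutput.parse({ temperature: 22, condition: 'Sunny' }));
```

The compiler knows input.city is a string, so a typo like input.ctiy fails at build time, while a payload such as { city: 42 } coming back from a model fails at parse time. Zod gives you the same two layers with a far richer API.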
Memory: How Short-term, Long-term, and Semantic Recall Work
Mastra's memory system is divided into three layers.
- Short-term memory: The most recent N messages of the current session's conversation history are included directly in the context window. With lastMessages: 20, the 20 most recent messages are passed to the LLM on every request.
- Long-term memory: User preferences and key information are permanently stored in a database. Information the agent deems important is stored explicitly, or accumulated via conversation summarization.
- Semantic recall: Past conversations semantically similar to the current question are found via embedding vector search and added to the context. With topK: 5, the 5 most similar conversation segments are included.
PostgreSQL, LibSQL (SQLite-compatible), and Upstash Redis are supported as storage backends.
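Before the configuration, here's a rough sketch of how short-term memory and semantic recall might combine into a single context. It's a toy model under stated assumptions, not Mastra's internals: buildContext and StoredMessage are hypothetical names, and the three-dimensional vectors are made-up stand-ins for real embeddings.

```typescript
// Toy model of short-term memory plus semantic recall.
type StoredMessage = { id: number; text: string; embedding: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function buildContext(
  history: StoredMessage[],
  queryEmbedding: number[],
  lastMessages: number,
  topK: number,
): StoredMessage[] {
  // Short-term: the most recent messages, verbatim.
  const recent = lastMessages > 0 ? history.slice(-lastMessages) : [];
  const recentIds = new Set(recent.map((m) => m.id));

  // Semantic recall: rank the older messages by similarity to the query.
  const recalled = history
    .filter((m) => !recentIds.has(m.id))
    .map((m) => ({ message: m, score: cosineSimilarity(m.embedding, queryEmbedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((x) => x.message)
    .sort((x, y) => x.id - y.id); // restore chronological order

  return [...recalled, ...recent];
}

const history: StoredMessage[] = [
  { id: 1, text: 'I am allergic to peanuts', embedding: [1, 0, 0] },
  { id: 2, text: 'My favorite city is Busan', embedding: [0, 1, 0] },
  { id: 3, text: 'The deploy failed last night', embedding: [0, 0, 1] },
  { id: 4, text: 'Thanks for the help', embedding: [0.5, 0.5, 0.5] },
];

// Pretend embedding of the new question "What should I cook tonight?"
const context = buildContext(history, [0.9, 0.1, 0], 2, 1);
console.log(context.map((m) => m.text));
```

Here the two most recent messages are always included, and the peanut allergy message is pulled back in because it is the closest match to the cooking question, even though it fell out of the recent window. In Mastra, the same knobs are set through the Memory options: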
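import { Memory } from '@mastra/memory';
import { LibSQLStore } from '@mastra/libsql';
// Agent and openai are imported as in the previous example

const memory = new Memory({
  storage: new LibSQLStore({ url: 'file:local.db' }),
  options: {
    lastMessages: 20, // Short-term: include the 20 most recent messages in context
    // Semantic recall also needs a vector store and an embedder configured on Memory; see the Mastra memory docs
    semanticRecall: {
      topK: 5, // Semantic: add the 5 most similar past messages via vector search
      messageRange: 3, // Also include 3 messages of surrounding context around each match
    },
  },
});

const agentWithMemory = new Agent({
  name: 'chatAgent',
  instructions: 'You are a conversational agent that remembers the user.',
  model: openai('gpt-4o'),
  memory,
});
Practical Application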
Example 1: GitHub Automation Agent
There's a situation you run into often in real work: the routine task of linking issues and notifying assignees every time a PR is opened. This can be implemented fairly quickly with a Mastra + GitHub Tool combination.
At first I crammed issue creation, PR lookup, and comment posting all into one Tool, but later, seeing the agent get confused about which Tool to use, I split them out by role. The example below is a Tool responsible only for creating issues.
import { Agent } from '@mastra/core/agent';
import { createTool } from '@mastra/core/tools';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';

const createIssueTool = createTool({
  id: 'create-github-issue',
  description: 'Creates an issue in a GitHub repository. Use when reporting PR analysis results or bugs.',
  inputSchema: z.object({
    owner: z.string(),
    repo: z.string(),
    title: z.string(),
    body: z.string(),
    labels: z.array(z.string()).optional(),
  }),
  execute: async ({ context }) => {
    const response = await fetch(
      `https://api.github.com/repos/${context.owner}/${context.repo}/issues`,
      {
        method: 'POST',
        headers: {
          Authorization: `token ${process.env.GITHUB_TOKEN}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          title: context.title,
          body: context.body,
          labels: context.labels,
        }),
      }
    );
    // Surface HTTP failures so the agent sees the error instead of a bogus result
    if (!response.ok) {
      throw new Error(`GitHub API error: ${response.status} ${response.statusText}`);
    }
    const issue = await response.json();
    return { issueNumber: issue.number, url: issue.html_url };
  },
});

const githubAgent = new Agent({
  name: 'githubAgent',
  instructions: `
    You are an agent that manages GitHub PRs and issues.
    Analyze the PR content, create an appropriate issue, and apply relevant labels.
  `,
  model: anthropic('claude-opus-4-7'),
  tools: { createIssueTool },
});

// Run the agent
const result = await githubAgent.generate(
  'Create an issue for the PR "feat: add user authentication module". The repository is my-org/my-repo.'
);
console.log(result.text);
| Code Point | Description |
|---|---|
| Specific description | Specifying "when to use it" raises the quality of the agent's decision-making |
| if (!response.ok) | Explicitly throwing on HTTP errors lets the agent recognize failures and retry |
| inputSchema | Zod validates the types of Tool parameters |
| agent.generate() | Runs the agent with natural language input |
Example 2: RAG-based Internal Document QA
An agent that answers questions based on a team's internal wiki or documentation is one of the most widely used patterns these days. Using Mastra's RAG pipeline, everything from document embedding to search to answer generation is connected in one flow.
Two factors significantly affect real-world quality: chunking strategy (how large to split documents) and embedding model selection. The example below uses paragraph-level chunking with text-embedding-3-small, but for long technical documents, 512–1024 token chunking often works better. For multilingual documents, consider text-embedding-3-large or a multilingual-specialized model.
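As a rough illustration of the chunking trade-off, here is a minimal paragraph-first chunker with a size cap. chunkText is a hypothetical helper, not a Mastra API, and it counts whitespace-separated words as a crude proxy for tokens; a real pipeline would use the tokenizer that matches the embedding model.

```typescript
// Paragraph-first chunking with a size cap. Word count is only a crude
// proxy for tokens (an assumption made to keep the sketch dependency-free).
function chunkText(text: string, maxWords: number): string[] {
  if (maxWords < 1) throw new Error('maxWords must be at least 1');

  const paragraphs = text
    .split(/\n\s*\n/) // blank lines separate paragraphs
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: string[] = [];
  for (const paragraph of paragraphs) {
    const words = paragraph.split(/\s+/);
    // Short paragraphs stay whole (with whitespace normalized);
    // long ones are split into windows of at most maxWords words.
    for (let i = 0; i < words.length; i += maxWords) {
      chunks.push(words.slice(i, i + maxWords).join(' '));
    }
  }
  return chunks;
}

const sample = 'Onboarding takes one week.\n\nRequest laptop access on day one and VPN access on day two.';
console.log(chunkText(sample, 6));
```

Each returned chunk would then be embedded and stored separately. Splitting on paragraph boundaries first keeps related sentences together, and the cap stops a single long paragraph from producing one oversized, unfocused embedding.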
import { PgVector } from '@mastra/pg'; // pgvector-backed store; constructor options can vary by version
import { openai } from '@ai-sdk/openai';
import { createTool } from '@mastra/core/tools';
import { Agent } from '@mastra/core/agent';
import { embed } from 'ai';
import { z } from 'zod';

const vectorStore = new PgVector({
  connectionString: process.env.DATABASE_URL!,
});

// Embed and store one chunk (call once per paragraph).
// Assumes the 'docs' index already exists with a dimension matching the embedding model.
async function indexDocument(content: string, metadata: Record<string, string>) {
  const { embedding } = await embed({
    model: openai.embedding('text-embedding-3-small'),
    value: content,
  });
  await vectorStore.upsert({
    indexName: 'docs',
    vectors: [embedding],
    // Store the text alongside the metadata so search results can return it
    metadata: [{ ...metadata, text: content }],
    ids: [crypto.randomUUID()],
  });
}

// RAG retrieval Tool
const searchDocsTool = createTool({
  id: 'search-docs',
  description: 'Runs a semantic search over internal documents. Use when looking up team policies, technical specs, or onboarding materials.',
  inputSchema: z.object({
    query: z.string().describe('What to search for'),
  }),
  execute: async ({ context }) => {
    // Embed the query with the same model used at indexing time
    const { embedding } = await embed({
      model: openai.embedding('text-embedding-3-small'),
      value: context.query,
    });
    const results = await vectorStore.query({
      indexName: 'docs',
      queryVector: embedding,
      topK: 5,
    });
    return { documents: results.map(r => r.metadata) };
  },
});

const docsAgent = new Agent({
  name: 'docsAgent',
  instructions:
    'You are an agent that searches internal documents and provides accurate answers. If something is not covered in the documents, reply "I could not find this in the documents."',
  model: openai('gpt-4o'),
  tools: { searchDocsTool },
});
Example 3: Multi-step Workflow for a Content Summarization Pipeline
Beyond simple agent calls, Workflows are the right tool when you need to connect multiple steps deterministically. They're a great fit for tasks with clearly defined stages — like news fetching → summarization → Slack delivery. The moment you think "the order is fixed anyway, why do I need an Agent?" — Workflow is the answer.
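What a typed step chain buys you is that each step's output has to line up with the next step's input. The sketch below reproduces that idea in plain TypeScript. Step, pipeline, and andThen are hypothetical names rather than Mastra's API (the method is called andThen here only to avoid creating an accidental thenable), and the steps are synchronous to keep it short.

```typescript
// A minimal typed step chain; not Mastra's implementation.
type Step<In, Out> = { id: string; execute: (input: In) => Out };

interface Pipeline<In, Out> {
  andThen<Next>(step: Step<Out, Next>): Pipeline<In, Next>;
  run(input: In): Out;
}

function pipeline<In, Out>(first: Step<In, Out>): Pipeline<In, Out> {
  return {
    andThen<Next>(step: Step<Out, Next>): Pipeline<In, Next> {
      // Fuse the two steps into one composite step and keep chaining.
      return pipeline<In, Next>({
        id: `${first.id} > ${step.id}`,
        execute: (input) => step.execute(first.execute(input)),
      });
    },
    run: (input) => first.execute(input),
  };
}

const fetchNews: Step<{ feedUrl: string }, { articles: string[] }> = {
  id: 'fetch-news',
  execute: ({ feedUrl }) => ({ articles: [`first story from ${feedUrl}`, 'second story'] }),
};

const summarize: Step<{ articles: string[] }, { summary: string }> = {
  id: 'summarize',
  execute: ({ articles }) => ({ summary: `${articles.length} articles` }),
};

const newsFlow = pipeline(fetchNews).andThen(summarize);
console.log(newsFlow.run({ feedUrl: 'https://feeds.example.com/tech.rss' }));
// pipeline(summarize).andThen(summarize) would not compile:
// { summary: string } is not assignable to { articles: string[] }.
```

The execution order is fixed at definition time, and a mismatch between adjacent steps is a compile error rather than a surprise in production. The Mastra version of the same kind of pipeline follows.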
import { Agent } from '@mastra/core/agent';
import { createWorkflow, createStep } from '@mastra/core/workflows';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

// Summarization agent used in step 2 (defined first)
const summaryAgent = new Agent({
  name: 'summaryAgent',
  instructions: 'Summarize the given list of articles concisely in 3 lines, keeping only the key points.',
  model: openai('gpt-4o'),
});

// Step 1: Fetch RSS feed
const fetchNewsStep = createStep({
  id: 'fetch-news',
  inputSchema: z.object({ feedUrl: z.string() }),
  outputSchema: z.object({ articles: z.array(z.string()) }),
  execute: async ({ inputData }) => {
    // RSS parsing logic goes here (use rss-parser or similar in practice)
    const articles = ['Article 1 content...', 'Article 2 content...'];
    return { articles };
  },
});

// Step 2: Generate summary (using an agent)
const summarizeStep = createStep({
  id: 'summarize',
  inputSchema: z.object({ articles: z.array(z.string()) }),
  outputSchema: z.object({ summary: z.string() }),
  execute: async ({ inputData }) => {
    const result = await summaryAgent.generate(
      `Summarize the following articles in 3 lines:\n${inputData.articles.join('\n\n')}`
    );
    return { summary: result.text };
  },
});

// Step 3: Send to Slack
const sendSlackStep = createStep({
  id: 'send-slack',
  inputSchema: z.object({ summary: z.string() }),
  outputSchema: z.object({ sent: z.boolean() }),
  execute: async ({ inputData }) => {
    const response = await fetch(process.env.SLACK_WEBHOOK_URL!, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ text: inputData.summary }),
    });
    if (!response.ok) {
      throw new Error(`Slack delivery failed: ${response.status}`);
    }
    return { sent: true };
  },
});

// Assemble the Workflow: each step's output schema feeds the next step's input schema
const newsPipeline = createWorkflow({
  id: 'news-pipeline',
  inputSchema: z.object({ feedUrl: z.string() }),
  outputSchema: z.object({ sent: z.boolean() }),
})
  .then(fetchNewsStep)
  .then(summarizeStep)
  .then(sendSlackStep)
  .commit();

// Run the pipeline (the run API has changed between releases; check the workflow docs for your version)
const run = await newsPipeline.createRun();
const result = await run.start({
  inputData: { feedUrl: 'https://feeds.example.com/tech.rss' },
});
console.log('Pipeline complete:', result);
I actually used this pattern to wire up internal newsletter automation, and the .then() chain approach kept each step cleanly separated, making debugging much easier. Seeing the intermediate step outputs directly in Mastra Studio was a big bonus too.
Pros and Cons
Advantages
| Item | Details |
|---|---|
| Type safety | Zod schemas apply across the entire surface (Tool inputs, Workflow steps, and structured outputs). Types inferred from the schemas catch mismatches at compile time, and the schemas validate the actual data again at runtime |
| Developer experience | Mastra Studio (localhost:4111) lets you inspect agent conversations, Tool call traces, and Workflow visualizations locally in real time. Rated 9/10 on NextBuild DX benchmarks |
| Fast deployment | Vercel integration deploys in under 90 seconds with a single vercel deploy |
| Model variety | Single interface to 94 providers and 3,300+ models (OpenAI, Anthropic, Google, Groq, Llama, etc.) |
| MCP support | Connects to the external tool ecosystem in a standardized way via the Model Context Protocol |
| Built-in memory | All three layers — short-term, long-term, and semantic recall — available out of the box, with PostgreSQL/LibSQL/Upstash backend support |
Disadvantages and Caveats
| Item | Details | Mitigation |
|---|---|---|
| Narrower ecosystem | Still growing compared to LangChain's 700+ integrations; edge runtime (Cloudflare Workers) support is also limited | Wrap needed integrations directly as Tools or use MCP servers. If edge deployment is required, consider using the Vercel AI SDK directly |
| New framework risk | v1.0 only released in January 2026 — fewer Stack Overflow answers and examples, long-term stability data still accumulating | Actively use the official Discord and GitHub Discussions. For critical production adoption, a staged rollout is recommended |
| Not suited for Python teams | TypeScript-first design doesn't fit Python-centric teams or ML experimentation environments | Python teams should consider LangChain or PydanticAI |
What is MCP (Model Context Protocol)? A standard protocol for connecting LLMs to tools, proposed by Anthropic and being adopted across the industry. Like USB-C, any LLM that supports MCP can access external tools in the same way. Mastra supports both MCP client (calling external servers) and server (exposing Mastra tools externally) modes.
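Under the hood, MCP messages are JSON-RPC 2.0, and invoking a tool is a request with the method tools/call. The sketch below only builds the request payloads so you can see their shape. The helper names, the get-weather tool name, and its arguments are invented for illustration, and transport details (stdio or HTTP) are left out.

```typescript
// Shape of MCP requests as JSON-RPC 2.0 messages. No network calls here.
type JsonRpcRequest = {
  jsonrpc: '2.0';
  id: number;
  method: string;
  params?: Record<string, unknown>;
};

// Ask a server which tools it exposes.
function listToolsRequest(id: number): JsonRpcRequest {
  return { jsonrpc: '2.0', id, method: 'tools/list' };
}

// Invoke one tool by name with JSON arguments.
function toolCallRequest(id: number, name: string, args: Record<string, unknown>): JsonRpcRequest {
  return { jsonrpc: '2.0', id, method: 'tools/call', params: { name, arguments: args } };
}

console.log(JSON.stringify(listToolsRequest(1)));
console.log(JSON.stringify(toolCallRequest(2, 'get-weather', { city: 'Seoul' })));
```

Because every MCP server speaks this same envelope, a client can discover and call tools it has never seen before, which is where the USB-C comparison comes from.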
The Most Common Mistakes in Practice
1. Writing a sloppy Tool description
The description is the basis on which an LLM decides which Tool to use and when. Specific descriptions like "Retrieves current weather (temperature, condition) by city name. Use only for weather-related questions." make a much bigger difference to agent behavior quality than short descriptions like "get weather."
2. Attaching too many Tools to a single Agent
When there are more than 10 Tools, the accuracy with which the LLM selects the appropriate Tool drops. Separating Agents by role and orchestrating them with Workflows or a multi-agent pattern is more stable.
3. Deploying to production without Evals
Unlike code, agent output is probabilistic. To detect regressions when changing prompts or models, you need quality evaluation for core scenarios. Mastra lets you automate this with the @mastra/evals package.
import { evaluate } from '@mastra/evals';
import { AnswerRelevancyMetric } from '@mastra/evals/llm';

// docsAgent and openai come from the RAG example above
const result = await evaluate(docsAgent, 'What does the team onboarding process look like?', {
  metrics: [new AnswerRelevancyMetric(openai('gpt-4o'))],
});
console.log('Relevancy score:', result.metrics['AnswerRelevancy'].score);
// Range 0.0–1.0; a score below 0.7 is a signal to revisit your prompt
Connect this evaluation to your CI pipeline, and you'll detect score drops immediately whenever a prompt or model changes.
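One way to wire that into CI is a small gate that compares fresh scores against a stored baseline. The sketch below is framework-agnostic and entirely hypothetical: findFailures, the scenario names, the 0.7 floor, and the 0.05 allowed drop are assumptions for illustration, and the scores would come from whatever eval run you already have.

```typescript
// A CI gate over eval scores: flag scenarios that fall below an absolute
// floor or regress too far from the last accepted baseline.
type EvalResult = { scenario: string; score: number };

function findFailures(
  results: EvalResult[],
  baseline: Record<string, number>,
  minScore = 0.7,
  maxDrop = 0.05,
): string[] {
  const failures: string[] = [];
  for (const { scenario, score } of results) {
    if (score < minScore) {
      failures.push(`${scenario}: below minimum score`);
    }
    const previous = baseline[scenario];
    if (previous !== undefined && previous - score > maxDrop) {
      failures.push(`${scenario}: regressed from baseline`);
    }
  }
  return failures;
}

const failures = findFailures(
  [
    { scenario: 'onboarding', score: 0.82 },
    { scenario: 'vacation-policy', score: 0.65 },
    { scenario: 'deploy-guide', score: 0.75 },
  ],
  { onboarding: 0.85, 'deploy-guide': 0.9 },
);

console.log(failures);
// In a CI script, a non-empty list would fail the build,
// for example by exiting with a non-zero status code.
```

Checking both conditions matters: the floor catches answers that are simply bad, while the baseline comparison catches a scenario that is still "passing" but noticeably worse than it was before the prompt or model change.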
Closing Thoughts
After a month or two of using Mastra, the biggest change is how much less time I spend asking "why is the agent behaving like this?" I can see the Tool call flow at a glance in Mastra Studio, and Zod schemas catch type mismatches before runtime. Of course, the ecosystem still needs time to catch up to LangChain.js in breadth, and for Python teams PydanticAI is a more natural choice. But if you need to build AI agents in TypeScript, Mastra is the framework offering the most coherent developer experience right now.
Three steps to get started today:
- Run npx create-mastra@latest and an interactive CLI will appear. Enter a project name (for example, my-agent), select your model provider (OpenAI, Anthropic, etc.) and memory backend, and the base structure is automatically generated.
- Inside the generated project, run pnpm dev and Mastra Studio will open at localhost:4111. Try modifying a single line of code and watching the agent conversation and Tool call results update in real time. You'll get the feel for it quickly.
- Once you have the basic Agent working, find an official example matching your use case (RAG, Workflow, multi-agent) in the Mastra GitHub repository and build on it.
References
- Mastra Official Site | mastra.ai — GitHub star count and official feature reference
- Mastra Official Docs | mastra.ai/docs
- Mastra GitHub Repository | github.com
- Mastra Agent Overview - Official Docs | mastra.ai
- MCP Overview - Mastra Docs | mastra.ai
- The New Stack: Mastra empowers web devs to build AI agents in TypeScript | thenewstack.io
- How I used Mastra to build a prize-winning RAG agent | LogRocket Blog — LogRocket RAG agent implementation case study
- Building agentic RAG with Mastra and Elasticsearch | Elastic Labs — Elastic production deployment case study
- Mastra Tutorial: How to Build AI Agents in TypeScript | Firecrawl
- AI Agent Framework Comparison: Mastra vs LangChain vs others | Speakeasy
- Mastra vs LangChain.js | nextbuild.co