DEV BAK - TECH BLOG
frontend

How to Reduce TTFB from 350ms to 60ms with Next.js RSC + Streaming

While reviewing the Core Web Vitals of a production service, I found pages where TTFB was well over 400ms. Each DB query was fast on its own, so I wondered why the page was so slow. It turned out that three independent queries were running in series. I expected that parallelizing them with Promise.all would fix it, but the TTFB barely changed, because the response still could not start until the slowest query had finished. The issue wasn't a single line of code; it was at the architecture level.

Only after properly understanding RSC + Streaming did the numbers change dramatically. After reading this article, you'll be able to directly apply a page design approach where TTFB equals "the time to render the static shell" rather than "the time the slowest query takes to finish." This is explained in the context of the Next.js App Router, and anyone who has used React and the App Router should be able to follow along right away.


Core Concepts

What's the Difference Between Traditional SSR and Streaming SSR?

Honestly, at first I thought, "How different can streaming really be?" But once I opened the Network tab and watched chunks arriving one by one, my thinking changed completely.

Traditional SSR is a structure where the server collects all data and returns a single completed HTML response. The problem lies in the word "all." If three DB queries take 50ms, 100ms, and 300ms respectively, the user doesn't receive the first byte until at least 300ms have elapsed even when the queries run in parallel, and 450ms if they run in series. The slowest query holds the rest hostage.

| Approach | TTFB Determinant | User Experience |
| --- | --- | --- |
| Traditional SSR | Completion time of the slowest data fetch | White screen → full render all at once |
| Streaming SSR | Static shell rendering time | Layout displayed immediately → content appears progressively |
| PPR | CDN edge latency (20–80ms) | Cached shell instantly → dynamic sections streaming |

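TTFB (Time to First Byte): The time from when the browser sends a request to the server until it receives the first byte of the response. TTFB is not itself one of the Core Web Vitals; it is a supporting diagnostic metric that directly affects LCP. Google's guidance treats 0.8 seconds or less as good, and ideally you should aim for below 200ms.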
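The comparison above can be sketched as a small model. This is only an illustration under simplifying assumptions: the function names are hypothetical, the 10ms shell render time is an assumed value, and network latency is ignored.

```typescript
// Simplified model: how long until the server can send its first byte?
// All durations are in milliseconds and are illustrative, not measured.

/** Traditional SSR with queries awaited one after another. */
function ttfbSerialSsr(queryMs: number[]): number {
  return queryMs.reduce((total, ms) => total + ms, 0);
}

/** Traditional SSR with queries started together (for example via Promise.all). */
function ttfbParallelSsr(queryMs: number[]): number {
  return queryMs.length === 0 ? 0 : Math.max(...queryMs);
}

/** Streaming SSR: the first byte only waits for the static shell. */
function ttfbStreaming(shellRenderMs: number): number {
  return shellRenderMs;
}

const exampleQueries = [50, 100, 300];
console.log(ttfbSerialSsr(exampleQueries));   // 450
console.log(ttfbParallelSsr(exampleQueries)); // 300
console.log(ttfbStreaming(10));               // 10 (assumed shell render time)
```

The point of the model is that parallelizing only moves TTFB from the sum to the maximum, while streaming removes the queries from the first-byte path altogether.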

What RSC Actually Sends to the Client

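What RSC (React Server Components) produces is not finished HTML but a component tree serialized in a format known as the React Flight protocol (the Next.js docs call the result the RSC Payload). On the initial request, Next.js uses this payload to render HTML on the server and streams the payload along with it; on later client-side navigations, only the payload is sent. (The internal structure of this format is not covered in detail here, but if you want to dive deep, the renderToPipeableStream documentation is a good starting point.) The client runtime progressively assembles the UI as it receives stream chunks.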

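tsx
// app/dashboard/_components/MetricsPanel.tsx
// Without 'use client', this is a Server Component by default
import { getMonthlyMetrics } from '@/lib/db'; // hypothetical path for the server-only db module
import { MetricsGrid } from './MetricsGrid';  // hypothetical path

export default async function MetricsPanel() {
  const metrics = await getMonthlyMetrics(); // Direct DB call, runs only on the server

  // The db module is completely excluded from the client bundle
  return <MetricsGrid data={metrics} />;
}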

This structure brings two benefits. Server-only code such as the db module is excluded from the client bundle, and the DB is queried directly on the server, which removes the extra browser-to-API round trip that a client-side fetch would need.

How Suspense Boundaries Make Streaming Work

Suspense first shipped as the loading mechanism for code splitting with React.lazy, but it has since taken on a central role in server streaming.

At the HTTP level, it works like this. The server starts the response before the full body is known (over HTTP/1.1 this means Transfer-Encoding: chunked; HTTP/2 and HTTP/3 stream the body in frames instead) and immediately flushes the static shell outside the Suspense boundary as the first chunk. The fallback UI (such as a skeleton) for each boundary whose component is still waiting for data is included in that first chunk. When the data is ready, the server sends the actual content for that boundary as an additional chunk, and the client runtime replaces the skeleton with it.

tsx
// app/dashboard/page.tsx
import { Suspense } from 'react';
// DashboardLayout, MetricsPanel, and MetricsSkeleton are assumed to come from your own modules

export default function DashboardPage() {
  return (
    <DashboardLayout>
      {/* DashboardLayout is static, so it is included in the first chunk immediately */}

      <Suspense fallback={<MetricsSkeleton />}>
        <MetricsPanel />  {/* Sent as a separate chunk when the internal fetch completes */}
      </Suspense>
    </DashboardLayout>
  );
}

Without Suspense, the server sends nothing until MetricsPanel's data is available. With Suspense, the layout is delivered immediately and the data is filled in later.
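For readers who want to see what "chunked" means on the wire, here is a minimal sketch of HTTP/1.1 chunk framing. The helper name and the HTML strings are hypothetical placeholders, not the markup React actually emits, and in practice Node or your framework does this framing for you.

```typescript
// Frames one piece of the response body as an HTTP/1.1 chunk:
//   <size in hex>\r\n<data>\r\n
// The size counts bytes (UTF-8), not characters.
function encodeChunk(data: string): string {
  const byteLength = new TextEncoder().encode(data).length;
  return `${byteLength.toString(16)}\r\n${data}\r\n`;
}

// A zero-length chunk marks the end of the body.
const LAST_CHUNK = "0\r\n\r\n";

// Hypothetical markup: first the shell with a skeleton, later the real content.
const shellChunk = encodeChunk('<main><div id="metrics">Loading...</div></main>');
const contentChunk = encodeChunk('<div data-replaces="metrics">42 orders</div>');

// The server can flush shellChunk right away and send contentChunk whenever
// the data is ready; the browser starts parsing as soon as the first chunk arrives.
const wireBody = shellChunk + contentChunk + LAST_CHUNK;
console.log(JSON.stringify(encodeChunk("hello"))); // "5\r\nhello\r\n"
```

Because each chunk carries its own length, the server never has to know the total body size up front, which is exactly what lets the shell leave before the slow query finishes.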


Practical Application

Here are three examples arranged in order of complexity. It's best to start with whichever one most closely resembles your current situation.

Example 1: Data-Heavy Dashboard — Parallel Streaming

A dashboard where multiple widgets each require independent data is where RSC streaming pays off the most. Wrapping each widget in its own Suspense boundary lets each one stream as soon as its data is ready.

tsx
// app/dashboard/page.tsx
import { Suspense } from 'react';
// Layout, widget, and skeleton components are assumed to come from your own modules

export default function DashboardPage() {
  return (
    <DashboardLayout>
      {/* DashboardLayout is static, so it is flushed immediately on request */}

      <Suspense fallback={<MetricsSkeleton />}>
        <MetricsPanel />          {/* DB query A: ~100ms */}
      </Suspense>

      <Suspense fallback={<ChartSkeleton />}>
        <RevenueChart />          {/* DB query B: ~150ms */}
      </Suspense>

      <Suspense fallback={<TableSkeleton />}>
        <RecentTransactions />    {/* DB query C: ~300ms */}
      </Suspense>
    </DashboardLayout>
  );
}

tsx
// app/dashboard/_components/MetricsPanel.tsx
// Aggregated data: cacheable, Server Component
async function MetricsPanel() {
  const metrics = await getMonthlyMetrics();
  return <MetricsGrid data={metrics} />;
}

| Component | Role | When Sent |
| --- | --- | --- |
| DashboardLayout | Navigation, sidebar | Immediately on request |
| <MetricsSkeleton /> etc. | Loading placeholders | Together with Layout |
| <MetricsPanel /> | Aggregated metrics | After query A completes |
| <RecentTransactions /> | Transaction list | After query C completes |

Queries A, B, and C run in parallel without waiting for each other. With traditional SSR the TTFB would be 300ms+, but with this structure the layout is delivered immediately. No matter how slow query C is, the other widgets appear first.
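To make the flush order concrete, the following sketch models the timeline for this dashboard. It is a simplification, not how React schedules work internally: it assumes every query starts when the request arrives, and the `Boundary`, `Flush`, and `streamingTimeline` names, as well as the 10ms shell time, are hypothetical.

```typescript
interface Boundary {
  name: string;
  queryMs: number; // how long this boundary's data takes, in milliseconds
}

interface Flush {
  atMs: number;
  sends: string;
}

// Returns the order in which chunks leave the server.
// Boundary content can never be flushed before the shell itself.
function streamingTimeline(shellMs: number, boundaries: Boundary[]): Flush[] {
  const shell: Flush = { atMs: shellMs, sends: "shell + fallbacks" };
  const resolved: Flush[] = boundaries
    .map((b) => ({ atMs: Math.max(shellMs, b.queryMs), sends: b.name }))
    .sort((a, b) => a.atMs - b.atMs);
  return [shell, ...resolved];
}

const timeline = streamingTimeline(10, [
  { name: "RecentTransactions", queryMs: 300 },
  { name: "MetricsPanel", queryMs: 100 },
  { name: "RevenueChart", queryMs: 150 },
]);

for (const flush of timeline) {
  console.log(`${flush.atMs}ms: ${flush.sends}`);
}
```

Under these assumptions the shell leaves at 10ms, and the widgets follow in the order their queries finish, regardless of the order they appear in the JSX.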

Example 2: E-commerce Product Detail Page — Separating Static/Dynamic with PPR

Product pages have an interesting structure. The product name, images, and description rarely change, but inventory and pricing require real-time data. PPR (Partial Prerendering) separates these two at build time.

ProductHero and ProductDescription are treated as static because they render from params.id alone, with no request-time data, so their output can be prerendered into the shell. By contrast, LivePrice calls noStore() to opt out of caching, so it is rendered on the origin server on every request; StockStatus is assumed to do the same. This difference is what determines the caching strategy for each section.

typescript
// next.config.ts
export default {
  // 'incremental' turns PPR on only for routes that export experimental_ppr = true
  experimental: { ppr: 'incremental' },
};
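
tsx
// app/product/[id]/page.tsx
import { Suspense } from 'react';

// Required for this route to use PPR when next.config.ts sets ppr: 'incremental'
export const experimental_ppr = true;

// Note: in Next.js 15 and later, params is a Promise and should be awaited;
// the synchronous form below matches earlier versions.
export default function ProductPage({ params }: { params: { id: string } }) {
  return (
    <>
      {/* Static shell: generated at build time, served from CDN */}
      <ProductHero id={params.id} />
      <ProductDescription id={params.id} />

      {/* Dynamic sections: streamed from origin */}
      <Suspense fallback={<PriceSkeleton />}>
        <LivePrice id={params.id} />
      </Suspense>

      <Suspense fallback={<StockSkeleton />}>
        <StockStatus id={params.id} />
      </Suspense>
    </>
  );
}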

tsx
// app/product/[id]/_components/LivePrice.tsx
import { unstable_noStore as noStore } from 'next/cache';

async function LivePrice({ id }: { id: string }) {
  noStore(); // Disable caching: fetch the latest price on every request
  const price = await fetchCurrentPrice(id);
  return <PriceDisplay price={price} />;
}

Once this structure is in place, ProductHero is delivered from a CDN edge within 20–80ms, and price and inventory information streams in right after. As long as the shell is served from the edge cache, origin server processing time is no longer on the TTFB critical path.

PPR (Partial Prerendering): A Next.js feature that generates both a static HTML shell and a postponedState blob at build time. On request, the CDN immediately sends the shell while the origin server simultaneously streams only the dynamic sections. It is still experimental and is enabled through the experimental.ppr flag.

Example 3: Preventing Waterfalls — The Promise Early-Start Pattern

I made this mistake for quite a while early on. It's the case where you carefully separate Suspense boundaries, but then create a waterfall inside the component itself. This pattern comes up frequently:

typescript
// Bad example: serial waterfall — takes 500ms total
// app/profile/[userId]/page.tsx
async function UserProfilePage({ params }: { params: { userId: string } }) {
  const user = await getUser(params.userId);        // Wait 300ms
  const posts = await getUserPosts(user.id);         // Then wait another 200ms
  // user.id === params.userId, yet we unnecessarily wait for user before starting posts
  return <Profile user={user} posts={posts} />;
}

Because getUserPosts receives user.id, it appears there's a dependency — but in reality, passing params.userId directly makes the two fetches completely independent.

typescript
// Good example: Promise early start — takes 300ms total
// app/profile/[userId]/page.tsx
async function UserProfilePage({ params }: { params: { userId: string } }) {
  const userPromise = getUser(params.userId);        // Start immediately
  const postsPromise = getUserPosts(params.userId);  // Start immediately (no waiting for user)
 
  const [user, posts] = await Promise.all([userPromise, postsPromise]);
  return <Profile user={user} posts={posts} />;
}

Simply restructuring the code like this saves 200ms, because the total wait drops from 300ms + 200ms to the longer of the two. The code change looks small, but it makes a noticeable difference in real user experience.
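The same reasoning generalizes to more than two fetches: the total wait is the longest chain of real dependencies, not the sum of all fetches. The sketch below is a hypothetical helper for reasoning about that (the `FetchTask` shape and function names are assumptions, and the durations are the illustrative ones from the examples above).

```typescript
interface FetchTask {
  durationMs: number;
  dependsOn: string[]; // names of fetches whose results this one truly needs
}

type FetchPlan = Record<string, FetchTask>;

// Time at which a fetch finishes if every fetch starts as early as its
// dependencies allow.
function finishTime(plan: FetchPlan, name: string, visiting: Set<string> = new Set()): number {
  const task = plan[name];
  if (!task) throw new Error(`Unknown fetch: ${name}`);
  if (visiting.has(name)) throw new Error(`Circular dependency at: ${name}`);
  const path = new Set(visiting).add(name);
  const startAt = Math.max(0, ...task.dependsOn.map((dep) => finishTime(plan, dep, path)));
  return startAt + task.durationMs;
}

// Total wait before all data is available.
function totalWait(plan: FetchPlan): number {
  return Math.max(0, ...Object.keys(plan).map((name) => finishTime(plan, name)));
}

// Bad example: posts waits for user even though it only needs params.userId.
const waterfall: FetchPlan = {
  user: { durationMs: 300, dependsOn: [] },
  posts: { durationMs: 200, dependsOn: ["user"] },
};

// Good example: both fetches start immediately.
const earlyStart: FetchPlan = {
  user: { durationMs: 300, dependsOn: [] },
  posts: { durationMs: 200, dependsOn: [] },
};

console.log(totalWait(waterfall));  // 500
console.log(totalWait(earlyStart)); // 300
```

Writing the dependencies down this way makes accidental ones easy to spot: any entry in `dependsOn` that exists only because of how the code was written, rather than because the data requires it, is a waterfall you can remove.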


Pros and Cons Analysis

Advantages

| Item | Description |
| --- | --- |
| Reduced TTFB | Static shell is sent immediately, breaking away from the "slow query = slow TTFB" equation |
| Reduced bundle size | Server Component code is completely excluded from the client bundle |
| Fetching close to the data source | DB and internal APIs are called directly on the server, eliminating network round trips |
| SEO compatibility | Search engines can index content even during streaming |
| Parallel data fetching | Independent Suspense boundaries fetch data in parallel |

Disadvantages and Caveats

The most common problems when first adopting this are reverse proxy buffering and error handling. I personally experienced a situation where streaming that worked fine locally didn't work at all in production because of an Nginx configuration issue. If chunks appear to arrive all at once in the Network tab, the proxy settings are the first thing worth investigating.

| Item | Description | Mitigation |
| --- | --- | --- |
| Blocking fetches | A top-level page await neutralizes streaming | Move data-dependent sections inside Suspense boundaries |
| Error handling complexity | Errors after shell flush cannot change the HTTP status code | ErrorBoundary is required; design upstream error recovery logic |
| Reverse proxy buffering | Nginx default settings buffer chunks and block streaming | proxy_buffering off; proxy_cache off; configuration is required |
| Flight payload bloat | Large Server→Client props can delay hydration | Minimize prop passing, review serialization of large data |
| Environment differences | Streaming behavior may differ between local and production | Real-world measurement via RUM is essential |
| Suspense design overhead | Too granular boundaries can cause dozens of loading UIs to appear simultaneously | Design with both information architecture and data dependencies in mind |

RUM (Real User Monitoring): Performance data collected from real users' browsers. It can be gathered with the @vercel/speed-insights or web-vitals libraries. Local measurements can differ meaningfully from production and are difficult to rely on.

The Most Common Mistakes in Practice

  1. Blocking global data with await in layout components — Fetching session or user information with await in app/layout.tsx increases the TTFB of all child pages by that duration. If data is truly required in the layout, consider wrapping it in Suspense or moving it to the client side.

  2. Concluding "streaming doesn't work" while leaving Nginx settings unchanged — If chunks don't appear separated in the Network tab, it's worth checking the proxy_buffering off; proxy_cache off; settings first. It's likely not a code problem.

  3. Confusion from trying to attach client logic to Server Components — Where you place the 'use client' boundary in the RSC tree significantly affects bundle size. It's important to develop the habit of isolating only the parts that require client state or event handlers.
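For the second mistake, a rough way to tell streaming from buffering is to look at when each chunk arrived. The sketch below is a heuristic under stated assumptions: the `ChunkArrival` shape, the `looksBuffered` name, and the 50ms threshold are all hypothetical, and the timestamps are example values you would replace with ones read from the DevTools Network tab or a script of your own.

```typescript
interface ChunkArrival {
  atMs: number;  // time since the request was sent
  bytes: number; // size of the chunk
}

// Heuristic: if every chunk arrived within a very short window, the response
// was probably buffered somewhere and released all at once.
// minSpreadMs is an assumed threshold; tune it to your slowest query.
function looksBuffered(arrivals: ChunkArrival[], minSpreadMs = 50): boolean {
  if (arrivals.length < 2) return true; // nothing arrived separately
  const times = arrivals.map((a) => a.atMs);
  const spread = Math.max(...times) - Math.min(...times);
  return spread < minSpreadMs;
}

// Example values only: chunks spread out over the life of the request.
const streamed: ChunkArrival[] = [
  { atMs: 12, bytes: 4096 },
  { atMs: 110, bytes: 1800 },
  { atMs: 310, bytes: 5200 },
];

// Example values only: everything shows up together after the slowest query.
const buffered: ChunkArrival[] = [
  { atMs: 305, bytes: 4096 },
  { atMs: 306, bytes: 1800 },
  { atMs: 306, bytes: 5200 },
];

console.log(looksBuffered(streamed)); // false
console.log(looksBuffered(buffered)); // true
```

If the same page produces the first pattern locally and the second one in production, the difference is almost certainly in the infrastructure between your server and the browser rather than in the React code.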


Closing Thoughts

The biggest shift in mindset from applying RSC + Streaming was realizing that performance improvements come from "correct architectural boundaries" rather than "faster code." Optimizing queries is important, but structuring things so that slow queries cannot block fast layouts is what actually moved the TTFB numbers.

Here's the order I applied these changes in, and I was able to see visible improvements at each step.

  1. The first thing to do is find the slowest data fetch in an existing page and wrap it with <Suspense fallback={<Skeleton />}>. This alone prevents that section from blocking the rest of the rendering. It's worth checking the DevTools Network tab to confirm that chunks are arriving separately.

  2. If there are serial fetches inside a component, you can revisit the dependencies. Asking "does this fetch truly need the result of the previous fetch?" often reveals more cases than expected where fetches can be combined with Promise.all.

  3. To compare metrics before and after changes, it's recommended to add RUM with pnpm add @vercel/speed-insights. Even if things feel fast locally, real user data can tell a different story — and the moment you see TTFB drop from 350ms to 60ms on a graph, you'll be convinced this is the right direction.
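When comparing before and after, a percentile is usually more honest than an average, and Core Web Vitals reporting itself uses the 75th percentile. Here is a minimal sketch using the nearest-rank method. The `percentile` helper is hypothetical and the sample arrays are made-up illustrative values, not measurements; in a real setup the samples would come from your RUM tool or from `responseStart` on the browser's navigation timing entry.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("No samples");
  if (p <= 0 || p > 100) throw new Error("p must be in the range (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[rank - 1] as number;
}

// Made-up TTFB samples in milliseconds, for illustration only.
const ttfbBefore = [320, 350, 410, 290, 380, 360, 340, 400];
const ttfbAfter = [55, 60, 72, 48, 65, 58, 61, 90];

console.log(percentile(ttfbBefore, 75)); // 380
console.log(percentile(ttfbAfter, 75));  // 65
```

Tracking p75 rather than the mean keeps a handful of very fast cached responses from hiding the experience of slower users.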

