TanStack Query v5 prefetchQuery + Next.js App Router: Eliminating Initial Loading with Render-as-You-Fetch and Streaming SSR
Ever since switching to the Next.js App Router, I kept running into the same problem. I was thrilled that Server Components made data fetching so much cleaner — but the moment I used useQuery in a component that needed client-side interactivity, a loading spinner would inevitably appear. Anyone who's spent time with the App Router has probably hit it at least once: an e-commerce product page where the user greeting shows up in 50ms, but the recommended product list is stuck waiting on an 800ms API and janks the layout when it finally pops in.
This post covers how to fundamentally eliminate that problem by combining TanStack Query v5's prefetchQuery with Next.js App Router's streaming SSR. By building a pipeline that pre-populates the cache on the server and transfers it to the client, users encounter a fully rendered UI immediately — without ever going through a loading state. I'll walk through this at a production-ready level, drawing from my experience applying it directly to an e-commerce page with hundreds of thousands of MAU and achieving meaningful LCP improvements as a result.
The target audience is frontend developers already using the Next.js App Router. I'll assume you're familiar with the distinction between Server Components and Client Components, and with the basic behavior of React Suspense. Examples 1, 2, and 3 increase in complexity in that order — if you want to pick the pattern you'll adopt right now, take a look at the "Which example should I choose?" guide below first.
Core Concepts
Fetch-on-Render vs Render-as-You-Fetch — What's the difference?
The traditional approach, Fetch-on-Render, starts the data request after the component mounts. The browser parses the JS, React renders the component, and only after useEffect or useQuery fires does the network request actually go out. In the meantime, the user sees a spinner.
Render-as-You-Fetch starts fetching data at the moment of route entry — before the component renders. Because the UI can be progressively streamed as data becomes ready, users encounter meaningful content much sooner instead of a blank screen. I didn't fully grasp why this was different at first, but it clicked when I saw a page with a 50ms API and an 800ms API side by side: the first appeared almost instantly, while the second showed a skeleton before the data came in.
Request Waterfall: The phenomenon where sequential delays stack up because component A renders and requests its data, and only once that data arrives does component B render and fire its own request. This causes serious performance degradation in deeply nested component structures.
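To make the contrast concrete, here is a minimal fetch-on-render sketch. The fetchUser and fetchOrders helpers, endpoints, and types are hypothetical, purely for illustration: each useQuery fires only after its component has mounted, so the orders request cannot even start until the user request has resolved and OrderList renders.

```tsx
'use client'
import { useQuery } from '@tanstack/react-query'

// Hypothetical types and fetchers, only here to illustrate the waterfall
type User = { id: string; name: string }
type Order = { id: string; title: string }
async function fetchUser(): Promise<User> {
  return (await fetch('/api/user')).json()
}
async function fetchOrders(userId: string): Promise<Order[]> {
  return (await fetch(`/api/orders?userId=${userId}`)).json()
}

export function ProfilePage() {
  // Request 1 fires only after ProfilePage has mounted on the client
  const { data: user, isPending } = useQuery({ queryKey: ['user'], queryFn: fetchUser })
  if (isPending || !user) return <p>Loading profile…</p>

  // OrderList cannot render (and request 2 cannot start) until request 1 has resolved
  return <OrderList userId={user.id} />
}

function OrderList({ userId }: { userId: string }) {
  // Request 2 starts only now; the sequential delays stack into a waterfall
  const { data: orders } = useQuery({
    queryKey: ['orders', userId],
    queryFn: () => fetchOrders(userId),
  })
  return <ul>{orders?.map(o => <li key={o.id}>{o.title}</li>)}</ul>
}
```

Render-as-You-Fetch flips this around: both requests are kicked off at route entry on the server, which is exactly what the prefetchQuery pipeline described next does.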
Three Core Building Blocks
This pattern is made up of three parts. When you first encounter them, it's easy to get confused about what dehydrate and HydrationBoundary each do — but once you understand their roles, the overall flow becomes much clearer.
| Component | Role | Where it runs |
|---|---|---|
| prefetchQuery | Pre-populates the QueryClient cache before components render | Server (Server Component) |
| dehydrate | Serializes the server cache into a form that can be transferred to the client | Server |
| HydrationBoundary | Receives the serialized cache and restores it in the client QueryClient | Client |
When these three are connected, the full pipeline looks like this:
Server Component (server)
└─ prefetchQuery() ──┐
▼
QueryClient cache populated
│
dehydrate() serialization
│
Streaming HTML response (chunked transfer)
│
Client HydrationBoundary
│
Cache restored (no re-fetch)
│
useQuery() returns data immediately

Dehydrating pending queries — the game changer in v5.40.0
Honestly, before this feature arrived, it was hard to truly take advantage of streaming. Previously, you had to await all prefetchQuery calls before dehydrate could transfer them to the client. In practice, that meant streaming was blocked until the slowest query finished.
Starting with TanStack Query v5.40.0, you can also dehydrate queries that are still in a pending state (opted into via shouldDehydrateQuery, as shown in Example 2's QueryClient config). If you kick off a query on the server without await, the pending promise is dehydrated and its result streams to the client as soon as it resolves. This is especially impactful on pages where slow and fast APIs are mixed together.
// Before v5.40.0 — streaming waits until all prefetches complete
await queryClient.prefetchQuery({ queryKey: ['posts'], queryFn: fetchPosts })
await queryClient.prefetchQuery({ queryKey: ['user'], queryFn: fetchUser })
// HTML transfer only begins after the slower one finishes
// After v5.40.0 — kick off with void and stream immediately
void queryClient.prefetchQuery({ queryKey: ['posts'], queryFn: fetchPosts })
void queryClient.prefetchQuery({ queryKey: ['user'], queryFn: fetchUser })
// HTML transfer begins, each query streams as a chunk when it resolves

The void operator: an explicit declaration that you're intentionally ignoring the Promise. Calling prefetchQuery without await can trigger a "floating promise" warning (from the typescript-eslint no-floating-promises lint rule); prefixing the call with void signals in the code that you're deliberately choosing not to wait.
Practical Application
Which example should I choose?
| Situation | Recommended pattern |
|---|---|
| You want explicit control over prefetching for a specific route | Example 1 (basic pattern) |
| A single page has a mix of fast and slow APIs | Example 2 (parallel streaming) |
| You want to minimize boilerplate per route and are comfortable with experimental packages | Example 3 (experimental pattern) |
Example 1: Dashboard Page — Basic prefetchQuery + HydrationBoundary Pattern
This is the most commonly used pattern and has proven stability. It can be applied to most data-fetching pages — dashboards, product detail pages, user profiles, and more. Let me walk through the full flow, starting from the API layer.
// lib/api.ts — shared fetch wrapper for server and client
// Server-side fetch needs an absolute URL; NEXT_PUBLIC_API_URL here is an assumption, adjust it to your environment
const BASE_URL = process.env.NEXT_PUBLIC_API_URL ?? 'http://localhost:3000'

export interface DashboardStats {
  revenue: number
  orderCount: number
  activeUsers: number
}

export async function fetchDashboardStats(): Promise<DashboardStats> {
  const res = await fetch(`${BASE_URL}/api/dashboard/stats`, { cache: 'no-store' })
  if (!res.ok) throw new Error('Failed to fetch dashboard stats')
  return res.json()
}

// app/dashboard/page.tsx (Server Component)
import { dehydrate, HydrationBoundary, QueryClient } from '@tanstack/react-query'
import DashboardStats from './DashboardStats'
import { fetchDashboardStats } from '@/lib/api'
export default async function DashboardPage() {
// Create a new QueryClient per request — using a singleton causes data leakage between server requests
const queryClient = new QueryClient({
defaultOptions: {
queries: {
staleTime: 60 * 1000, // 1 minute — without this, a refetch fires immediately on client mount
},
},
})
await queryClient.prefetchQuery({
queryKey: ['dashboard-stats'],
queryFn: fetchDashboardStats,
})
return (
<HydrationBoundary state={dehydrate(queryClient)}>
<DashboardStats />
</HydrationBoundary>
)
}

// app/dashboard/DashboardStats.tsx (Client Component)
'use client'
import { useQuery } from '@tanstack/react-query'
import { fetchDashboardStats, type DashboardStats } from '@/lib/api'
export default function DashboardStats() {
// Uses the cache populated on the server — renders immediately on mount without a network request
const { data } = useQuery<DashboardStats>({
queryKey: ['dashboard-stats'],
queryFn: fetchDashboardStats,
})
return (
<div>
<p>Revenue: {data?.revenue.toLocaleString()}</p>
<p>Orders: {data?.orderCount.toLocaleString()}</p>
</div>
)
}

| Code point | Explanation |
|---|---|
| new QueryClient() placement | Created fresh per request, inside the Server Component function. Never use a global singleton |
| staleTime: 60 * 1000 | Without it, a refetch fires immediately after client mount, negating the benefits of prefetching |
| Matching queryKey | The keys in prefetchQuery (server) and useQuery (client) must be exactly identical for the cache to connect |
| HydrationBoundary wrapping | The boundary that injects the server cache into the client tree. Components inside benefit from it |
Example 2: Parallel Streaming — Don't Let a Slow API Hold You Back
This is a situation you'll encounter constantly in real-world work. If the user greeting takes 50ms and the recommended product list takes 800ms, the old approach would make you wait the full 800ms for everything. The void prefetchQuery pattern solves this.
// app/shop/page.tsx (Server Component)
import { dehydrate, defaultShouldDehydrateQuery, HydrationBoundary, QueryClient } from '@tanstack/react-query'
import { Suspense } from 'react'
import ProductList from './ProductList'
import UserGreeting from './UserGreeting'
import { ProductsSkeleton, UserSkeleton } from './skeletons'
import { fetchProducts, fetchUser } from '@/lib/api'
export default async function ShopPage() {
const queryClient = new QueryClient({
  defaultOptions: {
    queries: { staleTime: 60 * 1000 },
    dehydrate: {
      // Opt in to dehydrating pending queries (added in v5.40.0) so the void pattern below can stream them
      shouldDehydrateQuery: (query) =>
        defaultShouldDehydrateQuery(query) || query.state.status === 'pending',
    },
  },
})
// Kick off in parallel without await — both queries start fetching on the server simultaneously
void queryClient.prefetchQuery({
queryKey: ['products'],
queryFn: fetchProducts, // slow API taking 800ms
})
void queryClient.prefetchQuery({
queryKey: ['user'],
queryFn: fetchUser, // fast API taking 50ms
})
return (
<HydrationBoundary state={dehydrate(queryClient)}>
<Suspense fallback={<UserSkeleton />}>
<UserGreeting /> {/* streams immediately after 50ms */}
</Suspense>
<Suspense fallback={<ProductsSkeleton />}>
<ProductList /> {/* streams after 800ms, shows Skeleton until then */}
</Suspense>
</HydrationBoundary>
)
}

// app/shop/UserGreeting.tsx (Client Component)
'use client'
import { useSuspenseQuery } from '@tanstack/react-query'
import { fetchUser, type User } from '@/lib/api'
export default function UserGreeting() {
const { data } = useSuspenseQuery<User>({
queryKey: ['user'],
queryFn: fetchUser,
})
return <p>Hello, {data.name}!</p>
}

// app/shop/ProductList.tsx (Client Component)
'use client'
import { useSuspenseQuery } from '@tanstack/react-query'
import { fetchProducts, type Product } from '@/lib/api'
export default function ProductList() {
// useSuspenseQuery — suspends if no data, returns immediately if data is present
// data type is always inferred as non-null, so it's safe without type assertions
const { data } = useSuspenseQuery<Product[]>({
queryKey: ['products'],
queryFn: fetchProducts,
})
return (
<ul>
{data.map(product => (
<li key={product.id}>{product.name}</li>
))}
</ul>
)
}
useSuspenseQuery vs useQuery: With useQuery, you handle isLoading and isError states directly inside the component. With useSuspenseQuery, control is handed off to the nearest <Suspense> boundary when there's no data, and it returns immediately when there is. The data type is also always inferred as non-null, making the code much cleaner. However, there must be a <Suspense> boundary somewhere in the ancestor tree.
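For contrast, here is roughly what the same list looks like with plain useQuery: the pending and error branches live inside the component instead of being delegated to Suspense and error boundaries. A sketch reusing the fetchProducts and Product imports from lib/api above; the component name is made up.

```tsx
'use client'
import { useQuery } from '@tanstack/react-query'
import { fetchProducts, type Product } from '@/lib/api'

export default function ProductListWithUseQuery() {
  const { data, isPending, isError } = useQuery<Product[]>({
    queryKey: ['products'],
    queryFn: fetchProducts,
  })

  // Every state is handled inline, and data stays possibly undefined until both checks pass
  if (isPending) return <p>Loading products…</p>
  if (isError) return <p>Failed to load products.</p>

  return (
    <ul>
      {data.map(product => (
        <li key={product.id}>{product.name}</li>
      ))}
    </ul>
  )
}
```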
Example 3: Experimental Pattern — Reducing Boilerplate with ReactQueryStreamedHydration
If repeating prefetchQuery → dehydrate → HydrationBoundary for every route feels tedious, you can look into the @tanstack/react-query-next-experimental package. It makes streaming SSR work with just useSuspenseQuery, without the manual wiring.
One important note: never declare QueryClient as a module-level singleton. In a server environment, this causes data leakage between requests. You must use the factory function pattern below to create a new instance per request.
// lib/query-client.ts — per-request QueryClient factory
import { QueryClient } from '@tanstack/react-query'
export function makeQueryClient() {
return new QueryClient({
defaultOptions: {
queries: {
staleTime: 60 * 1000,
},
},
})
}

// app/layout.tsx
'use client'
import { ReactQueryStreamedHydration } from '@tanstack/react-query-next-experimental'
import { QueryClientProvider } from '@tanstack/react-query'
import { useState } from 'react'
import { makeQueryClient } from '@/lib/query-client'
export default function RootLayout({ children }: { children: React.ReactNode }) {
// useState pins the instance — prevents it from being recreated on every render
const [queryClient] = useState(() => makeQueryClient())
return (
<QueryClientProvider client={queryClient}>
<ReactQueryStreamedHydration>
{children}
</ReactQueryStreamedHydration>
</QueryClientProvider>
)
}

// In any Client Component — streaming SSR works without manual prefetch
'use client'
import { useSuspenseQuery } from '@tanstack/react-query'
import { fetchProducts, type Product } from '@/lib/api'
export function ProductList() {
const { data } = useSuspenseQuery<Product[]>({
queryKey: ['products'],
queryFn: fetchProducts,
})
return <ul>{data.map(p => <li key={p.id}>{p.name}</li>)}</ul>
}

Note that this package is experimental. Before adopting it in production, be sure to review the caveats in the drawbacks section below.
Pros and Cons
Advantages
| Item | Details |
|---|---|
| Eliminates initial loading state | The cache is pre-populated on the server, so users never see a spinner |
| Prevents Request Waterfalls | Parallel fetching at the server stage eliminates the sequential delays of nested components |
| SEO improvement | Fully rendered HTML is served to crawlers, benefiting search visibility |
| Client cache reuse | After hydration, the same useQuery continues to support automatic refetch, interval updates, and all other TanStack Query features |
| Progressive streaming | Critical UI is delivered first; slow data is separated by Suspense boundaries and revealed incrementally |
Drawbacks and Caveats
| Item | Details | Applies to | Mitigation |
|---|---|---|---|
| Increased boilerplate | QueryClient → prefetchQuery → dehydrate → HydrationBoundary repeated per route | Examples 1, 2 | Abstract with a makeQueryClient() helper and a shared wrapper utility (sketch below) |
| Hydration mismatch | If a component unsuspends before streaming completes, a server-client structural mismatch error can occur | Example 3 only | In stable-version packages, keep the await prefetchQuery pattern |
| Client navigation limitation | The no-prefetch approach solves initial loading but waterfalls reappear on CSR page transitions | Example 3 only | Use explicit prefetchQuery for important CSR transition paths |
| Serialization constraint | Data transferred via dehydrate must be purely serializable objects | Examples 1, 2, 3 | Register queryFn function references separately on the client |
| Server-side data leakage | Using QueryClient as a global singleton can mix data between different users' requests | Examples 1, 2, 3 | Always create via the makeQueryClient() factory per request |
In practice, the most painful issue wasn't boilerplate — it was forgetting staleTime. I had clearly set up a prefetch, but then checked the Network tab on the client and saw the request going out again. Because staleTime defaults to 0, the data is immediately considered stale on mount and triggers a refetch. One missing config line renders the entire prefetch pointless, so this one deserves extra attention.
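On the boilerplate point from the table above, one option is a small shared helper that bundles per-request QueryClient creation, prefetching, and dehydration. This is a sketch of my own, not an official TanStack Query API; the prefetchAll name and file path are made up, and it assumes the makeQueryClient() factory from Example 3.

```ts
// lib/prefetch.ts (hypothetical): wraps per-request QueryClient creation, prefetching, and dehydration
import { dehydrate, type DehydratedState, type FetchQueryOptions } from '@tanstack/react-query'
import { makeQueryClient } from '@/lib/query-client'

export async function prefetchAll(queries: FetchQueryOptions[]): Promise<DehydratedState> {
  // Fresh client per request, never a module-level singleton
  const queryClient = makeQueryClient()
  // Run all prefetches in parallel, then serialize the populated cache
  await Promise.all(queries.map(q => queryClient.prefetchQuery(q)))
  return dehydrate(queryClient)
}
```

A route then shrinks to const state = await prefetchAll([{ queryKey: ['dashboard-stats'], queryFn: fetchDashboardStats }]) followed by return <HydrationBoundary state={state}>…</HydrationBoundary>.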
The Most Common Real-World Mistakes
- Declaring QueryClient as a module-level singleton — In a server environment, request A's data can linger and pollute request B. Always use the makeQueryClient() factory to create a new instance per request.
- Not setting staleTime — The default is 0, so data is immediately considered stale on client mount and a refetch fires. The benefits of server prefetching are completely lost. It's recommended to set staleTime: 60 * 1000 or higher in defaultOptions.
- Using different queryKey values on server and client — Keys containing dynamic parameters like ['products', userId] must be exactly identical on both sides for the cache to connect. It's safer to extract keys as constants or create a query options factory function (see the sketch after this list).
- Using useSuspenseQuery without a <Suspense> boundary — There must be a <Suspense> boundary somewhere in the ancestor tree. Without it, a runtime error occurs, and the error message alone makes it hard to find the cause. This behavior is unintuitive at first — I read the official docs two or three times on this point before it clicked.
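One way to enforce matching keys (the third item above) is the queryOptions helper that ships with v5: define the key and queryFn once and pass the same object to prefetchQuery on the server and useSuspenseQuery on the client. The productsOptions name and the file path here are just examples.

```ts
// lib/queries.ts (example path): shared query definitions so server and client keys cannot drift apart
import { queryOptions } from '@tanstack/react-query'
import { fetchProducts } from '@/lib/api'

// Factory for a key with a dynamic segment, as in ['products', userId]
export const productsOptions = (userId: string) =>
  queryOptions({
    queryKey: ['products', userId],
    queryFn: () => fetchProducts(), // fetchProducts from the earlier examples; swap in a per-user fetcher as needed
    staleTime: 60 * 1000,
  })

// Server Component: void queryClient.prefetchQuery(productsOptions(userId))
// Client Component: const { data } = useSuspenseQuery(productsOptions(userId))
```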
Closing Thoughts
The combination of prefetchQuery + HydrationBoundary + Suspense streaming is becoming the standard data-fetching architecture for the Next.js App Router. In particular, v5.40.0's ability to dehydrate pending queries made it possible to use the void prefetchQuery pattern to progressively reveal UI in a natural way based on API response times — and the difference in user experience is noticeable.
Here are three steps you can take right now to get started.
- Pick the slowest page in your existing project and apply the basic pattern. Install v5 with pnpm add @tanstack/react-query@^5, then add prefetchQuery → dehydrate → HydrationBoundary in that order in your Server Component. Your existing useQuery code will continue to work as-is.
- Try splitting your Suspense boundaries along data dependency lines. Wrap fast and slow data in separate <Suspense> boundaries and you can experience parallel streaming with the void prefetchQuery pattern right away. You can confirm that the HTML response is being split into chunks in Chrome DevTools' Network tab, and you can measure the before-and-after improvement numerically with Lighthouse or Web Vitals (LCP, FCP); a minimal measurement sketch follows this list.
- Centralize makeQueryClient() in lib/query-client.ts to make adding future routes much lighter. Keep the per-request QueryClient creation logic in one place, and adding a new route takes just two lines. The TanStack Query official docs' Advanced Server Rendering guide has well-organized examples worth referencing.
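For the numeric before-and-after in step 2, one lightweight option (my suggestion, not part of the original setup) is the web-vitals package: a tiny client component logs LCP and FCP for real page loads so you can compare deployments.

```tsx
'use client'
// app/web-vitals-reporter.tsx (example path); assumes the web-vitals package (pnpm add web-vitals)
import { useEffect } from 'react'
import { onFCP, onLCP } from 'web-vitals'

export function WebVitalsReporter() {
  useEffect(() => {
    // Each callback reports its metric for the current page load, in milliseconds
    onFCP(metric => console.log('FCP', Math.round(metric.value)))
    onLCP(metric => console.log('LCP', Math.round(metric.value)))
  }, [])
  return null
}
```

Render it once from the root layout while measuring, or forward the same callbacks to your analytics tool; Next.js also ships a useReportWebVitals hook in next/web-vitals if you prefer not to add a dependency.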
Next post: Implementing infinite scroll streaming SSR with TanStack Query's prefetchInfiniteQuery and React's use() hook
References
- Advanced Server Rendering | TanStack Query Official Docs
- Performance & Request Waterfalls | TanStack Query Official Docs
- Next.js App Router Prefetching Example | TanStack Query
- Next.js Suspense Streaming Example | TanStack Query
- The Next.js 15 Streaming Handbook | freeCodeCamp
- Next.js App Router Streaming Official Docs | Next.js
- Building a Fully Hydrated SSR App with Next.js App Router and TanStack Query | Medium
- Combining React Server Components with react-query | Frontend Masters Blog
- TanStack Query & Next.js 15: The Ultimate Guide (2024) | Daniel Olawoyin
- Data Fetching in 2025: Streaming, Suspense, and Deferred Fetching | Medium