Claude Code Ultrathink Practical Guide — Extended Thinking, /effort, Adaptive Thinking: How It Works and Usage Criteria

If you pasted ultrathink into the prompt and the response became too slow, or if you heard that /effort was newly introduced and are confused about how it differs from Ultrathink—this article may help resolve that confusion. It is true that the way AI inference depth is controlled in Claude Code has constantly changed, and that there have been few resources that accurately summarize its operating principles.

In this article, we will examine how ultrathink works internally, the actual differences made by the think, megathink, and ultrathink keywords, and how to distinguish their use from the /effort commands and Adaptive Thinking currently recommended as of 2026. After reading this article, you will be able to establish criteria to determine for yourself what level of reasoning is required for a given task.

If you are new to the Claude Code CLI, we recommend checking the Claude Code section of the official Anthropic documentation first. All examples below are written based on the Claude Code CLI terminal environment.

Key Concepts

Extended Thinking — Claude's "Thinking Inside" Method

Claude Code internally utilizes an Extended Thinking mechanism. This method allows the model to undergo step-by-step internal reasoning, much like writing down the solution process on paper, before generating the final response. This reasoning process consumes separate thinking tokens, and a higher allocation enables deeper and more sophisticated analysis.

What is Extended Thinking? It is a step-by-step reasoning process performed internally by the model before outputting the final answer. It is not exposed to the user by default, but you can view its contents directly by using the --verbose flag.

There is an important distinction here. The **ultrathink keyword is exclusive to the Claude Code CLI, but the Extended Thinking feature itself is also available in the Anthropic API. In the API, the same effect can be achieved by enabling the betas: ["interleaved-thinking-2025-05-14"] header or using the effort parameter. It is the keyword that is CLI-exclusive, not the inference mechanism itself.

Additionally, readers with an AI/ML background should keep this distinction in mind. Extended Thinking does not change the model's weights. It is a method where the same model operates with the same parameters, but the token budget is increased to perform more inference-time compute.

Thinking Level Hierarchy — Token Budget Comparison by Keyword

Keywords	Accident Budget (Token)	Suitable Task
`think`	~4,000 tokens	Simple refactoring, general debugging
`think hard` / `megathink`	~10,000 Tokens	API Design, Performance Optimization
`think harder` / `ultrathink`	~31,999 tokens	System architecture redesign, complex bug tracking

Note: Grouping think harder and ultrathink into the same budget is closer to community-observed standards than official documentation. Since actual behavior may vary by version, it is safer to understand these as "keywords that lead to deeper reasoning" rather than exact budget figures.

The larger the accident budget, the more cases the model examines and synthesizes a broader context to construct an answer. However, since response generation time also increases, it is important to select a level that matches the complexity of the task.

What is the cost of the thinking token? The thinking token is not exposed in the final response but is included in the API cost. However, since the unit price of the thinking token is set lower than that of the output token, the actual cost increase is small compared to the simple number of tokens. We recommend checking the exact unit price on the Anthropic Official Pricing Page.

2025–2026 Evolution — Why Ultrathink Disappeared and Then Returned

The history surrounding Ultrathink is short but interesting.

Early 2025: The community discovers the tangible effects of the keywords think, megathink, and ultrathink. As it spreads rapidly through GeekNews·X, the figure of a "thought budget of 31,999 tokens" draws attention.
January 2026: Anthropic enables thinking mode by default and officially deprecates ultrathink.
February–March 2026: Reports of slowed response times and degraded instruction execution quality due to heavy inference occurring even during simple tasks. Hundreds of bug reports submitted, including GitHub Issue #19098.
March 4, 2026 (v2.1.68): Anthropic recovers ultrathink keywords
As of April 2026: Adaptive Thinking adopted in Opus 4.6 and Sonnet 4.6; use of /effort command recommended.

What is Adaptive Thinking? It is a method in which the model independently determines the complexity of a query and automatically adjusts the depth of inference. Without any separate configuration, it thinks lightly for simple questions and deeply for complex requests. It has been adopted as the default mode of operation in Opus 4.6 and Sonnet 4.6.

Practical Application

Example 1: Complex Architecture Decision

Ultrathink is most effective for decisions that are difficult to reverse and have significant ripple effects, such as the transition from a monolith to microservices. An effective approach is to have Claude read the relevant files first and then request only the analysis.

# @파일명 문법으로 관련 파일을 컨텍스트에 추가합니다
@src/services/UserService.ts @src/services/OrderService.ts
 
ultrathink. 이 코드베이스를 분석하고 마이크로서비스 마이그레이션 계획을 제안해줘. 코드는 아직 작성하지 마.

Points	Description
`@파일명` Syntax	How to add related files to the context. Deep reasoning is meaningful only when there is sufficient context
`ultrathink` Location	Place before the prompt to ensure immediate recognition
"Do not write code"	Focus solely on planning to separate analysis and execution costs

Example 2: Tracking an Unreproducible Bug

It is suitable for bugs that are difficult to catch with simple log analysis, and complex error situations involving multiple systems.

# 관련 로그, 스택 트레이스, 환경 정보를 함께 제공할수록 효과적입니다
@logs/error.log @src/memory/MemoryManager.ts
 
ultrathink. 이 재현 불가능한 메모리 누수의 근본 원인을 찾아줘.

Points	Description
Provides Context	Analysis Quality Improves as Logs, Error Stacks, and Related Code Are Attached
Specify the goal	Specify the desired result specifically, such as "find the root cause"

Example 3: Legacy Code Technical Debt Analysis

# 분석 대상 파일만 명시적으로 첨부해 범위를 제어합니다
@src/legacy/AuthModule.ts @src/legacy/SessionManager.ts
 
ultrathink. 이 레거시 코드의 기술 부채를 분석하고 우선순위별 개선 로드맵을 제시해줘.

Points	Description
Priority Request	You can obtain actionable results by requesting a "priority roadmap" instead of a simple list
Scope Limitation	Control the analysis scope by explicitly attaching only legacy modules

Example 4: Set the entire session using the `/effort` command (currently the recommended method)

As of April 2026, Anthropic recommends using the /effort command over the ultrathink keyword. Once configured, it applies to all requests in the corresponding session.

# 세션 전체를 max effort로 설정
/effort max
 
# 이후 요청들은 별도 키워드 없이도 최대 추론이 적용됩니다
이 시스템의 병목 지점을 분석하고 최적화 방안을 제안해줘.

Running the same question with low and max respectively produces a noticeable difference in the response structure.

Settings	Response Attributes
`/effort low`	Concise and essentials, fast response
`/effort max`	Includes trade-off analysis, edge case review, and step-by-step justification

bash

# 수준별 선택 가이드
/effort low    # 단순 작업, 빠른 응답 우선
/effort medium # 일반적인 개발 작업
/effort high   # 복잡한 설계, 최적화
/effort max    # 최고 수준 추론 (구 ultrathink에 해당)

Method	Scope of Application	Recommended Situations
`ultrathink` Keyword	1 relevant request	When maximum incident is required for only a single specific request during a session
`/effort max`	Entire Session	When complex tasks occur in succession
Adaptive Thinking (Basic)	Automatic Model Decisions	Most Common Tasks

Example 5: Combined with Plan Mode

In Claude Code, Plan Mode can be entered via Shift+Tab. Combining Plan Mode with ultrathink allows you to create a plan with maximum reasoning power while keeping execution costs low in normal mode.

# Shift+Tab으로 Plan Mode 진입 후
ultrathink. 이 기능을 구현하기 위한 단계별 계획을 세워줘.
 
# 계획을 검토한 뒤 Shift+Tab으로 Plan Mode 해제
# 이후 실행은 일반 모드에서 진행하면 됩니다

Pros and Cons Analysis

Advantages

Item	Content
Improved Inference Quality	Deriving More Accurate and Comprehensive Results Than Simple Responses Through Step-by-Step In-Depth Analysis
Complex Problem-Solving Ability	Prominent Effectiveness in Tasks Requiring Multi-faceted Review, Such as Architecture Design and Distributed System Bugs
Explicit control available	Developers can directly choose which requests to apply Deep Thinking to
Separation of Planning and Execution	Realization of High-Quality Planning and Low Execution Costs When Combined with Plan Mode

Disadvantages and Precautions

Item	Content	Response Plan
Reduced response speed	Up to 10x response time difference between low and max	Use `think` or basic Adaptive Thinking for simple tasks
Increased Cost	API costs rise due to thinking token consumption (however, the unit price of thinking tokens is lower than that of output tokens)	Selectively applied only to high-complexity tasks
Verbose results when overused	Slow and verbose results when using Ultrathink for simple tasks	Refer to the "Error cost of $5 or more or time saved of 1 hour or more" criterion
CLI Keyword Exclusive Constraints	`ultrathink` Keyword is CLI exclusive (Extended Thinking functionality itself is available in API)	When using API, replace with the `effort` parameter or `betas` header

The Most Common Mistakes in Practice

Attaching Ultrathink alone without context — Deep thinking is meaningful only with sufficient context. Attaching relevant files using @파일명 and providing background information significantly enhances its effectiveness.
Applying Ultrathink to every request — For tasks that require little judgment, such as changing variable names or correcting simple typos, it can actually result in slow and verbose outcomes. It is recommended to use it only for tasks with high complexity and impact.
Relying solely on the 'ultrathink' keyword even after 2026 — Currently, Opus 4.6 and Sonnet 4.6 use Adaptive Thinking by default, and /effort max is a more systematic approach when complex tasks continue over a session basis. Ultrathink is best suited when maximum thinking is required only for specific single requests during a session.

In Conclusion

ultrathink is an explicit tool that "gives AI time to think more deeply," and it is most effective when used selectively for high-complexity tasks.

As of 2026, while Adaptive Thinking and the /effort command are becoming central, the keyword 'ultrathink' remains alive as a useful tool for applying maximum reasoning to a single specific request. What matters is not the name of the tool, but the sense of judging how much deep thinking is required for a given task.

The usage times can be summarized at a glance as follows:

Situation	Recommended Method
Most general development tasks	Adaptive Thinking (Default, no separate configuration required)
When complex tasks occur consecutively throughout the entire session	`/effort max`
When maximum inference is needed for only a single specific request during a session	`ultrathink` Keyword
When you want to use Extended Thinking in the API	`effort` parameter or `betas` header

3 Steps to Start Right Now:

Try typing /effort high in the Claude Code CLI. The inference depth for the entire session increases, and you can directly experience the change in response quality for subsequent requests.
Select one problem that you currently find most difficult to solve and try it with ultrathink. You can get better results by first attaching the relevant file with @파일명 and specifying the desired outcome specifically.
Try running an ultrathink request in claude --verbose mode. You can directly view the model's internal thinking process, which helps you understand how the AI actually reasons.

Next Post: How to Safely Perform Large-Scale Refactoring Using Claude Code's Plan Mode and /effort Combination

Reference Materials

Claude Code Ultrathink Practical Guide — Extended Thinking, /effort, Adaptive Thinking: How It Works and Usage Criteria | DEV BAK - 기술블로그

Claude

Claude Code Ultrathink Practical Guide — Extended Thinking, /effort, Adaptive Thinking: How It Works and Usage Criteria

Key Concepts

Extended Thinking — Claude's "Thinking Inside" Method

Thinking Level Hierarchy — Token Budget Comparison by Keyword

Keywords	Accident Budget (Token)	Suitable Task
`think`	~4,000 tokens	Simple refactoring, general debugging
`think hard` / `megathink`	~10,000 Tokens	API Design, Performance Optimization
`think harder` / `ultrathink`	~31,999 tokens	System architecture redesign, complex bug tracking

2025–2026 Evolution — Why Ultrathink Disappeared and Then Returned

The history surrounding Ultrathink is short but interesting.

Early 2025: The community discovers the tangible effects of the keywords think, megathink, and ultrathink. As it spreads rapidly through GeekNews·X, the figure of a "thought budget of 31,999 tokens" draws attention.
January 2026: Anthropic enables thinking mode by default and officially deprecates ultrathink.
February–March 2026: Reports of slowed response times and degraded instruction execution quality due to heavy inference occurring even during simple tasks. Hundreds of bug reports submitted, including GitHub Issue #19098.
March 4, 2026 (v2.1.68): Anthropic recovers ultrathink keywords
As of April 2026: Adaptive Thinking adopted in Opus 4.6 and Sonnet 4.6; use of /effort command recommended.

Practical Application

Example 1: Complex Architecture Decision

# @파일명 문법으로 관련 파일을 컨텍스트에 추가합니다
@src/services/UserService.ts @src/services/OrderService.ts
 
ultrathink. 이 코드베이스를 분석하고 마이크로서비스 마이그레이션 계획을 제안해줘. 코드는 아직 작성하지 마.

Points	Description
`@파일명` Syntax	How to add related files to the context. Deep reasoning is meaningful only when there is sufficient context
`ultrathink` Location	Place before the prompt to ensure immediate recognition
"Do not write code"	Focus solely on planning to separate analysis and execution costs

Example 2: Tracking an Unreproducible Bug

It is suitable for bugs that are difficult to catch with simple log analysis, and complex error situations involving multiple systems.

# 관련 로그, 스택 트레이스, 환경 정보를 함께 제공할수록 효과적입니다
@logs/error.log @src/memory/MemoryManager.ts
 
ultrathink. 이 재현 불가능한 메모리 누수의 근본 원인을 찾아줘.

Points	Description
Provides Context	Analysis Quality Improves as Logs, Error Stacks, and Related Code Are Attached
Specify the goal	Specify the desired result specifically, such as "find the root cause"

Example 3: Legacy Code Technical Debt Analysis

# 분석 대상 파일만 명시적으로 첨부해 범위를 제어합니다
@src/legacy/AuthModule.ts @src/legacy/SessionManager.ts
 
ultrathink. 이 레거시 코드의 기술 부채를 분석하고 우선순위별 개선 로드맵을 제시해줘.

Points	Description
Priority Request	You can obtain actionable results by requesting a "priority roadmap" instead of a simple list
Scope Limitation	Control the analysis scope by explicitly attaching only legacy modules

Example 4: Set the entire session using the `/effort` command (currently the recommended method)

As of April 2026, Anthropic recommends using the /effort command over the ultrathink keyword. Once configured, it applies to all requests in the corresponding session.

# 세션 전체를 max effort로 설정
/effort max
 
# 이후 요청들은 별도 키워드 없이도 최대 추론이 적용됩니다
이 시스템의 병목 지점을 분석하고 최적화 방안을 제안해줘.

Running the same question with low and max respectively produces a noticeable difference in the response structure.

Settings	Response Attributes
`/effort low`	Concise and essentials, fast response
`/effort max`	Includes trade-off analysis, edge case review, and step-by-step justification

bash

# 수준별 선택 가이드
/effort low    # 단순 작업, 빠른 응답 우선
/effort medium # 일반적인 개발 작업
/effort high   # 복잡한 설계, 최적화
/effort max    # 최고 수준 추론 (구 ultrathink에 해당)

Method	Scope of Application	Recommended Situations
`ultrathink` Keyword	1 relevant request	When maximum incident is required for only a single specific request during a session
`/effort max`	Entire Session	When complex tasks occur in succession
Adaptive Thinking (Basic)	Automatic Model Decisions	Most Common Tasks

Example 5: Combined with Plan Mode

# Shift+Tab으로 Plan Mode 진입 후
ultrathink. 이 기능을 구현하기 위한 단계별 계획을 세워줘.
 
# 계획을 검토한 뒤 Shift+Tab으로 Plan Mode 해제
# 이후 실행은 일반 모드에서 진행하면 됩니다

Pros and Cons Analysis

Advantages

Item	Content
Improved Inference Quality	Deriving More Accurate and Comprehensive Results Than Simple Responses Through Step-by-Step In-Depth Analysis
Complex Problem-Solving Ability	Prominent Effectiveness in Tasks Requiring Multi-faceted Review, Such as Architecture Design and Distributed System Bugs
Explicit control available	Developers can directly choose which requests to apply Deep Thinking to
Separation of Planning and Execution	Realization of High-Quality Planning and Low Execution Costs When Combined with Plan Mode

Disadvantages and Precautions

Item	Content	Response Plan
Reduced response speed	Up to 10x response time difference between low and max	Use `think` or basic Adaptive Thinking for simple tasks
Increased Cost	API costs rise due to thinking token consumption (however, the unit price of thinking tokens is lower than that of output tokens)	Selectively applied only to high-complexity tasks
Verbose results when overused	Slow and verbose results when using Ultrathink for simple tasks	Refer to the "Error cost of $5 or more or time saved of 1 hour or more" criterion
CLI Keyword Exclusive Constraints	`ultrathink` Keyword is CLI exclusive (Extended Thinking functionality itself is available in API)	When using API, replace with the `effort` parameter or `betas` header

The Most Common Mistakes in Practice

Attaching Ultrathink alone without context — Deep thinking is meaningful only with sufficient context. Attaching relevant files using @파일명 and providing background information significantly enhances its effectiveness.
Applying Ultrathink to every request — For tasks that require little judgment, such as changing variable names or correcting simple typos, it can actually result in slow and verbose outcomes. It is recommended to use it only for tasks with high complexity and impact.
Relying solely on the 'ultrathink' keyword even after 2026 — Currently, Opus 4.6 and Sonnet 4.6 use Adaptive Thinking by default, and /effort max is a more systematic approach when complex tasks continue over a session basis. Ultrathink is best suited when maximum thinking is required only for specific single requests during a session.

In Conclusion

ultrathink is an explicit tool that "gives AI time to think more deeply," and it is most effective when used selectively for high-complexity tasks.

The usage times can be summarized at a glance as follows:

Situation	Recommended Method
Most general development tasks	Adaptive Thinking (Default, no separate configuration required)
When complex tasks occur consecutively throughout the entire session	`/effort max`
When maximum inference is needed for only a single specific request during a session	`ultrathink` Keyword
When you want to use Extended Thinking in the API	`effort` parameter or `betas` header

3 Steps to Start Right Now:

Try typing /effort high in the Claude Code CLI. The inference depth for the entire session increases, and you can directly experience the change in response quality for subsequent requests.
Select one problem that you currently find most difficult to solve and try it with ultrathink. You can get better results by first attaching the relevant file with @파일명 and specifying the desired outcome specifically.
Try running an ultrathink request in claude --verbose mode. You can directly view the model's internal thinking process, which helps you understand how the AI actually reasons.

Next Post: How to Safely Perform Large-Scale Refactoring Using Claude Code's Plan Mode and /effort Combination

Key Concepts

Extended Thinking — Claude's "Thinking Inside" Method

Thinking Level Hierarchy — Token Budget Comparison by Keyword

2025–2026 Evolution — Why Ultrathink Disappeared and Then Returned

Practical Application

Example 1: Complex Architecture Decision

Example 2: Tracking an Unreproducible Bug

Example 3: Legacy Code Technical Debt Analysis

Example 4: Set the entire session using the /effort command (currently the recommended method)

Example 5: Combined with Plan Mode

Pros and Cons Analysis

Advantages

Disadvantages and Precautions

The Most Common Mistakes in Practice

In Conclusion

Reference Materials

Key Concepts

Extended Thinking — Claude's "Thinking Inside" Method

Thinking Level Hierarchy — Token Budget Comparison by Keyword

2025–2026 Evolution — Why Ultrathink Disappeared and Then Returned

Practical Application

Example 1: Complex Architecture Decision

Example 2: Tracking an Unreproducible Bug

Example 3: Legacy Code Technical Debt Analysis

Example 4: Set the entire session using the /effort command (currently the recommended method)

Example 5: Combined with Plan Mode

Pros and Cons Analysis

Advantages

Disadvantages and Precautions

The Most Common Mistakes in Practice

In Conclusion

Reference Materials

Recommended Posts

Claude Code: Safely Refactor Large-Scale Using Plan Mode + /effort max

CLAUDE.md Writing Guide: How to Communicate Project Rules to AI Agents

Claude Code MCP Setup Complete Guide — PostgreSQL, File System, and GitHub Integration

Mastering Claude Code Ultraplan: How to Delegate a Plan to a Remote Server and Reclaim the Terminal with `/ultraplan`

Turning Internal Legacy APIs into an AI-Understood MCP Server — A Complete Guide to Node.js + Claude Code Connection

Building an Autonomous DevOps Pipeline with Claude Code MCP

Example 4: Set the entire session using the `/effort` command (currently the recommended method)

Example 4: Set the entire session using the `/effort` command (currently the recommended method)