Claude Models 2026: Opus 4.8 vs Sonnet 4.6 vs Haiku

Q: What is the difference between Claude Opus and Claude Sonnet?

Opus is Anthropic's heavyweight tier — most capable at hard reasoning, complex coding and long agentic tasks, at $5/$25 per million tokens. Sonnet is the balanced middle tier — faster and cheaper at $3/$15, smart enough for the majority of work. Use Sonnet by default and switch to Opus when a task exceeds Sonnet's ceiling.

Last updated: June 2026 · Author: Ignacy Kwiecień, founder & editor-in-chief, DecodeTheFuture.org

As of June 2026, Anthropic’s Claude line has four current-generation names to understand. Claude Opus 4.8 is the deployable flagship, the most capable for hard reasoning and long-horizon agentic work ($5/$25 per million tokens). Claude Sonnet 4.6 offers the best speed-to-intelligence balance ($3/$15). Claude Haiku 4.5 is the fastest and cheapest ($1/$5). Claude Fable 5 is the most capable model overall ($10/$50), but it has been suspended worldwide since 12 June 2026 under a US export-control directive, so most users cannot currently run it. For most people the answer is Sonnet 4.6 for everyday work and Opus 4.8 when the task is genuinely hard; Opus 4.8 is the most capable model you can actually use today. This guide compares context windows, pricing, speed, availability and the right model for each job.


The Claude model lineup at a glance (2026)

Anthropic ships Claude in tiers, each tuned for a different point on the intelligence-versus-cost curve. These are the current-generation models and their June 2026 availability status — older versions like Opus 4.6 and Opus 4.5 are still available but are no longer the default choice.

Model	API model ID	Context	Max output	Price (in / out per 1M)	Best for
Claude Fable 5 ⚠️	`claude-fable-5`	1M	128K	$10 / $50	Most capable — but suspended (US export controls, see below)
Claude Opus 4.8	`claude-opus-4-8`	1M	128K	$5 / $25	Flagship: hard coding, agents, knowledge work
Claude Opus 4.7	`claude-opus-4-7`	1M	128K	$5 / $25	Previous-gen Opus (still active)
Claude Sonnet 4.6	`claude-sonnet-4-6`	1M	64K	$3 / $15	Everyday work; best speed/intelligence balance
Claude Haiku 4.5	`claude-haiku-4-5`	200K	64K	$1 / $5	High-volume, latency-sensitive, cheap tasks

The names follow a simple pattern: Haiku is small and fast, Sonnet is the balanced middle, Opus is the heavyweight, and Fable sits above all of them on paper as Anthropic’s most capable model. The important availability distinction is that Fable 5 is suspended, so the practical top of the line is Opus 4.8. The number (4.8, 4.6, 4.5) is the generation — higher is newer and generally smarter at the same tier.
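
The context and max-output columns above translate directly into a pre-flight check. The sketch below is illustrative only: `LIMITS` and `fits` are hypothetical names, not part of any SDK. It reads "1M" and "K" as round decimal numbers, and it assumes the requested output counts against the same context window as the input. Fable 5 is left out because it cannot currently be deployed.

```python
# Limits copied from the lineup table; "1M" and "K" are read as round
# decimal numbers, which is an assumption about the exact token counts.
LIMITS = {
    "claude-opus-4-8":   {"context": 1_000_000, "max_output": 128_000},
    "claude-opus-4-7":   {"context": 1_000_000, "max_output": 128_000},
    "claude-sonnet-4-6": {"context": 1_000_000, "max_output": 64_000},
    "claude-haiku-4-5":  {"context": 200_000,   "max_output": 64_000},
}


def fits(model: str, input_tokens: int, max_tokens: int) -> bool:
    """Return True if the request stays inside the model's listed limits."""
    limit = LIMITS[model]
    if max_tokens > limit["max_output"]:
        return False
    # Assumption: input and requested output share one context window.
    return input_tokens + max_tokens <= limit["context"]


print(fits("claude-haiku-4-5", 180_000, 16_000))   # 196,000 total tokens
print(fits("claude-haiku-4-5", 190_000, 16_000))   # 206,000 total tokens
print(fits("claude-sonnet-4-6", 190_000, 16_000))  # same request, larger window
```

A check like this is mostly useful for deciding when a long document forces a step up from Haiku 4.5 to a 1M-context model.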

Which Claude model should you use?

Skip the spec sheet if you just want the decision. The honest rule of thumb for 2026:

If you’re doing…	Use	Why
Everyday chat, writing, summarizing, most app features	Sonnet 4.6	Fast, cheap enough to scale, smart enough for ~90% of tasks
Hard coding, multi-step agents, deep analysis	Opus 4.8	Top-tier reasoning and long-horizon execution
The hardest problems where cost is no object	Opus 4.8	Fable 5 would be the ceiling, but it is currently suspended under US export controls — Opus 4.8 is the practical top
Classification, routing, high-volume batch, simple extraction	Haiku 4.5	Cheapest and fastest; don’t pay Opus prices for simple jobs
Coding inside an IDE or terminal	Opus 4.8 via Claude Code or Cursor	Opus 4.8 is the default model in agentic coding tools
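
The decision table can be reduced to a small lookup. This is a sketch only: the task labels and the `pick_model` helper are hypothetical, and the mapping simply mirrors the rows above rather than any official routing logic.

```python
# Task labels are illustrative; the model IDs come from the lineup table.
MODEL_FOR_TASK = {
    "chat": "claude-sonnet-4-6",
    "writing": "claude-sonnet-4-6",
    "summarization": "claude-sonnet-4-6",
    "hard_coding": "claude-opus-4-8",
    "agent": "claude-opus-4-8",
    "deep_analysis": "claude-opus-4-8",
    "classification": "claude-haiku-4-5",
    "routing": "claude-haiku-4-5",
    "simple_extraction": "claude-haiku-4-5",
}

DEFAULT_MODEL = "claude-sonnet-4-6"  # the article's everyday default


def pick_model(task: str) -> str:
    """Map a task label to a model ID, falling back to the default tier."""
    return MODEL_FOR_TASK.get(task, DEFAULT_MODEL)


print(pick_model("classification"))
print(pick_model("agent"))
print(pick_model("unknown_task"))
```

Falling back to Sonnet 4.6 for unknown labels follows the article's rule of thumb: start in the middle tier and move up or down only when the task calls for it.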

The single most common mistake is reaching for the biggest model by reflex. Opus 4.8 and Fable 5 are remarkable, but running classification or routine summarization on them is paying flagship prices for work Haiku does fine. The cost gap is large: at $1/$5, Haiku 4.5 costs one-fifth as much as Opus 4.8 and one-tenth as much as Fable 5, on both input and output. Match the model to the difficulty of the task, not to the prestige of the name.

Claude Opus 4.8 — the flagship

Claude Opus 4.8 is the model to reach for when the task is genuinely hard. It is Anthropic’s most capable Opus-tier model — state-of-the-art on long-horizon agentic execution, complex coding, knowledge work and memory, with a notably clearer, warmer writing voice than the previous generation. It carries a 1M-token context window at standard pricing (no long-context premium) and up to 128K output tokens. It is the default model in agentic coding tools, and the one we benchmarked in our Cursor vs Claude Code comparison. For a deeper look at what changed in this release, see our Claude Opus 4.8 explainer.

Claude Sonnet 4.6 — the everyday default

Sonnet 4.6 is the model most teams should run by default. It hits the best balance of speed, cost and intelligence in the line: $3/$15 per million tokens, a 1M context window, and 64K max output. For chat interfaces, summarization, extraction, routing and the large majority of production app features, Sonnet 4.6 is fast enough to feel responsive and smart enough that you rarely notice the ceiling. Reserve Opus for the cases where Sonnet visibly struggles — that keeps your blended cost down without hurting quality where it counts.

Claude Haiku 4.5 — fastest and cheapest

Haiku 4.5 exists for volume and speed. At $1/$5 per million tokens with a 200K context window, it is the right tool for classification, content moderation, simple extraction, routing layers and anything you run at high throughput where latency and cost dominate. It is not the model for deep reasoning — but using Opus or Fable for tasks Haiku handles is one of the most common ways teams overspend on LLM bills.

Claude Fable 5 — the most capable model, currently suspended

Fable 5 is technically Anthropic’s most capable model — built for the most demanding reasoning and longest-horizon agentic work, at $10/$50 per million tokens, with thinking always on and a 30-day data-retention requirement. But there is a critical caveat that overrides everything else: Fable 5 was launched on 9 June 2026 and suspended worldwide three days later, on 12 June 2026, under a US export-control directive that blocks access for foreign nationals. In practice that means most users cannot run it today, and you should not plan production work around it until the status changes. We cover the suspension, its ties to the March 2026 supply-chain-risk designation, and the restricted Claude Mythos 5 sibling (Project Glasswing only) in Claude Fable 5 & Mythos 5 suspended and What is Claude Mythos. The upshot: for any model you can actually deploy right now, Opus 4.8 is the real ceiling.

Pricing compared — what you actually pay

All Claude API pricing is per million tokens, split into input (what you send) and output (what the model generates). On every model in the table below, output costs five times as much as input.

Model	Input / 1M	Output / 1M	Relative cost
Haiku 4.5	$1.00	$5.00	1× (baseline)
Sonnet 4.6	$3.00	$15.00	3×
Opus 4.8 / 4.7	$5.00	$25.00	5×
Fable 5 (suspended)	$10.00	$50.00	10× on paper; not currently deployable
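
To turn the table into a per-request figure, multiply each token count by its rate and divide by one million. A minimal sketch, assuming list prices with no caching or discounts; `PRICES` and `request_cost` are hypothetical names, and the 10,000-in / 1,000-out workload is an invented example.

```python
# USD per 1M tokens (input, output), copied from the pricing table.
PRICES = {
    "claude-haiku-4-5":  (1.00, 5.00),
    "claude-sonnet-4-6": (3.00, 15.00),
    "claude-opus-4-8":   (5.00, 25.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at list price, with no caching."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# Illustrative workload: 10,000 input tokens and 1,000 output tokens.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 10_000, 1_000):.4f}")
```

For that workload the arithmetic gives about $0.015 on Haiku 4.5, $0.045 on Sonnet 4.6 and $0.075 on Opus 4.8, which matches the 1× / 3× / 5× column.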

Two levers cut the bill regardless of model. Prompt caching charges cached input at roughly one-tenth the normal rate, so any large, stable prefix (a system prompt, a long document) you reuse across requests should be cached. And tiering your traffic — Haiku for the easy 80%, Sonnet for the middle, Opus only for the hard tail — almost always beats running everything on one big model.
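
Both levers are easy to put numbers on. The sketch below rests on stated assumptions: cached input is billed at exactly 0.1× the input rate (the text says "roughly"), cache-write costs are ignored, and the 80/15/5 traffic split and the token counts are invented for illustration. The helper names are hypothetical.

```python
# USD per 1M tokens (input, output), copied from the pricing table.
PRICES = {
    "claude-haiku-4-5":  (1.00, 5.00),
    "claude-sonnet-4-6": (3.00, 15.00),
    "claude-opus-4-8":   (5.00, 25.00),
}
CACHE_READ_FACTOR = 0.1  # assumption: "roughly one-tenth" taken as exactly 0.1


def cached_request_cost(model, fresh_in, cached_in, out):
    """USD cost of one request where part of the input is read from cache."""
    in_rate, out_rate = PRICES[model]
    billed_input = fresh_in * in_rate + cached_in * in_rate * CACHE_READ_FACTOR
    return (billed_input + out * out_rate) / 1_000_000


def blended_cost(split, fresh_in, cached_in, out):
    """Average USD cost per request for a traffic split that sums to 1.0."""
    return sum(
        share * cached_request_cost(model, fresh_in, cached_in, out)
        for model, share in split.items()
    )


all_opus = {"claude-opus-4-8": 1.0}
tiered = {"claude-haiku-4-5": 0.80, "claude-sonnet-4-6": 0.15, "claude-opus-4-8": 0.05}

# Each request: 8,000-token stable prefix + 2,000 fresh tokens in, 1,000 out.
print(f"all Opus, no cache: ${blended_cost(all_opus, 10_000, 0, 1_000):.4f}")
print(f"all Opus, cached:   ${blended_cost(all_opus, 2_000, 8_000, 1_000):.4f}")
print(f"tiered, cached:     ${blended_cost(tiered, 2_000, 8_000, 1_000):.4f}")
```

Under these assumptions the per-request cost falls from about $0.075 to $0.039 with caching alone, and to about $0.0117 once traffic is tiered as well.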

How thinking and effort work across the line

One practical thing developers should know: the modern Claude models (Opus 4.6 and up, Sonnet 4.6, Fable 5) use adaptive thinking rather than a fixed “thinking budget.” You turn it on and the model decides how much to reason per request. You then tune the trade-off with an effort parameter (low / medium / high / xhigh / max) — higher effort means deeper reasoning and more tokens. On Fable 5, Opus 4.8 and 4.7 the old fixed-budget thinking and the sampling parameters (temperature, top_p) have been removed entirely; steer behaviour with prompting and effort instead.

Python

# Same call, three models — swap the ID to move up or down a tier.
import anthropic
client = anthropic.Anthropic()

resp = client.messages.create(
    model="claude-opus-4-8",            # or "claude-sonnet-4-6" / "claude-haiku-4-5"
    max_tokens=16000,
    thinking={"type": "adaptive"},      # model decides how much to reason
    output_config={"effort": "high"},   # low | medium | high | xhigh | max
    messages=[{"role": "user", "content": "Plan a migration from REST to gRPC."}],
)
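# With thinking on, the first content block may be a thinking block,
# so print only the text blocks.
print("".join(block.text for block in resp.content if block.type == "text"))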

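# Cheap, high-volume work? Drop to Haiku and lower the effort:
#   model="claude-haiku-4-5", output_config={"effort": "low"}
# Haiku 4.5 is not in the adaptive-thinking list above, so confirm it accepts
# these thinking/effort fields before relying on the swap.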
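
Because the removed parameters are rejected by the API, it can help to catch them locally before a request is sent. The following is a sketch, not part of the Anthropic SDK: `check_request`, `REMOVED_PARAMS_MODELS` and `EFFORT_LEVELS` are hypothetical names, and the model list and effort levels are taken from the paragraph above on the assumption that they are complete.

```python
# Models on which the text says budget_tokens, temperature and top_p are removed.
REMOVED_PARAMS_MODELS = {"claude-fable-5", "claude-opus-4-8", "claude-opus-4-7"}
EFFORT_LEVELS = {"low", "medium", "high", "xhigh", "max"}


def check_request(kwargs: dict) -> list[str]:
    """Return the problems found in messages.create() keyword arguments."""
    problems = []
    effort = kwargs.get("output_config", {}).get("effort")
    if effort is not None and effort not in EFFORT_LEVELS:
        problems.append(f"unknown effort level: {effort!r}")
    if kwargs.get("model") in REMOVED_PARAMS_MODELS:
        for name in ("temperature", "top_p"):
            if name in kwargs:
                problems.append(f"{name} is removed on {kwargs['model']}")
        if "budget_tokens" in kwargs.get("thinking", {}):
            problems.append(f"budget_tokens is removed on {kwargs['model']}")
    return problems


# A request written in the older style, with a fixed budget and a temperature.
request = {
    "model": "claude-opus-4-8",
    "max_tokens": 16000,
    "thinking": {"type": "enabled", "budget_tokens": 8000},
    "temperature": 0.2,
    "messages": [{"role": "user", "content": "Plan a migration from REST to gRPC."}],
}
for problem in check_request(request):
    print(problem)
```

The check only flags what the text says is removed; it makes no claim about which parameters other models accept.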

✅ The one-line takeaway

Default to Sonnet 4.6. Step up to Opus 4.8 when the task is hard, drop to Haiku 4.5 when it’s simple and high-volume, and treat Fable 5 as unavailable until the export-control suspension is lifted. Tier your traffic and cache your prompts — that matters more than which single model you pick.

FAQ

What is the best Claude model in 2026?

On paper Claude Fable 5 is the most capable, but it has been suspended worldwide since 12 June 2026 under a US export-control directive — so for any model you can actually use, Claude Opus 4.8 is the best. “Best” then depends on the task: Opus 4.8 is the best general-purpose flagship, Sonnet 4.6 is the best balance of speed and cost for everyday work, and Haiku 4.5 is best for cheap, high-volume tasks. Most users should default to Sonnet 4.6 and step up to Opus 4.8 only for hard problems.

What is the difference between Claude Opus and Claude Sonnet?

Opus is Anthropic’s heavyweight tier — the most capable at hard reasoning, complex coding and long agentic tasks, at $5/$25 per million tokens. Sonnet is the balanced middle tier — faster and cheaper at $3/$15, smart enough for the large majority of work. Use Sonnet by default and switch to Opus when a task visibly exceeds Sonnet’s ceiling.

How much do Claude models cost?

Per million tokens (input / output): Claude Haiku 4.5 is $1 / $5, Sonnet 4.6 is $3 / $15, Opus 4.8 and 4.7 are $5 / $25, and Fable 5 is $10 / $50. Output tokens always cost more than input. Prompt caching charges reused input at roughly one-tenth the standard rate.

What is Claude Fable 5?

Claude Fable 5 is technically Anthropic’s most capable model — built for the hardest reasoning and longest agentic work, $10/$50 per million tokens, 1M context, thinking always on, 30-day data retention. However, it launched on 9 June 2026 and was suspended worldwide on 12 June 2026 under a US export-control directive blocking foreign nationals, so most users cannot run it right now. For a model you can actually deploy, Opus 4.8 is the ceiling. Claude Mythos 5 is a restricted sibling available only through Project Glasswing.

Which Claude model is best for coding?

Claude Opus 4.8 is the strongest coding model you can actually use today and the default in agentic coding tools like Claude Code and Cursor. For lighter or high-volume coding tasks, Sonnet 4.6 is often enough and much cheaper. Fable 5 would be the ceiling for the absolute hardest, long-horizon engineering work, but it is suspended right now. See our Cursor vs Claude Code comparison for how the tooling differs.

What context window do Claude models have?

Claude Opus 4.8, Opus 4.7, Sonnet 4.6 and Fable 5 all have a 1 million-token context window. Claude Haiku 4.5 has a 200K-token window. Maximum output is 128K tokens for Opus and Fable, and 64K for Sonnet and Haiku (large outputs require streaming).

Do I need extended thinking or a thinking budget?

On the current models (Opus 4.6+, Sonnet 4.6, Fable 5) the fixed “thinking budget” is gone — use adaptive thinking and control depth with the effort parameter (low through max) instead. On Opus 4.7, Opus 4.8 and Fable 5 the old budget_tokens and sampling parameters are removed entirely; sending them returns an error.

Bibliography (8 sources)

Sources prioritise Anthropic’s official model and pricing documentation. Model availability and pricing change frequently; verify current figures against Anthropic’s site before relying on them commercially. Links accessed June 2026.

Anthropic — Models overview. Primary source for model IDs, context windows and capabilities.
Anthropic — Pricing. Per-million-token input/output rates for each model.
Anthropic — Introducing Claude Fable 5. Capabilities, API behaviour and availability of Fable 5 / Mythos 5.
Anthropic — Adaptive thinking & effort. How thinking and the effort parameter work on current models.
DTF — Claude Opus 4.8 Explained. Deep dive on the flagship model.
DTF — Cursor vs Claude Code 2026. Opus 4.8 in agentic coding tools.
DTF — What is Claude Mythos. The restricted Mythos line and Project Glasswing.
DTF — Claude Fable 5 & Mythos 5 suspended. Timeline and implications of the June 12 US export-control suspension.
