A flagship tier above Opus
Fable 5 opens a new class of model that sits above the Opus line. It is the model to reach for when Opus is the current ceiling on a problem and you need more.
Fable 5 is Anthropic's most powerful, most intelligent model, and the first in a tier that sits above Opus. It is reserved for the hardest reasoning and the longest-horizon autonomy. We route to it by difficulty, not by default, because the price reflects the ceiling.
Fable 5 is the most capable model Anthropic ships, positioned a rung above Opus. It is built for the problems where Opus is the current ceiling and you still want more headroom.
Fable 5 costs twice the input price and twice the output price of Opus 4.8, which makes it a route-by-difficulty model. The economics work when one Fable call replaces a chain of Opus retries on a problem Opus cannot close.
The context window matches Opus. What changes is what the model does inside it: deeper planning, longer coherent agentic runs, and stronger judgement on ambiguous, multi-step work.
Fable 5 supports adaptive thinking only and accepts no sampling parameters. One new wrinkle: an explicit disabled-thinking flag now returns a 400, so you omit the thinking parameter instead. Behind our abstraction this is a one-line change plus an eval pass.
Use it for long-running autonomous agents, deep research, codebase-scale migrations, and the hardest reasoning tasks. For routine, high-volume work, Opus and Sonnet remain the right tiers.
According to Anthropic's reported numbers, Fable 5 sets the family ceiling: it leads Opus 4.8 across coding, reasoning, computer use, knowledge work, and financial agentic tasks, and is the first Claude tier to lead GPT-5.5 on agentic terminal coding.
| Capability | Fable 5 | Opus 4.8 | GPT-5.5 | Gemini 3.1 Pro |
|---|---|---|---|---|
| Agentic coding (SWE-Bench Pro) | 73.5% (+4.3 pts vs Opus 4.8) | 69.2% | 58.6% | 54.2% |
| Agentic terminal coding (Terminal-Bench 2.1, Terminus-2 public harness) | 79.4% (+4.8 pts vs Opus 4.8) | 74.6% | 78.2% | 70.3% |
| Multidisciplinary reasoning (Humanity's Last Exam, no tools / with tools) | 53.1% / 61.0% (+3.3 pts vs Opus 4.8) | 49.8% / 57.9% | 41.4% / 52.2% | 44.4% / 51.4% |
| Agentic computer use (OSWorld-Verified) | 85.1% (+1.7 pts vs Opus 4.8) | 83.4% | 78.7% | 76.2% |
| Knowledge work (GDPval-AA) | 1972 (+82 vs Opus 4.8) | 1890 | 1769 | 1314 |
| Agentic financial analysis (Finance Agent v2) | 57.2% (+3.3 pts vs Opus 4.8) | 53.9% | 51.8% | 43.0% |
Numbers as reported by Anthropic at the Fable 5 launch. We re-run our own evals on customer tasks before recommending a switch, and the gain over Opus has to clear the price gap on your workload before we route to it.
What changes for the engineering team
Two comparisons matter: Fable 5 against the model it sits above (Opus 4.8), and against Sonnet, where most routine volume still belongs.
| Dimension | vs Opus 4.8 | vs latest Sonnet |
|---|---|---|
| Production agents | Higher ceiling on long-horizon autonomy and ambiguous, multi-step reasoning. The gap shows up most on the runs where Opus stalls or needs a human nudge. | Different universe of problem. Sonnet handles the high-volume, well-specified steps. Fable is for the few steps in a workflow that are genuinely hard. |
| Coding workflows | +4.3 pts on SWE-Bench Pro and +4.8 pts on Terminal-Bench 2.1 vs Opus 4.8. The first Claude tier to lead GPT-5.5 on agentic terminal coding. | Sonnet remains the workhorse for local edits and review. Reserve Fable for migrations and architectural changes that Opus cannot close in one pass. |
| Cost and latency | Twice the per-token price of Opus 4.8 ($10 / $50 per million). No fast mode at launch. This is the most expensive tier in the family by design. | An order of magnitude more expensive than Sonnet per token. The only sound way to run it is to route by difficulty inside the agent, not to set it as the default. |
| Migration risk | Behind a vendor-neutral abstraction, adopting Fable is a routing change plus an eval pass. The one code-level change is dropping any explicit disabled-thinking flag. | Not a replacement for either Opus or Sonnet. We run all three behind the same abstraction and let the agent pick the tier at runtime. |
Inside a Kensink build, reaching for Fable is a routing decision the agent makes when difficulty warrants the spend, not a vendor commitment frozen at design time.
As with Opus 4.7 and 4.8, sampling parameters and fixed thinking budgets are gone. New on Fable: an explicit disabled-thinking flag returns a 400. Omit the thinking parameter entirely to run without it.
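The change described above can be sketched as a small request filter. This is a minimal illustration, assuming requests are built as plain dictionaries, that sampling parameters are named `temperature`, `top_p`, and `top_k`, and that the disabled-thinking flag has the shape `{"type": "disabled"}`. Those field names and the helper `prepare_fable_request` are assumptions for illustration, not details confirmed by the text.

```python
from typing import Any

# Assumed names for sampling parameters; adjust to whatever your client sends.
SAMPLING_KEYS = ("temperature", "top_p", "top_k")


def prepare_fable_request(request: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `request` without sampling parameters or a disabled-thinking flag."""
    # Build a new dict so the caller's request is left untouched.
    cleaned = {k: v for k, v in request.items() if k not in SAMPLING_KEYS}

    thinking = cleaned.get("thinking")
    # Assumed shape of the explicit disabled flag: {"type": "disabled"}.
    if isinstance(thinking, dict) and thinking.get("type") == "disabled":
        # Omit the parameter entirely instead of sending an explicit "disabled".
        del cleaned["thinking"]
    return cleaned
```

Any other `thinking` value passes through unchanged, so the filter should be safe to apply to every outgoing request behind the abstraction. It does not rewrite fixed thinking budgets; how to migrate those depends on your client.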
Fable caches prefixes from around 2,048 tokens, half the Opus floor of 4,096. Shorter shared preambles start paying off as cache reads sooner, which takes some sting out of the premium input price.
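As a rough illustration of the threshold difference, the check below compares a shared preamble against both floors. The constants come from the approximate figures quoted above; the function name and the idea of checking the floor client-side are assumptions for illustration.

```python
# Approximate minimum cacheable prefix lengths quoted in the text, in tokens.
FABLE_MIN_CACHEABLE_TOKENS = 2_048
OPUS_MIN_CACHEABLE_TOKENS = 4_096


def prefix_is_cacheable(prefix_tokens: int, min_tokens: int) -> bool:
    """Return True when a shared prefix is long enough to be cached."""
    return prefix_tokens >= min_tokens


# A 3,000-token system preamble clears the Fable floor but not the Opus floor.
preamble_tokens = 3_000
cacheable_on_fable = prefix_is_cacheable(preamble_tokens, FABLE_MIN_CACHEABLE_TOKENS)
cacheable_on_opus = prefix_is_cacheable(preamble_tokens, OPUS_MIN_CACHEABLE_TOKENS)
```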
The effort parameter spans low through max, the same lever as Opus 4.6 and later. On the hardest tasks, higher effort plus a generous output budget is where Fable separates from Opus.
The full million-token window carries no long-context premium. The premium is on the per-token rate, not on using the whole window.
As the family ceiling, Fable 5 ships behind Anthropic's strongest safeguards, including real-time cybersecurity protections. Requests on prohibited or high-risk topics are more likely to be refused. The full assessment is in the Fable 5 System Card.
Fable inherits the honesty gains Anthropic reported for Opus 4.8: less likely to let flaws in its own output pass unremarked, more likely to flag uncertainty. On long agentic runs, that is the property that matters most.
A stronger model does not relax the need for task-specific evals on your data, your prompts, and your guardrails. We run the customer eval suite on every model change, and a more powerful tier raises the bar for what those evals must catch.
At twice the per-token price of Opus, Fable only makes sense for the steps where it changes the outcome. Inside our builds, the agent picks the tier at runtime by task difficulty, so spend tracks the hard parts of the workflow and nothing else.
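A minimal sketch of runtime tier selection follows. It assumes each workflow step carries a difficulty score between 0 and 1 produced by your own classifier or heuristics. The `Step` type, the thresholds, and the tier labels are hypothetical placeholders, not real model identifiers or the routing logic used in any particular build.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    difficulty: float  # hypothetical score in [0, 1]; higher means harder


def pick_tier(step: Step, hard: float = 0.8, moderate: float = 0.4) -> str:
    """Map a step's difficulty to a tier label; thresholds are illustrative only."""
    if step.difficulty >= hard:
        return "fable"   # premium tier, reserved for the hardest steps
    if step.difficulty >= moderate:
        return "opus"
    return "sonnet"      # default for routine, well-specified work
```

Because the default branch is the cheapest tier, spend only moves up when a step's score crosses a threshold, which is the behaviour the paragraph above describes.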
A benchmark lead over Opus is not a reason to switch on its own. The gain has to be worth double the token cost on your tasks. We measure that on the customer eval suite before any routing rule sends real traffic to Fable.
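One way to make "worth double the token cost" concrete is to compare cost per solved task rather than cost per call. The sketch below assumes both models use a similar number of tokens per attempt and that a failed attempt is simply retried, so expected cost per solved task is the per-attempt cost divided by the pass rate. The function names and that retry model are assumptions for illustration; a real eval suite would measure token usage directly.

```python
def cost_per_solved_task(cost_per_attempt: float, pass_rate: float) -> float:
    """Expected cost to get one success when failures are retried independently."""
    if not 0.0 < pass_rate <= 1.0:
        raise ValueError("pass_rate must be in (0, 1]")
    return cost_per_attempt / pass_rate


def fable_clears_price_gap(
    opus_pass_rate: float,
    fable_pass_rate: float,
    price_multiplier: float = 2.0,  # Fable at twice the per-token price of Opus
) -> bool:
    """Return True when Fable is cheaper per solved task than Opus."""
    opus_cost = cost_per_solved_task(1.0, opus_pass_rate)  # Opus attempt = 1 unit
    fable_cost = cost_per_solved_task(price_multiplier, fable_pass_rate)
    return fable_cost < opus_cost
```

Under these assumptions, Fable pays for itself only on tasks where its measured pass rate is more than double that of Opus, which tends to be the case on the hard tail and rarely on routine work.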
Long-horizon autonomous agents, deep research, codebase-scale migrations, and the reasoning problems that Opus cannot close. That is the Fable shortlist. Everything else stays on Opus or Sonnet.
Behind our vendor-neutral abstraction, adding Fable as a tier is a config and routing change plus an eval pass. The only code-level edit is dropping any explicit disabled-thinking flag, which now returns a 400.
Every model we integrate runs through the same operating system. Three pillars, sixteen layers, one Compound Growth Loop. The methodology that keeps AI work from rotting after the first ship.
Direct API integration with the model. No LangChain, no orchestration vendor, no agent framework built on quicksand. Typed contracts, the same way we wire up Postgres.
An eval suite built from your real tasks gates every prompt and model change. Quality is measured before it ships, not vibed in a demo.
Governance, audit, and oversight wired in from day one. Who called what, with which prompt version, at what cost. Your auditors get answers, not screenshots.
A model in production without observability is roulette. We instrument every integration so engineering and finance can see the same numbers, and so a regression at 3am surfaces before a customer opens a ticket.
Tokens in, tokens out, dollars spent. Sliced by feature, tenant, and route. Budgets enforced where it matters.
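A minimal in-memory sketch of this kind of accounting is shown below. The `CostLedger` class, its method names, and the per-tenant budget model are hypothetical. Prices are passed in per million tokens by the caller rather than hard-coded, and a production version would persist the ledger instead of keeping it in memory.

```python
from collections import defaultdict


class CostLedger:
    """Track spend by (feature, tenant, route) and check per-tenant budgets."""

    def __init__(self, budgets: dict[str, float]):
        self.budgets = budgets            # dollars allowed per tenant
        self.spend = defaultdict(float)   # dollars keyed by (feature, tenant, route)

    def record(self, feature: str, tenant: str, route: str,
               input_tokens: int, output_tokens: int,
               input_price_per_mtok: float, output_price_per_mtok: float) -> float:
        """Add one call to the ledger and return its dollar cost."""
        cost = (input_tokens * input_price_per_mtok
                + output_tokens * output_price_per_mtok) / 1_000_000
        self.spend[(feature, tenant, route)] += cost
        return cost

    def tenant_spend(self, tenant: str) -> float:
        """Total dollars spent by one tenant across all features and routes."""
        return sum(v for (_, t, _), v in self.spend.items() if t == tenant)

    def within_budget(self, tenant: str) -> bool:
        """Tenants without a configured budget are treated as unlimited."""
        return self.tenant_spend(tenant) <= self.budgets.get(tenant, float("inf"))
```

Calling `within_budget` before dispatching a request to the premium tier is one place where enforcement could sit.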
Real distributions, not averages. We know which routes are slow, and why.
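To illustrate why percentiles matter more than the mean, here is a small sketch using Python's standard `statistics.quantiles`. The function name and the choice of p50, p95, and p99 are assumptions for illustration; the sample data is synthetic.

```python
import statistics


def latency_percentiles(samples_ms: list[float]) -> dict[str, float]:
    """Summarise a latency sample by percentiles; needs at least two samples."""
    # n=100 yields 99 cut points; cuts[i] is the (i + 1)th percentile.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


# Synthetic example: 95 fast calls and 5 very slow ones.
samples = [100.0] * 95 + [5000.0] * 5
summary = latency_percentiles(samples)
mean_ms = statistics.mean(samples)
```

In this synthetic sample the median stays at the fast-path latency while p99 sits at the slow tail, a split that a single average hides.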
The same eval suite that gates a release runs continuously in production. A regression on real traffic surfaces fast.
PII scrubbed at the proxy, shipped to your SIEM. Retention controls match your compliance window.
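As a very small illustration of scrubbing at the proxy, the sketch below redacts email addresses from a log line before it is forwarded. The pattern, the `[EMAIL]` placeholder, and the function name are assumptions. Real PII scrubbing covers many more categories (names, phone numbers, account identifiers) and usually relies on a dedicated library or service rather than a single regular expression.

```python
import re

# Deliberately simple email pattern; it is illustrative, not exhaustive.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scrub_emails(log_line: str) -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_PATTERN.sub("[EMAIL]", log_line)
```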
Dashboards your team owns, not ours. At handoff you get the queries, the alerts, and the runbook. We are not in the path to read your metrics.