CLI agent · 5 models ranked · Anthropic only

Best Claude Models for Claude Code

Anthropic's terminal coding agent, running on Anthropic's own models

Claude Code is Anthropic's agentic coder, and it runs only on Anthropic's own models, so "which model" really means picking across the Claude lineup: Opus, Sonnet, and Haiku. The interesting part is token usage. Opus 4.8 carries the same $5/$25 sticker price as Opus 4.7, but finishes the same work in roughly 35% fewer output tokens and 15% fewer turns, so the effective cost per task is meaningfully lower on the newer model even though the per-token rate is identical. This page ranks every Claude model Claude Code can drive, by capability and by real token efficiency.

Best pick

Claude Opus 4.8 · Cloud API

The default to beat: top agentic tool use and long-horizon coding, and the most token-efficient Opus. Same $5/$25 as 4.7 but ~35% fewer output tokens per task.

What Claude Code needs

Claude Code chains many tool calls over a long, growing context, so it rewards models with strong agentic tool use and ample context. Because every turn spends output tokens, token efficiency directly drives your bill.

Anthropic models, ranked

Claude Opus 4.8 · Cloud API · Anthropic · Flagship · $5 / $25 per MTok · 200K

The default to beat: top agentic tool use and long-horizon coding, and the most token-efficient Opus. Same $5/$25 as 4.7 but ~35% fewer output tokens per task.

↘ −35% output tokens vs 4.7

fast mode: $10 / $50 per MTok

Claude Opus 4.7 · Cloud API · Anthropic · Prior gen · $5 / $25 per MTok · 200K

The prior flagship and the efficiency baseline. Still excellent; 4.8 matches or beats its quality while spending fewer tokens.

↘ baseline · prior flagship

Claude Opus 4.6 · Cloud API · Anthropic · Prior gen · $5 / $25 per MTok · 200K

Two generations back at the same price. Fine if pinned for reproducibility, but 4.8 is the better default.

Claude Sonnet 4.6 · Cloud API · Anthropic · Balanced · $3 / $15 per MTok · 200K

The fast, cheaper daily driver at $3/$15. Most sessions run great here; save Opus for the hardest multi-step tasks.

Claude Haiku 4.5 · Cloud API · Anthropic · Budget · $1 / $5 per MTok · 200K

Cheapest at $1/$5; great for simple edits and quick turns, under-powered for long autonomous loops.
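To make the tier gap concrete, here is a minimal sketch that prices one session at each tier's listed rate. The prices come from the list above; the session size (2M input tokens, 150K output tokens) is a made-up figure for illustration, and the sketch assumes every model would use the same number of tokens, which real sessions will not.

```python
# Per-million-token prices in USD as listed on this page: (input, output).
PRICES_PER_MTOK = {
    "Claude Opus 4.8": (5, 25),
    "Claude Sonnet 4.6": (3, 15),
    "Claude Haiku 4.5": (1, 5),
}


def session_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a session at the model's listed standard rates."""
    in_price, out_price = PRICES_PER_MTOK[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical session size, chosen only for illustration.
INPUT_TOKENS = 2_000_000
OUTPUT_TOKENS = 150_000

for model in PRICES_PER_MTOK:
    cost = session_cost(model, INPUT_TOKENS, OUTPUT_TOKENS)
    print(f"{model}: ${cost:.2f}")
```

With these illustrative token counts the session comes to $13.75 on Opus, $8.25 on Sonnet, and $2.75 on Haiku. In practice a weaker model may need more turns to finish the same job, so the gap can narrow on hard tasks.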

Token efficiency: output tokens per task

Same per-token price, fewer tokens used, so the newer model is cheaper per task. Shorter bars are better; only models with a vendor-published figure appear.

Chart: relative output tokens per task, indexed so that Claude Opus 4.7 = 100 (baseline, prior flagship). Claude Opus 4.8: −35% output tokens vs 4.7. Sourced figures only.

How to use it

Claude Code authenticates with your Anthropic account or API key and picks a model automatically (Opus on the hardest tasks, Sonnet for fast daily work). Override the choice with the model selector or the ANTHROPIC_MODEL env var. Enable fast mode on Opus 4.8 when you want 2.5× speed at the $10/$50 fast-mode rate, which is higher than the standard $5/$25 but cheaper than the previous fast mode.
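As a rough sketch of the env-var override, the snippet below builds an environment with ANTHROPIC_MODEL set and shows how a wrapper script might launch the CLI with it. The helper names are hypothetical, the model ID is a placeholder you would replace with a real identifier from Anthropic's documentation, and the sketch assumes the Claude Code executable is on your PATH as `claude`.

```python
import os
import subprocess


def claude_code_env(model_id: str) -> dict[str, str]:
    """Return a copy of the current environment with ANTHROPIC_MODEL set.

    os.environ itself is left untouched, so the override only applies to
    processes started with the returned mapping.
    """
    env = dict(os.environ)
    env["ANTHROPIC_MODEL"] = model_id
    return env


def launch_claude_code(model_id: str) -> int:
    """Start Claude Code with the model override and return its exit code.

    Assumes the CLI is installed and available on PATH as `claude`.
    """
    result = subprocess.run(["claude"], env=claude_code_env(model_id), check=False)
    return result.returncode


# Example (not run here): replace the placeholder with a real model ID.
# launch_claude_code("<model-id>")
```

Passing a copied environment keeps the override scoped to one launch, which is handy when different projects pin different models.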

Claude Code official site ↗

Frequently Asked Questions

Which Claude model is best for Claude Code?

Claude Opus 4.8. It leads on agentic tool use and long-horizon coding, and it is the most token-efficient Opus: it has the same $5/$25 per-million-token pricing as Opus 4.7 but uses roughly 35% fewer output tokens per task. Drop to Sonnet 4.6 for fast everyday work and Haiku 4.5 for cheap, simple turns.

Opus 4.8 and 4.7 cost the same, so why is 4.8 cheaper to run?

Because cost = price per token × tokens used. Both are $5 in / $25 out per million tokens, but Opus 4.8 completes the same tasks in about 35% fewer output tokens and 15% fewer turns (Artificial Analysis, on GDPval-style work), so the same job spends fewer billed tokens. Fast mode on 4.8 is also 3× cheaper than the previous fast mode.
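The arithmetic can be sketched in a few lines. The prices and the 35% reduction are the figures quoted above; the token counts for the task are hypothetical, and input tokens are held constant for simplicity even though fewer turns would likely trim them as well.

```python
# USD per million tokens; identical for Opus 4.7 and Opus 4.8.
INPUT_PRICE = 5
OUTPUT_PRICE = 25


def task_cost(input_tokens: int, output_tokens: int) -> float:
    """cost = price per token x tokens used, summed over input and output."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000


# Hypothetical task size on Opus 4.7, chosen only for illustration.
input_tokens = 1_000_000
output_47 = 200_000

# Opus 4.8 uses roughly 35% fewer output tokens for the same work.
output_48 = output_47 * (100 - 35) // 100

cost_47 = task_cost(input_tokens, output_47)
cost_48 = task_cost(input_tokens, output_48)
saving = (cost_47 - cost_48) / cost_47

print(f"Opus 4.7: ${cost_47:.2f}")
print(f"Opus 4.8: ${cost_48:.2f}")
print(f"Saving:   {saving:.1%}")
```

In this sketch the task drops from $10.00 to $8.25, a 17.5% saving rather than 35%, because only the output side of the bill shrinks. The more output-heavy the task, the closer the saving gets to the full 35%.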

Can Claude Code run on local or non-Anthropic models?

Not officially. Claude Code is built around Anthropic's models and API. Community proxies (e.g. LiteLLM translating the Messages API) can point it at a local model, but reliability drops because the agent depends on Anthropic-grade structured tool-calling. For self-hosting an open-weight Claude-Code-style agent, see Open Claude Code in our tools list.