Brief · 28 May 2026 · MadCoolStuff

Anthropic shipped Claude Opus 4.8 today, and the number that matters to anyone pricing an inference budget isn't the top-line benchmark. It's the price that didn't move. Opus 4.8 holds at $5 / $25 per million input/output tokens, the same as 4.7, while posting 88.6% on SWE-bench Verified (up from 87.6%) and 93.6% on GPQA Diamond. A frontier coding model that delivers more capability at an unchanged price tightens the make-vs-buy line for anyone weighing a local 70B-class rig against paying per token for an API: the per-token price is flat, but once you count capability per dollar, the breakeven shifts toward the API.
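To make the make-vs-buy arithmetic concrete, here is a minimal sketch of a breakeven calculation. Only the $5 / $25 prices come from the brief; the rig cost, amortization period, power bill, and 20% output share are hypothetical placeholders to swap for your own figures. It also treats the rig as a fixed monthly cost and ignores any quality gap between the local model and the API, so read the result as a rough volume threshold, not a verdict.

```python
# Prices quoted in the brief (USD per million tokens).
INPUT_PRICE_PER_M = 5.0
OUTPUT_PRICE_PER_M = 25.0


def api_cost_usd(input_tokens: float, output_tokens: float) -> float:
    """API spend for a given token volume at the quoted prices."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


def blended_price_per_m(output_share: float) -> float:
    """Average price per million tokens when output_share of traffic is output."""
    return (1 - output_share) * INPUT_PRICE_PER_M + output_share * OUTPUT_PRICE_PER_M


def breakeven_tokens_per_month(rig_monthly_cost_usd: float, output_share: float) -> float:
    """Monthly token volume at which API spend equals the rig's fixed monthly cost."""
    return rig_monthly_cost_usd / blended_price_per_m(output_share) * 1_000_000


# Hypothetical rig: $6,000 of hardware amortized over 36 months, plus $60/month power.
rig_monthly = 6000 / 36 + 60

# Hypothetical traffic mix: 20% of tokens are output.
breakeven = breakeven_tokens_per_month(rig_monthly, output_share=0.2)
print(f"Breakeven: {breakeven / 1e6:.1f}M tokens/month")
```

Under these placeholder numbers the blended price is $9 per million tokens and the breakeven lands at roughly 25 million tokens a month. Below that volume the API is the cheaper option; above it the rig starts to pay for itself, provided the local model is good enough for the workload.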

The real movement is agentic. Opus 4.8 is now the strongest computer-use and browser-agent model Anthropic has tested, scoring 84% on Online-Mind2Web, a clear step over both Opus 4.7 and GPT-5.5. It ships alongside an optional 2.5× fast mode and parallel-subagent workflows in Claude Code. If your stack does tool-calling or browser automation, the orchestration ceiling went up today.

What to watch

The eye-catching agentic figures bundle the harness with the model. The headline 88.5% on BrowseComp is measured with a multi-agent orchestrator; the single-model gain is the more honest +5.0 points. Independent breakdowns put the base-model jump in perspective, so treat the multi-agent numbers as a ceiling you have to engineer toward, not a default you get for free. The price hold is the durable story; the orchestrator records are results you build, not a model you buy.