MCS — 02 · A Will Kline Design
MadCoolStuff.
The info hub for AI hardware.
Signal · yesterday
2026-05-28
Anthropic shipped Claude Opus 4.8 — 88.6% on SWE-bench Verified (up from 87.6%) and the strongest computer-use model it has tested (84% on Online-Mind2Web, ahead of GPT-5.5) — while holding the price at $5 / $25 per million tokens, the same as 4.7.
One number → 88.6 %
Specifier · advisor
Budget plus goal in. Three rigs out — every part from our catalog.
The model never quotes a number. Lean, recommended, headroom — streamed in parallel from the same hand-curated spec sheet that builds these pages.
Specify rigs
Arena
Build an AI rig. Fight the canonical boss. Share the replay.
Drag-and-drop Soul · Heart · Skeleton · Brain. Every fight is a deterministic, shareable URL.
Enter arena
Reviews
Hands-on with real workloads. Benchmarks that mean something. Verdicts you can cut a PO against.
Browse reviews
Guides
Job-to-be-done buying advice, tiered by budget. "Local LLM under $4k" rather than "best GPU 2026."
Browse guides
Changelog
Hardware patch notes. When GPUs launch or models drop, the arena updates — we publish what changed.
Read the changelog
Latest published
rss ↗
Operating room
how this site publishes ↗
Read-only views into the five-stage pipeline. Catch ingests, Cluster groups, Drift watches for silent revisions; every artifact lands as a public PR.
- feed → seven sources, raw
- clusters → what rhymes
- drift → silent revisions
- signal → editorial briefs
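The page does not say how Drift detects a silent revision. As a minimal sketch, assuming it compares a stored snapshot of each source page against a fresh fetch, the check could be a content-hash comparison. The function names `fingerprint` and `detect_drift` and the snapshot shape (URL mapped to page text) are hypothetical, not taken from the site's pipeline.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Hash whitespace-normalized text so reflowed markup alone is not flagged."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def detect_drift(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """Return URLs present in both snapshots whose content fingerprint changed.

    Hypothetical snapshot shape: {url: page_text}. URLs that appear only in
    `current` are new pages, not revisions, so they are not reported.
    """
    return sorted(
        url
        for url, text in current.items()
        if url in previous and fingerprint(previous[url]) != fingerprint(text)
    )


if __name__ == "__main__":
    before = {"vendor/spec": "TDP 350 W", "vendor/faq": "Ships   in Q3"}
    after = {"vendor/spec": "TDP 400 W", "vendor/faq": "Ships in Q3", "vendor/new": "hello"}
    print(detect_drift(before, after))
```

Hashing normalized text keeps the stored state small, at the cost of not showing what changed; a real pipeline would probably keep the old text as well so the resulting PR can carry a diff.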
Last shipped
full log ↗
- ccd0692 fix(e2e): metrics catch-row count 8→10 + harden cmd-k open against listener race (#109)
- d65c967 feat(catch): frontier-lab news + AI YouTube sources + closed-model relevance axis (#108)
- d08f6a7 feat(signal): Opus 4.8 launch brief + synthesize leads on frontier model releases (#107)
- fcceab8 feat(rigs): robotics enrichment — Phoenix, Spot, B2 + retag Jetson family (#106)
- 5eee6c4 feat(rigs): robotics enrichment — Atlas, Optimus, Figure 02, Unitree H1 (#105)
The AI hardware space is loud and young. Reviews are either breathless launch coverage or dry whitepapers. Benchmarks are scattered across GitHub issues, Reddit threads, and vendor marketing. There's no trusted middle ground for operators who need to pick a GPU tier on Monday, a cluster topology by Friday, and an edge device by next quarter.
MadCoolStuff sits in that middle. Not breathless, not academic — the voice of someone who has to actually buy the thing.
