Hands-on with the hardware.
Real workloads, real benchmarks, real verdicts.
- RTX PRO 6000 Blackwell: 96 GB on a single workstation card
NVIDIA · workstation
buy-if
Triples the 5090's frame buffer to 96 GB ECC on the same Blackwell silicon. Bandwidth is the same 1,792 GB/s, so per-token throughput tracks the 5090; the price premium buys capacity, ECC, and pro drivers.
2026-05-01 · 8/10
- RTX 5080: 16 GB is the ceiling, not the floor
NVIDIA · gpu
buy-if
Half the 5090's VRAM at half the price. 16 GB caps you at 14B Q8 or 32B Q4 — the same ceiling a 4080 hit two years ago. A halo gaming card with AI as a side benefit.
2026-05-01 · $999 · 7/10
- DGX Spark: NVIDIA in a desk-sized box
NVIDIA · workstation
buy-if
GB10 Grace Blackwell in a 1.2 kg chassis with 128 GB unified memory. Capacity wins, bandwidth loses. A development box, not a throughput box.
2026-05-01 · $4,699 · 7/10
- Mac Studio M3 Ultra: the patient context machine
Apple · workstation
buy-if
Apple's M3 Ultra Mac Studio fits 100B+ class models that a 5090 has to page or skip. First-token latency lags; total throughput wins when the model doesn't fit elsewhere. A specialist box, not a generalist.
2026-05-01 · $7,999 · 8/10
- RTX 5090 for local LLM inference: the new watermark
NVIDIA · gpu
buy
32 GB VRAM and Blackwell sm_120: enough to run 32B-class models at high quants without paging, or 70B at IQ3 with care. Worth the jump from a 4090 if you live in llama.cpp.
2026-04-24 · $1,999 · 9/10
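
The capacity and bandwidth claims in the blurbs above come down to simple arithmetic, sketched here in Python as a rough back-of-envelope check rather than a benchmark. The bits-per-weight values are approximate figures for llama.cpp-style quant formats, and the 2 GB headroom for KV cache and runtime overhead is an arbitrary assumption; the function names are hypothetical. Real GGUF files run somewhat larger because some tensors are kept at higher precision, and the throughput figure is an upper bound for a dense model at batch size 1, not a measured result.

```python
# Back-of-envelope sizing for local LLM inference.
# ASSUMPTIONS: bits-per-weight values are approximate, headroom is a guess.
BITS_PER_WEIGHT = {
    "Q8_0": 8.5,        # approx. 8.5 bits per weight
    "Q6_K": 6.5625,     # approx. 6.56 bits per weight
    "IQ3_XXS": 3.0625,  # approx. 3.06 bits per weight
}


def weight_gb(params_billion: float, quant: str) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8


def fits(params_billion: float, quant: str, vram_gb: float,
         headroom_gb: float = 2.0) -> bool:
    """True if weights plus an assumed headroom fit in the given memory."""
    return weight_gb(params_billion, quant) + headroom_gb <= vram_gb


def decode_ceiling_tok_s(bandwidth_gb_s: float, params_billion: float,
                         quant: str) -> float:
    """Upper bound on decode speed if every token reads all weights once."""
    return bandwidth_gb_s / weight_gb(params_billion, quant)


if __name__ == "__main__":
    # 32 GB card at 1,792 GB/s, matching the figures quoted in the listing.
    for params, quant in [(32, "Q6_K"), (70, "IQ3_XXS"), (32, "Q8_0")]:
        size = weight_gb(params, quant)
        ok = fits(params, quant, vram_gb=32)
        ceiling = decode_ceiling_tok_s(1792, params, quant)
        print(f"{params}B {quant}: {size:.1f} GB, fits={ok}, "
              f"ceiling={ceiling:.0f} tok/s")
```

Because the ceiling depends only on bandwidth and model size, two cards with the same bandwidth get the same bound for a model that fits on both; extra capacity changes which models fit, not how fast a given model decodes.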