NVIDIA · workstation

Verdict · buy-if

RTX PRO 6000 Blackwell: 96 GB on a single workstation card

Triples the 5090's frame buffer to 96 GB ECC on the same Blackwell silicon. Same 1,792 GB/s bandwidth, so per-token throughput tracks the 5090 — you pay the multiple for capacity, ECC, and pro drivers.

Product
NVIDIA RTX PRO 6000 Blackwell Workstation Edition
Published
2026-05-01
Score
8 / 10

Pros

  • 96 GB of GDDR7 with ECC fits a 70B-class model at Q8 with full context on one card
  • Dual-slot at 600 W still drops into a workstation chassis without exotic cooling
  • ECC plus the NVIDIA RTX Enterprise driver stack — long support cycle, signed builds

Cons

  • Memory bandwidth is 1,792 GB/s, identical to a 5090, so the extra memory buys capacity, not tokens-per-second
  • Partner pricing runs a multiple of a 5090; no published vendor MSRP to anchor against
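
The bandwidth point can be made concrete with a rough ceiling calculation. The sketch below assumes single-stream decode on a dense model is limited purely by memory bandwidth, with every weight read once per generated token; it ignores KV-cache reads, kernel overhead, and compute limits, so real throughput lands below these numbers. The model sizes and bits-per-weight values are illustrative assumptions, not measurements from this review.

```python
# Idealized decode ceiling for a memory-bandwidth-bound dense model.
# Assumption: one full pass over the weights per generated token.

BANDWIDTH_GBPS = 1792.0  # GB/s, the figure quoted for both cards


def decode_ceiling_tok_s(params_billion: float, bits_per_weight: float,
                         bandwidth_gbps: float = BANDWIDTH_GBPS) -> float:
    """Return an upper bound on tokens per second for single-stream decode."""
    weight_gb = params_billion * bits_per_weight / 8.0  # GB read per token
    return bandwidth_gbps / weight_gb


if __name__ == "__main__":
    # Hypothetical 32B model at ~4.5 bits per weight: fits on either card,
    # so both cards share the same ceiling.
    print(f"32B @ 4.5 bpw: {decode_ceiling_tok_s(32, 4.5):.1f} tok/s ceiling")
    # Hypothetical 70B model at ~8.5 bits per weight: needs the larger card.
    print(f"70B @ 8.5 bpw: {decode_ceiling_tok_s(70, 8.5):.1f} tok/s ceiling")
```

Because the ceiling depends only on bandwidth and bytes read per token, two cards with the same bandwidth share the same bound for any model that fits on both.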

What we tested

A single PRO 6000 in a Threadripper PRO chassis on PCIe Gen 5. Driver: NVIDIA RTX Enterprise. Workloads: a 70B-class dense model at Q8 with full context; a 100B+ MoE at IQ4; a multi-LoRA Flux stack for image generation; a short SDXL video pipeline with two LoRAs hot.

The headline is the one number that matters here: 96 GB. The 70B Q8 fits with KV-cache headroom; you do not pick a quant tier to get under the wire. The MoE at IQ4 fits where a 5090 forces you down to IQ3 or off-card entirely. The multi-LoRA stacks fit because LoRAs cost capacity, not bandwidth, and capacity is what this card sells.

What you'll feel

If you came from a 5090, the difference is what stops happening. No KV-cache truncation at long context. No swapping LoRAs in and out between generations. No "this batch won't fit, drop to fp8 attention." Per-token throughput on the dense 70B is in the same neighborhood as a 5090 running a smaller quant of the same model: the bandwidth is the same 1,792 GB/s, the SM count is higher, and the math works out close.

What you will not feel is a speed jump on workloads that already fit on a 5090. Same bandwidth, same memory subsystem class. If your model fits in 32 GB at the quant you want, this card runs it at roughly 5090 pace, not faster.

Setup notes

Dual-slot, 5.4 in H x 12.0 in L; it physically drops in. 600 W max means a real PSU (1000 W class minimum with headroom) and a 12V-2x6 cable rated for it. Four DisplayPort 2.1 outputs, no HDMI. PCIe Gen 5: pair it with a Gen 5 platform or the bandwidth on the slot becomes the conversation. ECC is on by default in the enterprise driver; leave it on.

Who should buy

  • Engineers running 70B-class dense models or 100B+ MoE on a single workstation, where capacity is the constraint.
  • Anyone who needs ECC for a research workflow that has to be reproducible.
  • Studios running multi-LoRA stacks where the LoRA count is the bottleneck.

Who should skip

  • You are running 32B-class or smaller and a 5090 fits your quant tier.
  • You need raw tokens-per-second on a model that already fits; buy a second 5090 instead, since the bandwidth math favors it.

Bottom line

96 GB on one card with ECC and the enterprise driver. Same Blackwell bandwidth as the consumer 5090, so the win is capacity, not speed. Partner pricing varies and runs a multiple of a 5090; the buy decision is whether your model size makes that multiple worth it. If 32 GB is the wall you keep hitting, this is the answer.
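
To check whether 32 GB is the wall for a given model, a back-of-the-envelope fit estimate is usually enough. The sketch below adds weight memory to an fp16 KV cache and compares the total against each card. The model shape (80 layers, 8 KV heads, head dimension 128), the 32,768-token context, the 8.5 bits per weight for Q8, and the 2 GB runtime reserve are all assumptions chosen for illustration, not figures from this review.

```python
# Rough VRAM fit check: weights + KV cache + a fixed runtime reserve.
# All shapes and sizes below are hypothetical placeholders.


def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Weight memory in decimal GB."""
    return params_billion * bits_per_weight / 8.0


def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context_tokens: int, bytes_per_elem: int = 2) -> float:
    """KV-cache memory in decimal GB; the factor 2 covers keys and values."""
    elems = 2 * layers * kv_heads * head_dim * context_tokens
    return elems * bytes_per_elem / 1e9


def fits(total_gb: float, card_gb: float, reserve_gb: float = 2.0) -> bool:
    """True if the model plus an assumed runtime reserve fits on the card."""
    return total_gb + reserve_gb <= card_gb


if __name__ == "__main__":
    # Assumed 70B-class shape with grouped-query attention at Q8.
    total = weights_gb(70, 8.5) + kv_cache_gb(80, 8, 128, 32_768)
    for name, card_gb in (("32 GB card", 32.0), ("96 GB card", 96.0)):
        verdict = "fits" if fits(total, card_gb) else "does not fit"
        print(f"{name}: needs ~{total:.1f} GB, {verdict}")
```

Swapping in your own model's layer count, KV-head count, and target context gives a quick read on which side of the 32 GB line a workload falls before any pricing conversation starts.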