
Groq · accelerator

Verified 2026-05-01

Groq Language Processing Unit

Inference-only ASIC that trades HBM for on-die SRAM, giving deterministic latency for LLM serving.

Groq's first-generation LPU is a 14nm inference accelerator with 230 MB of on-chip SRAM and ~80 TB/s on-die memory bandwidth. The PCIe Gen4 GroqCard delivers up to 750 INT8 TOPS and 188 FP16 TFLOPS, scaling out via 11 RealScale chip-to-chip links.
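As a rough illustration of what these figures imply, the sketch below runs some back-of-envelope arithmetic on the numbers quoted above. It assumes decimal units (1 MB = 10^6 bytes), treats the quoted throughput and bandwidth as sustained peaks, and uses a hypothetical 7-billion-parameter model that is not taken from this page. The helper names are illustrative only, and the chip count covers the weights alone, ignoring activations, KV cache, and any other on-chip state.

```python
import math

# Figures quoted in the text above (first-generation LPU / GroqCard).
SRAM_BYTES = 230e6            # 230 MB on-chip SRAM (decimal MB assumed)
SRAM_BW_BYTES_PER_S = 80e12   # ~80 TB/s on-die memory bandwidth
INT8_OPS_PER_S = 750e12       # up to 750 INT8 TOPS
FP16_FLOPS_PER_S = 188e12     # 188 FP16 TFLOPS


def sram_sweep_seconds() -> float:
    """Time to read the entire SRAM once at the quoted bandwidth."""
    return SRAM_BYTES / SRAM_BW_BYTES_PER_S


def ops_per_byte(ops_per_s: float) -> float:
    """Peak operations available per byte read from SRAM."""
    return ops_per_s / SRAM_BW_BYTES_PER_S


def chips_for_weights(n_params: float, bytes_per_param: float) -> int:
    """Lower bound on chips needed to hold the weights alone in SRAM."""
    return math.ceil(n_params * bytes_per_param / SRAM_BYTES)


if __name__ == "__main__":
    print(f"full SRAM sweep: {sram_sweep_seconds() * 1e6:.3f} us")
    print(f"INT8 ops per SRAM byte: {ops_per_byte(INT8_OPS_PER_S):.3f}")
    print(f"FP16 FLOPs per SRAM byte: {ops_per_byte(FP16_FLOPS_PER_S):.3f}")
    # Hypothetical 7B-parameter model, one byte per parameter (INT8).
    print(f"chips for 7B INT8 weights: {chips_for_weights(7e9, 1)}")
```

Under these assumptions the whole SRAM can be swept in about 2.9 microseconds, while the hypothetical 7B model needs at least 31 chips for its INT8 weights. That is the trade the page describes: very high bandwidth per chip, with capacity obtained by scaling out over the chip-to-chip links.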

Specs

compute: Groq LPU (14nm)
form factor: PCIe Gen4 x16

Deterministic-latency inference ASIC. Trades HBM capacity for on-die SRAM bandwidth.