Groq · accelerator

Verified 1mo ago

Groq Language Processing Unit

Inference-only ASIC that trades HBM for on-die SRAM to deliver deterministic latency for LLM serving.

Groq's first-generation LPU is a 14nm inference accelerator with 230 MB of on-chip SRAM and ~80 TB/s on-die memory bandwidth. The PCIe Gen4 GroqCard delivers up to 750 INT8 TOPS and 188 FP16 TFLOPS, scaling out via 11 RealScale chip-to-chip links.
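To put the quoted peak figures in context, the short Python sketch below computes the roofline "ridge point", the arithmetic intensity at which peak compute and on-die memory bandwidth are balanced. It uses only the numbers stated above. The 80 TB/s figure is approximate and the throughput figures are "up to" peaks, so the results are rough estimates, and the function name `ridge_point` is illustrative rather than part of any Groq tooling.

```python
# Figures quoted in the text above (peak / approximate values).
SRAM_BANDWIDTH_TBPS = 80.0   # ~80 TB/s on-die memory bandwidth
INT8_TOPS = 750.0            # up to 750 INT8 TOPS
FP16_TFLOPS = 188.0          # up to 188 FP16 TFLOPS


def ridge_point(peak_tera_ops: float, bandwidth_tbps: float) -> float:
    """Return ops per byte at which compute and memory bandwidth balance."""
    return peak_tera_ops / bandwidth_tbps


int8_ridge = ridge_point(INT8_TOPS, SRAM_BANDWIDTH_TBPS)
fp16_ridge = ridge_point(FP16_TFLOPS, SRAM_BANDWIDTH_TBPS)

print(f"INT8 ridge point: {int8_ridge:.2f} ops/byte")
print(f"FP16 ridge point: {fp16_ridge:.2f} FLOP/byte")
```

A kernel whose arithmetic intensity falls below the ridge point is limited by memory bandwidth in this simple model, and one above it is limited by compute. With these figures the ridge points come out in the single digits of operations per byte.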

Specs

compute: Groq LPU (14nm)
form factor: PCIe Gen4 x16

Deterministic-latency inference ASIC. Trades HBM capacity for on-die SRAM bandwidth.
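The capacity side of that trade-off can be estimated with a few lines of Python. The sketch below counts how many chips would be needed just to hold a model's weights in on-chip SRAM, given the 230 MB per chip stated on this page. It assumes 1 MB = 10^6 bytes and that all SRAM is available for weights, which is optimistic because activations and other state also need space. The 7-billion-parameter model is a hypothetical example, and `chips_for_weights` is an illustrative helper, not a Groq API.

```python
import math

SRAM_PER_CHIP_MB = 230  # on-chip SRAM per LPU, from this page


def chips_for_weights(num_params: float, bytes_per_param: int,
                      sram_mb: float = SRAM_PER_CHIP_MB) -> int:
    """Lower bound on chips needed to keep all weights in on-chip SRAM.

    Assumes 1 MB = 1e6 bytes and that every byte of SRAM holds weights.
    """
    weight_bytes = num_params * bytes_per_param
    return math.ceil(weight_bytes / (sram_mb * 1e6))


# Hypothetical 7-billion-parameter model.
int8_chips = chips_for_weights(7e9, bytes_per_param=1)
fp16_chips = chips_for_weights(7e9, bytes_per_param=2)

print(f"INT8 weights: at least {int8_chips} chips")
print(f"FP16 weights: at least {fp16_chips} chips")
```

Under these assumptions even a mid-sized model spans dozens of chips, which is why the chip-to-chip links matter for this design.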