Groq Language Processing Unit

Name: Groq Language Processing Unit
Brand: Groq

Inference-only ASIC that trades HBM for on-die SRAM to deliver deterministic latency for LLM serving.


Groq's first-generation LPU is a 14nm inference accelerator with 230 MB of on-chip SRAM and ~80 TB/s on-die memory bandwidth. The PCIe Gen4 GroqCard delivers up to 750 INT8 TOPS and 188 FP16 TFLOPS, scaling out via 11 RealScale chip-to-chip links.
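As a rough illustration of what 230 MB of SRAM per chip means for scale-out, the sketch below estimates a lower bound on how many chips are needed to hold a model's weights entirely on-die. It assumes decimal megabytes and counts weights only, ignoring activations, KV cache, and any compiler or runtime overhead. The function name, the `usable_fraction` parameter, and the 7-billion-parameter model are hypothetical, chosen only for the arithmetic.

```python
import math

# 230 MB of on-chip SRAM per LPU, from the description above (decimal MB assumed).
SRAM_PER_CHIP_BYTES = 230 * 10**6


def chips_for_weights(num_params: int, bytes_per_param: float,
                      usable_fraction: float = 1.0) -> int:
    """Lower bound on chips needed to keep the weights alone in SRAM.

    usable_fraction is a hypothetical knob for the share of SRAM that is
    actually available for weights; 1.0 is the optimistic best case.
    """
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    weight_bytes = num_params * bytes_per_param
    return math.ceil(weight_bytes / (SRAM_PER_CHIP_BYTES * usable_fraction))


# Hypothetical 7-billion-parameter model at two precisions.
print(chips_for_weights(7_000_000_000, 1))  # INT8 weights -> 31
print(chips_for_weights(7_000_000_000, 2))  # FP16 weights -> 61
```

Even under these optimistic assumptions, a mid-sized model spans dozens of chips, which is why the chip-to-chip links matter as much as the per-chip numbers.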


Specs

compute: Groq LPU (14nm)
form factor: PCIe Gen4 x16

Deterministic-latency inference ASIC. Trades HBM capacity for on-die SRAM bandwidth.
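
One way to put a number on the bandwidth side of that trade is a simple roofline estimate. The sketch below divides the quoted peak throughput (up to 750 INT8 TOPS, 188 FP16 TFLOPS) by the quoted ~80 TB/s on-die bandwidth to get the arithmetic intensity at which a kernel stops being memory-bound. It treats the published peaks as exact and ignores everything else about the architecture, so read it as back-of-envelope arithmetic, not a performance model. The function names are hypothetical.

```python
# Peak figures quoted for the first-generation LPU (treated as exact here).
PEAK_INT8_OPS = 750e12    # ops/s
PEAK_FP16_FLOPS = 188e12  # FLOP/s
SRAM_BANDWIDTH = 80e12    # bytes/s (~80 TB/s on-die)


def ridge_point(peak_ops_per_s: float, bandwidth_bytes_per_s: float) -> float:
    """Arithmetic intensity (ops per byte) where compute and memory limits meet."""
    return peak_ops_per_s / bandwidth_bytes_per_s


def attainable_ops(peak_ops_per_s: float, bandwidth_bytes_per_s: float,
                   intensity_ops_per_byte: float) -> float:
    """Roofline bound: the lower of the compute roof and the memory roof."""
    return min(peak_ops_per_s, bandwidth_bytes_per_s * intensity_ops_per_byte)


print(ridge_point(PEAK_INT8_OPS, SRAM_BANDWIDTH))    # 9.375 ops/byte
print(ridge_point(PEAK_FP16_FLOPS, SRAM_BANDWIDTH))  # about 2.35 FLOP/byte
```

A ridge point of only a few operations per byte means that even low-intensity work, such as the matrix-vector products in small-batch LLM decoding, can get close to the compute roof under this simplified model.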