Groq · accelerator
Verified 2026-05-01
Groq Language Processing Unit
Inference-only ASIC that trades HBM for on-die SRAM to deliver deterministic latency for LLM serving.
Groq's first-generation LPU is a 14nm inference accelerator with 230 MB of on-chip SRAM and ~80 TB/s on-die memory bandwidth. The PCIe Gen4 GroqCard delivers up to 750 INT8 TOPS and 188 FP16 TFLOPS, scaling out via 11 RealScale chip-to-chip links.
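To put the SRAM figures in perspective, the rough arithmetic below estimates how many chips are needed just to hold a model's weights in on-chip SRAM, and how long one full pass over a single chip's SRAM takes at the quoted bandwidth. It is a back-of-envelope sketch, not a sizing tool. It assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes), counts weights only (no activations, KV cache, or compiler overhead), and uses a hypothetical 7-billion-parameter model purely as an illustration. The function names are made up for this example.

```python
# Back-of-envelope sizing from the figures quoted above.
# Assumption: decimal units (1 MB = 10**6 bytes, 1 TB = 10**12 bytes).
SRAM_BYTES = 230 * 10**6           # 230 MB on-chip SRAM per chip
SRAM_BW_BYTES_PER_S = 80 * 10**12  # ~80 TB/s on-die memory bandwidth


def chips_to_hold(params: int, bytes_per_param: int) -> int:
    """Minimum chips whose combined SRAM holds the weights alone.

    Ignores activations, KV cache, and any compiler or runtime overhead,
    so a real deployment would likely need more.
    """
    total_bytes = params * bytes_per_param
    return -(-total_bytes // SRAM_BYTES)  # integer ceiling division


def full_sram_sweep_seconds() -> float:
    """Time to read one chip's entire SRAM once at the quoted bandwidth."""
    return SRAM_BYTES / SRAM_BW_BYTES_PER_S


if __name__ == "__main__":
    hypothetical_params = 7_000_000_000  # illustrative model size, not from the source
    print("INT8 chips:", chips_to_hold(hypothetical_params, 1))
    print("FP16 chips:", chips_to_hold(hypothetical_params, 2))
    print("SRAM sweep (microseconds):", full_sram_sweep_seconds() * 1e6)
```

Under these assumptions the capacity limit, not the bandwidth, dominates: a single chip can stream its whole SRAM in a few microseconds, but a model of that size has to be spread across dozens of chips, which is where the chip-to-chip links come in.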
Specs
- compute: Groq LPU (14nm)
- form factor: PCIe Gen4 x16