NVIDIA Groq 3 LPU Unveiled at GTC 2026 — GPU+LPU Division and Samsung 4nm Foundry
▲ NVIDIA GTC 2026: GPU+LPU division of labor

The Groq 3 LPU (Language Processing Unit) is a dedicated AI inference chip, specialized hardware designed solely for the "thinking" stage where AI models generate responses. NVIDIA unveiled it at GTC 2026, formally splitting the AI chip world in two: GPUs train, and LPUs infer. This strategic shift is closely tied to the rise of agentic AI, meaning systems that autonomously think, act, and iterate in real time.

Why Does Agentic AI Need a Dedicated Inference Chip?

Agentic AI refers to systems that don't just respond to a single prompt; they reason through multi-step tasks, call external tools, and evaluate results in continuous loops. This loop-driven workflow demands extremely fast token generation, which is the core job of inference. GPUs are powerful at parallel computation for training, but generating tokens one at a time is not their strongest suit. In December 2025, NVIDIA acquired inference startup Groq for $20 billion...