
NVIDIA announces world-record inference performance on the full DeepSeek-R1 model
At NVIDIA GTC 2025, NVIDIA announced world-record DeepSeek-R1 inference performance. A single NVIDIA DGX system equipped with eight NVIDIA Blackwell GPUs can achieve over 250 tokens per second per user, or a maximum throughput of over 30,000 tokens per second, on the full 671-billion-parameter DeepSeek-R1 model. (AI Cambrian)

