NVIDIA's new open-source model: up to three times the throughput, runs on a single GPU, and achieves state-of-the-art reasoning

Wallstreetcn
2025.07.29 07:10

NVIDIA has launched Llama Nemotron Super v1.5, an open-source model designed for complex reasoning and agent tasks. The model achieves state-of-the-art performance in science, mathematics, and programming, delivers up to three times the throughput of its predecessor, and runs efficiently on a single GPU. It uses Neural Architecture Search (NAS) to balance accuracy against efficiency, which lowers operating costs. The resulting architecture includes skip-attention mechanisms, in which some blocks omit the attention layer, and variable-width feedforward networks, both aimed at improving efficiency while preserving performance.
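
To make the two architectural ideas concrete, here is a minimal sketch of how skipping attention in some blocks and varying the feedforward width can cut a model's weight count. Everything in it is an assumption for illustration: the names `BlockConfig`, `block_params`, and `total_params`, the model width, and the per-block choices are hypothetical and do not reflect NVIDIA's actual configuration or search procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockConfig:
    """Hypothetical per-block settings that an architecture search might pick."""
    use_attention: bool    # False means the attention sub-layer is skipped
    ffn_multiplier: float  # feedforward hidden width as a multiple of the model width


def block_params(cfg: BlockConfig, d_model: int) -> int:
    """Rough weight count for one block (biases and normalization ignored)."""
    # Attention: query, key, value, and output projections, each d_model x d_model.
    attn = 4 * d_model * d_model if cfg.use_attention else 0
    # Feedforward: one up projection and one down projection.
    d_ffn = int(cfg.ffn_multiplier * d_model)
    ffn = 2 * d_model * d_ffn
    return attn + ffn


def total_params(blocks: list[BlockConfig], d_model: int) -> int:
    """Sum the rough weight counts over all blocks."""
    return sum(block_params(b, d_model) for b in blocks)


D_MODEL = 1024  # illustrative width, not the real model's

# A uniform baseline: every block has attention and a 4x feedforward layer.
baseline = [BlockConfig(True, 4.0)] * 8

# A made-up searched layout: some blocks skip attention, some narrow the FFN.
searched = [
    BlockConfig(True, 4.0),
    BlockConfig(True, 4.0),
    BlockConfig(False, 4.0),
    BlockConfig(True, 2.0),
    BlockConfig(False, 2.0),
    BlockConfig(True, 2.0),
    BlockConfig(False, 1.0),
    BlockConfig(True, 4.0),
]

ratio = total_params(searched, D_MODEL) / total_params(baseline, D_MODEL)
print(f"searched layout keeps {ratio:.1%} of the baseline block weights")
```

In this toy layout the savings come from two sources: blocks without attention drop their projection weights entirely, and blocks with a smaller multiplier shrink their feedforward layers. A real search would also have to measure the accuracy cost of each choice, which this sketch does not model.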