
Nvidia focuses on AI agents: the open-source Nemotron 3 Super model has 120 billion parameters and a fivefold increase in throughput

Nemotron 3 Super activates only 12 billion of its 120 billion parameters during inference and natively supports a context window of 1 million tokens. The performance leap comes from three architectural innovations: a hybrid Mamba-Transformer backbone, latent mixture of experts (latent MoE), and multi-token prediction (MTP). The model runs on the Blackwell platform at NVFP4 precision, achieving inference speeds up to four times those of FP8 on the Hopper platform, with no loss in accuracy. Perplexity is the first partner to gain access to the model for executing agent tasks.

