Nvidia releases Nemotron 3 Nano Omni: open multimodal model with 30B params, 3B active, for edge AI agents

"Nemotron 3 Nano Omni is designed to power autonomous AI agents on edge devices, utilizing a mixture-of-experts design that activates only three billion parameters per forward pass, allowing it to run efficiently on a single GPU."

"Nvidia claims nine times higher throughput than comparable open multimodal models, achieving 2.9 times faster single-stream reasoning on multimodal tasks and significantly enhancing effective system capacity for video reasoning."

"The model processes a wide range of inputs, including text, images, audio, video, documents, charts, and graphical interfaces, producing text outputs, thereby replacing the need for multiple specialized models in enterprise AI deployments."

Nvidia has launched Nemotron 3 Nano Omni, a multimodal AI model that integrates vision, audio, and language in a single architecture. It has 30 billion parameters but activates only three billion per forward pass, which lets it run efficiently on a single GPU. Nvidia claims nine times the throughput of comparable open multimodal models, and the model excels on six benchmarks covering document intelligence, video understanding, and audio comprehension. It accepts text, images, audio, video, documents, charts, and graphical interfaces as input and produces text output. It is available for commercial use under Nvidia's Open Model Agreement, positioning Nvidia as a competitor in both AI infrastructure and model development.
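To make the "30 billion parameters, three billion active" idea concrete, here is a minimal sketch of generic top-k expert routing, the usual mechanism behind mixture-of-experts models. It is an illustration only: the article does not describe Nvidia's router, so the function names (`top_k_route`, `moe_forward`), the four toy experts, and the gate scores are all hypothetical. Only the 30B and 3B figures come from the article.

```python
import math

def top_k_route(gate_scores, k):
    # Pick the k experts with the highest gate scores.
    top = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)[:k]
    # Softmax over the selected scores only, shifted by the max for numerical stability.
    peak = max(gate_scores[i] for i in top)
    exps = [math.exp(gate_scores[i] - peak) for i in top]
    total = sum(exps)
    return top, [e / total for e in exps]

def moe_forward(x, experts, gate_scores, k):
    top, weights = top_k_route(gate_scores, k)
    # Only the selected experts run; the others contribute no compute for this input.
    output = sum(w * experts[i](x) for i, w in zip(top, weights))
    return output, top

# Hypothetical toy setup: 4 scalar "experts", 2 active per input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
gate_scores = [0.1, 2.0, 0.3, 1.5]
output, used = moe_forward(1.0, experts, gate_scores, k=2)

# Figures from the article: 30B total parameters, 3B active per forward pass.
active_fraction = 3e9 / 30e9
```

In this toy run only experts 1 and 3 are evaluated, which is the same principle that lets a model hold 30 billion parameters while spending compute on roughly a tenth of them per forward pass.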

#nvidia #ai-models #multimodal-ai #edge-ai #machine-learning

Read at TNW
