NVIDIA Releases Open Models, Datasets, and Tools Across AI, Robotics, and Autonomous Driving
Briefly

"NVIDIA has released a set of open models, datasets, and development tools covering language, agentic systems, robotics, autonomous driving, and biomedical research. The update expands several existing NVIDIA model families and makes accompanying training data and reference implementations available through GitHub, Hugging Face, and NVIDIA's developer platforms. In the agentic AI domain, NVIDIA extended the Nemotron model family with new components for speech recognition, retrieval-augmented generation, and safety. Nemotron Speech includes automatic speech recognition models optimized for low-latency, real-time use cases."
"For robotics and physical AI, NVIDIA introduced new Cosmos world foundation models, which support perception, reasoning, and synthetic data generation in real-world environments. Cosmos Reason 2 is a multimodal reasoning model designed to enhance scene understanding for agents operating in physical environments. Cosmos Transfer 2.5 and Cosmos Predict 2.5 focus on generating synthetic video data across varied environments and conditions, supporting simulation and data augmentation workflows."
"In the agentic AI domain, NVIDIA extended the Nemotron model family with new components for speech recognition, retrieval-augmented generation, and safety. Nemotron RAG introduces embedding and reranking vision-language models intended for multimodal document search and retrieval pipelines. Nemotron Safety adds updated models for content filtering and detection of sensitive or personally identifiable information. NVIDIA also released datasets and training code used for selected Nemotron models, including embedding models evaluated on public benchmarks."
NVIDIA released open models, datasets, and development tools spanning language, agentic systems, robotics, autonomous driving, and biomedical research. The update expands several NVIDIA model families and provides training data and reference code via GitHub, Hugging Face, and NVIDIA's developer platforms. In agentic AI, the Nemotron family gains components for speech recognition, retrieval-augmented generation, and safety, including ASR models optimized for low-latency, real-time use and vision-language embedding and reranking models for multimodal retrieval. For robotics, new Cosmos world foundation models support perception, multimodal reasoning, and synthetic video generation; the Cosmos-based Isaac GR00T enables vision-language-action control for humanoid robots. A new Alpamayo family targets reasoning-based autonomous driving, covering perception, planning, and explainability.
Read at InfoQ