Indian AI lab Sarvam's new models are a major bet on the viability of open-source AI | TechCrunch
Briefly

"Sarvam said the new lineup includes 30-billion and 105-billion parameter models; a text-to-speech model; a speech-to-text model; and a vision model to parse documents. These mark a sharp upgrade from the company's 2-billion-parameter Sarvam 1 model that it released in October 2024. The 30-billion- and 105-billion-parameter models use a mixture-of-experts architecture, which activates only a fraction of their total parameters at a time, significantly reducing computing costs, Sarvam said."
"Sarvam said the new AI models were trained from scratch rather than fine-tuned on existing open-source systems. The 30B model was pre-trained on about 16 trillion tokens of text, while the 105B model was trained on trillions of tokens spanning multiple Indian languages, it said. The models are designed to support real-time applications, the startup said, including voice-based assistants and chat systems in Indian languages."
Sarvam unveiled a new generation of AI models: 30-billion- and 105-billion-parameter large language models, a text-to-speech model, a speech-to-text model, and a vision model for document parsing. The 30B and 105B models use a mixture-of-experts architecture that activates only a fraction of their parameters per token, lowering compute costs. The 30B model supports a 32,000-token context window aimed at real-time conversational use, while the 105B model offers a 128,000-token window for complex multi-step reasoning. The 30B was pretrained on about 16 trillion tokens, and the 105B on trillions of tokens spanning multiple Indian languages. Training drew on resources from the IndiaAI Mission, Yotta, and Nvidia. The models target real-time voice assistants and chat systems in Indian languages, and the company plans measured scaling focused on practical applications.
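The mixture-of-experts design mentioned above is what keeps inference affordable at these parameter counts: a small gating network selects a few experts per token, so most weights sit idle on any given forward pass. Below is a minimal, illustrative Python sketch of top-k MoE routing. It is not Sarvam's implementation; all sizes and names (n_experts, top_k, d_model) are hypothetical.

import numpy as np

rng = np.random.default_rng(0)

n_experts, top_k, d_model = 8, 2, 16   # hypothetical sizes

# One weight matrix per expert; only top_k experts run per token.
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
gate_w = rng.standard_normal((d_model, n_experts))

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route a single token vector x through its top-k experts."""
    logits = x @ gate_w                   # gate score for each expert
    top = np.argsort(logits)[-top_k:]     # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()              # softmax over the chosen experts
    # Only top_k of the n_experts weight matrices are touched here.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
out = moe_layer(token)
print(out.shape)  # (16,)

In this toy setup, each token touches only 2 of 8 experts, i.e. a quarter of the expert parameters per forward pass, which is the kind of compute saving the article attributes to the architecture.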
Read at TechCrunch