Nari Labs has introduced Dia, an AI model that generates podcast-style clips and lets users control aspects such as tone and nonverbal cues. Co-founder Toby Kim and his partner developed Dia in three months, with minimal prior experience in AI, aiming to provide features akin to Google's NotebookLM. The model has 1.6 billion parameters and is available on platforms such as Hugging Face and GitHub. Although Dia shows promise in generating quality dialogue and cloning voices, it raises concerns because it has minimal built-in safeguards against misuse.
The emergence of Nari Labs' Dia model points to the growing potential of the AI-generated synthetic speech market and shows that new entrants with limited expertise can innovate in it.
Investors are rallying behind synthetic speech technologies: voice AI startups raised $398 million last year, a sign of high market demand.
Toby Kim, co-founder of Nari Labs, emphasizes that the goal of Dia is to give users more control over voice generation, including tone customization and nonverbal cues.
While Dia performs well at generating dialogue and cloning voices, it raises concerns because it lacks the safeguards against misuse expected of voice generation tools.