Higher education
from The Verge
1 week ago
An AI announcer mispronounced and skipped names during a graduation
AI name-announcement tools can mispronounce or skip names during graduations due to timing issues, requiring pauses and human do-overs.
An initiative by a UK-based charity, supported by technology companies and universities, has developed an artificial intelligence (AI)-powered digital twin that allows people with communication disabilities to speak in a natural way. The technology, known as VoxAI, represents a step-change from the computer-assisted voice used by the late physicist Stephen Hawking, one of the first well-known public figures with motor neurone disease (MND).
Among the recording equipment and memorabilia available to bid on are Schneider's 1964 Volkswagen van, the Panasonic bicycle he rode in the 1984 music video for a remix of "Tour De France," a number of woodwind and brass instruments (including the 1960s Orsi alto flute that appeared on the back cover of Kraftwerk's 1970 self-titled debut), and a rack case of Votrax speech synthesizer units, which the band used to create the robot voices that opened all of their concerts between 1981 and 2002.
The startup, headed by former Oculus co-founder and CEO Brendan Iribe and Ankit Kumar, former CTO of AR startup Ubiquity6, is working to create a personal AI agent that interacts with users using a natural-sounding human voice. The company plans to embed the personal AI agent into lightweight eyewear that is designed to be worn throughout the day and which users can interact with via voice.
Think you can distinguish between a human voice and a robot? Think again, because the numbers are starting to say otherwise. Researchers at Queen Mary University of London and University College London found that people can no longer reliably distinguish between genuine speech and cloned AI voices. Their study, published in open-access journal PLOS One, found that when people were played recordings of real people together with AI-generated versions of the same voices, their judgments were little better than random chance.
Voice-generation technology enables machines to synthesize human-like speech, a capability known as text-to-speech (TTS), revolutionizing digital communication by fostering more inclusive and accessible experiences. What began as simple robotic speech synthesis has evolved into highly sophisticated voice-cloning systems that can produce natural, coherent, expressive, and personalized voices using minimal data. These technologies empower individuals with cross-lingual communication through virtual agents, assist in overcoming visual or speech impairments or literacy challenges via assistive tools, and support educators and industries such as entertainment with creative content generation.
"Our main goal is creating a flexible speech neuroprosthesis that enables a patient with paralysis to speak as fluently as possible, managing their own cadence, and be more expressive by letting them modulate their intonation," says Maitreyee Wairagkar, a neuroprosthetics researcher at UC Davis who led the study.