As explained by Meta: "AI-powered translations for Reels are starting to roll out in more languages, including Bengali, Tamil, Telugu, Marathi, and Kannada, on Instagram. These new additions build on our existing language support for English, Hindi, Portuguese, and Spanish." Adding more of the languages spoken in India matters because India is now the biggest single market for both Facebook and Instagram usage, beating out the U.S. by a significant margin.
On Wednesday, the Paris-based AI lab released two new speech-to-text models: Voxtral Mini Transcribe V2 and Voxtral Realtime. The former is built to transcribe audio files in large batches and the latter for nearly real-time transcription, within 200 milliseconds; both can translate between 13 languages. Voxtral Realtime is freely available under an open source license.
By comparing how AI models and humans map these words to numerical percentages, we uncovered significant gaps between humans and large language models. While the models do tend to agree with humans on extremes like 'impossible,' they diverge sharply on hedge words like 'maybe.' For example, a model might use the word 'likely' to represent an 80% probability, while a human reader assumes it means closer to 65%.
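The comparison can be pictured as a simple lookup of each word's assumed percentage for models and for humans, followed by a subtraction. The sketch below is illustrative only: the 'likely' values (80 and 65) come from the example above, while the 'impossible' values of 0 and the function name `calibration_gaps` are assumptions made for the demonstration, not figures or code from the study.

```python
# Percentages a reader or model is assumed to attach to each word.
# Only the 'likely' pair (80 vs. 65) is taken from the text; the
# 'impossible' entries are placeholder values chosen to show agreement.
model_map = {"impossible": 0, "likely": 80}
human_map = {"impossible": 0, "likely": 65}


def calibration_gaps(model_map, human_map):
    """Return model-minus-human gaps, in percentage points, for shared words."""
    return {
        word: model_map[word] - human_map[word]
        for word in model_map
        if word in human_map
    }


gaps = calibration_gaps(model_map, human_map)
for word, gap in gaps.items():
    print(f"{word}: gap of {gap} percentage points")
```

A positive gap means the model's use of the word implies more confidence than a human reader would take from it.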
The dataset was created by translating non-English content from the FineWeb2 corpus into English using Gemma3 27B, with the full data generation pipeline designed to be reproducible and publicly documented. The dataset is primarily intended to improve machine translation, particularly in the English→X direction, where performance remains weaker for many lower-resource languages. By starting from text originally written in non-English languages and translating it into English, FineTranslations provides large-scale parallel data suitable for fine-tuning existing translation models.
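One way to read the English→X use case: because the non-English side was written by people and the English side was generated, each record can be flipped so that English is the input and the original text is the target. The sketch below assumes a hypothetical record layout (`lang`, `original`, `english`); these field names are illustrative and are not the dataset's actual schema.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical record layout, not the real dataset schema."""
    lang: str       # language code of the original text
    original: str   # text originally written in a non-English language
    english: str    # machine translation of that text into English


def to_english_to_x_pair(rec: Record) -> dict:
    """Flip a record into an English→X training pair.

    The generated English becomes the source; the original
    non-English text becomes the target.
    """
    return {"source": rec.english, "target": rec.original, "target_lang": rec.lang}


example = Record(lang="fr", original="Bonjour le monde", english="Hello world")
pair = to_english_to_x_pair(example)
print(pair)
```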
A major difference between LLMs and LTMs is the type of data they're able to synthesize and use. LLMs use unstructured data (think text, social media posts, emails, etc.). LTMs, on the other hand, can extract information or insights from structured data, which could be contained in tables, for instance. Since many enterprises rely on structured data, often contained in spreadsheets, to run their operations, LTMs could have an immediate use case for many organizations.
What happens under the hood? How is the search engine able to take that simple query and search through the billions, even trillions, of images that are available online? How is it able to find this one photo, or similar ones, among all of them? Usually, there is an embedding model doing this work under the hood.
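To make the idea concrete, here is a minimal sketch of embedding-based search. It assumes every image has already been turned into a vector by some embedding model; the three-dimensional vectors, the file names, and the `search` helper are all made up for illustration. A real system would use learned embeddings with hundreds of dimensions and an approximate nearest-neighbor index rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec, index, top_k=2):
    """Return the names of the top_k items most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Toy "index": hand-written vectors standing in for model-produced embeddings.
index = {
    "cat_photo.jpg": [0.9, 0.1, 0.0],
    "dog_photo.jpg": [0.7, 0.3, 0.1],
    "car_photo.jpg": [0.0, 0.2, 0.9],
}

# The query is embedded into the same vector space as the images.
query = [1.0, 0.0, 0.0]
results = search(query, index)
print(results)
```

The query and the images live in the same vector space, so "find similar photos" becomes "find the nearest vectors."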