On Tuesday, August 22nd, Mark Zuckerberg, CEO of Meta, unveiled the upcoming release of SeamlessM4T, an artificial intelligence-based translator. The tool works with over 100 languages and can translate both speech and text, delivering the result as audio or as a transcription.
Meta stated that SeamlessM4T will be available as open source, along with SeamlessAlign, a new translation dataset. The company emphasizes that this new translator represents a significant advancement in the field of AI-powered voice-to-voice and voice-to-text conversion.
The model produces translations on demand, allowing people who speak different languages to communicate effectively. A standout feature of SeamlessM4T is that it recognizes the source language implicitly, without the need for a separate language identification model.
The foundation of SeamlessM4T is Massively Multilingual Speech, a framework developed by Meta that offers speech recognition, language identification, and speech synthesis in over 1,100 languages. To create the translator, Meta aligned 443,000 hours of speech with text and produced 29,000 hours of “voice-to-voice” alignments. This allowed the system to learn how to transcribe voice to text, translate text, generate voice from text, and translate words spoken in one language into words in another.
Zuckerberg expressed his enthusiasm for the project and said he plans to integrate the technology into Meta’s flagship services. “Over time, we will integrate these translation and transcription advances through artificial intelligence into Facebook, Instagram, WhatsApp, Messenger, and Threads,” he noted.
Beyond translating between languages, SeamlessM4T recognizes “code-switching,” where a speaker mixes more than one language in the same utterance, and detects toxic and hateful words. It can also quantify gender bias and adapt to pronoun variations in over 100 languages.
The launch of SeamlessM4T is aimed at developers, who can collaborate with Meta to improve the model before its release to the general public.