“Today we’re releasing SeamlessM4T, a new multimodal AI model that lets people who speak different languages communicate more effectively,” Meta CEO Mark Zuckerberg said in a post on his Instagram Channel.
Zuckerberg said that the AI model can do speech-to-text, text-to-speech, speech-to-speech, text-to-text translation and speech recognition for up to 100 languages.
The company plans to integrate the model's translation and transcription capabilities into Facebook, Instagram, WhatsApp, Messenger and Threads.
How the SeamlessM4T AI model works
According to Meta, the model supports speech recognition in nearly 100 languages, but fewer output languages are supported when the translation is delivered as speech. The model supports:
- Speech recognition for nearly 100 languages
- Speech-to-text translation for nearly 100 input and output languages
- Speech-to-speech translation, supporting nearly 100 input languages and 36 (including English) output languages
- Text-to-text translation for nearly 100 languages
- Text-to-speech translation, supporting nearly 100 input languages and 35 (including English) output languages
“In keeping with our approach to open science, we’re publicly releasing SeamlessM4T under a research licence to allow researchers and developers to build on this work. We’re also releasing the metadata of SeamlessAlign, the biggest open multimodal translation dataset to date, totaling 270,000 hours of mined speech and text alignments,” the company said.
According to Meta, SeamlessM4T builds on previous advancements in this space, such as last year’s No Language Left Behind (NLLB), a text-to-text machine translation model that supports 200 languages, which is integrated into Wikipedia as one of the translation providers.
The company had also previously shared a demo of its Universal Speech Translator, the first direct speech-to-speech translation system for Hokkien, a language that doesn’t have a widely used writing system.
Earlier this year, the company also revealed Massively Multilingual Speech, which provides speech recognition, language identification and speech synthesis technology across more than 1,100 languages.