Meta launches AI-powered speech translation model, to use it on WhatsApp, Facebook

Meta has announced that it is releasing an AI model that can translate and transcribe speech in up to 100 languages. The model can help people communicate and understand information in languages they don’t know.
“Today we’re releasing SeamlessM4T, a new multimodal AI model that lets people who speak different languages communicate more effectively,” Meta CEO Mark Zuckerberg said in a post on his Instagram Channel.
Zuckerberg said that the AI model can do speech-to-text, text-to-speech, speech-to-speech, text-to-text translation and speech recognition for up to 100 languages.
The company plans to integrate the model’s translation and transcription capabilities into Facebook, Instagram, WhatsApp, Messenger and Threads.
How the SeamlessM4T AI model works
According to Meta, the model supports speech recognition in up to 100 languages; however, the number of supported output languages is lower for speech-to-speech and text-to-speech translation.

  • Speech recognition for nearly 100 languages
  • Speech-to-text translation for nearly 100 input and output languages
  • Speech-to-speech translation, supporting nearly 100 input languages and 36 (including English) output languages
  • Text-to-text translation for nearly 100 languages
  • Text-to-speech translation, supporting nearly 100 input languages and 35 (including English) output languages

“In keeping with our approach to open science, we’re publicly releasing SeamlessM4T under a research licence to allow researchers and developers to build on this work. We’re also releasing the metadata of SeamlessAlign, the biggest open multimodal translation dataset to date, totaling 270,000 hours of mined speech and text alignments,” the company said.
According to Meta, SeamlessM4T builds on previous advancements in this space, such as last year’s No Language Left Behind (NLLB), a text-to-text machine translation model that supports 200 languages, which is integrated into Wikipedia as one of the translation providers.
The company also shared a demo of Universal Speech Translator, which was the first direct speech-to-speech translation system for Hokkien, a language that doesn’t have a widely used writing system.
The company also revealed Massively Multilingual Speech, which provides speech recognition, language identification and speech synthesis technology across more than 1,100 languages.
