Google's AI Now Translates Your Speech In Your Exact Voice


Anyone who has listened to an audio translation will have noticed how different the translated voice sounds from the original speaker's. The most obvious change is a swap from a male voice to a female one, or vice versa.

Google's translation team has been working hard to minimize the audio changes, and its audio translator can now keep the voice and tone as close as possible to that of the original speaker.

Some differences remain noticeable, but they are much smaller than those produced by other translation engines.

How does it all work?

Google's AI translator converts the audio input directly into the audio output, without any intermediate steps.

Traditional translation systems convert audio into text, translate the text, and then resynthesize audio from the translated text. Somewhere in the middle, the original voice is lost and a new, distinctly different one is used in its place.
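
To make the loss of the voice concrete, here is a minimal Python sketch of such a three-stage pipeline. Every name in it (`Audio`, `recognize`, `translate`, `synthesize`, `cascade_translate`) is a hypothetical stand-in invented for illustration, not part of any real translation product, and the "models" are fixed lookups. The only point is that once the audio becomes plain text, nothing about the speaker is left to pass along.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Audio:
    samples: List[float]  # waveform samples
    speaker: str          # crude stand-in for the voice characteristics


def recognize(audio: Audio) -> str:
    # Hypothetical speech-recognition stage: audio in, plain text out.
    # The return value is a bare string, so the speaker is dropped here.
    return "hola mundo"


def translate(text: str) -> str:
    # Hypothetical text-translation stage: a one-entry lookup table.
    lookup = {"hola mundo": "hello world"}
    return lookup[text]


def synthesize(text: str, voice: str = "default-tts-voice") -> Audio:
    # Hypothetical speech-synthesis stage: it only receives text,
    # so it has to fall back on its own built-in voice.
    return Audio(samples=[0.0] * len(text), speaker=voice)


def cascade_translate(audio: Audio) -> Audio:
    # Audio -> text -> translated text -> new audio.
    return synthesize(translate(recognize(audio)))


source = Audio(samples=[0.1, -0.2, 0.3], speaker="original-speaker")
result = cascade_translate(source)
print(result.speaker)
```

In this toy version the output always carries the synthesizer's voice, whoever spoke the input, which mirrors the voice swap described above.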

Google has instead built a new end-to-end speech-to-speech translation system named the 'Translatotron'. It comprises three components:

  1. A model trained to map audio spectrograms of the input language directly to spectrograms of the output language.
  2. A converter that turns the output spectrograms into an audio wave.
  3. A component that layers the original speaker's voice back onto the final output.
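
The three parts listed above can be sketched as a toy Python pipeline. This is not Google's code: the function names (`encode_speaker`, `map_spectrogram`, `vocode`, `speech_to_speech`) are hypothetical, the arithmetic inside them is a placeholder for trained models, and the way the voice information is fed into the mapping step is an assumption made for the sake of the example. What the sketch does show is the overall shape: spectrogram in, waveform out, with no text anywhere in between.

```python
from typing import List

# A spectrogram here is a list of frames, each frame a list of frequency-bin values.
Spectrogram = List[List[float]]


def encode_speaker(source_spec: Spectrogram) -> List[float]:
    # Placeholder for the voice component: average each frequency bin
    # over all frames to get a crude "voice" vector.
    n_frames = len(source_spec)
    n_bins = len(source_spec[0])
    return [sum(frame[i] for frame in source_spec) / n_frames for i in range(n_bins)]


def map_spectrogram(source_spec: Spectrogram, voice: List[float]) -> Spectrogram:
    # Placeholder for the trained spectrogram-to-spectrogram model:
    # reverse the frame order and add the voice vector to every frame.
    return [[v + w for v, w in zip(frame, voice)] for frame in reversed(source_spec)]


def vocode(spec: Spectrogram) -> List[float]:
    # Placeholder for the spectrogram-to-waveform step:
    # emit one sample per frame (the mean of that frame).
    return [sum(frame) / len(frame) for frame in spec]


def speech_to_speech(source_spec: Spectrogram) -> List[float]:
    voice = encode_speaker(source_spec)                 # speaker's voice
    target_spec = map_spectrogram(source_spec, voice)   # input to output spectrogram
    return vocode(target_spec)                          # spectrogram to audio wave


source = [[1.0, 3.0], [3.0, 5.0]]
waveform = speech_to_speech(source)
print(waveform)
```

Because the voice vector is computed from the input spectrogram and carried through to the output, the speaker's characteristics never have to survive a trip through plain text.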

What difference will this make?

This is good news for audio translation: the system produces more nuanced translations and also leaves less room for error. Because the translation process has fewer steps, there are fewer chances for mistakes to happen.

