Last night at Re/Code’s Code Conference, Microsoft announced a real-time, multilingual translation beta called Skype Translate. The service can mediate between two video chatters who speak different languages by providing text and audio translation after each person finishes speaking. Currently the service works for English and German, but Microsoft says that it will support other languages soon. The beta will be released later this year.
Realistically, Skype Translate won’t be ready to take over for interpreters at the U.N. anytime soon. But the fact that it’s coming from a large company like Microsoft and being tested on a really mainstream product like Skype shows that we’re actually making progress toward a Star Trek-like universal translator—and that’s amazing.
There’s been tons of research into automatic speech recognition and speech-to-text transcription, as well as textual translation (like Google Translate and Bing Translator). But these are difficult machine learning problems to solve. The big issue is that accuracy needs to be near 100 percent for these services to actually be useful. For example, a recent DARPA speech-to-speech translation program called TransTac achieved 80 percent accuracy, which was a fascinating and significant step, but still not enough for regular use. There are also consumer apps like Vocre that offer services very similar to Skype Translate.
The difference here is that Microsoft is behind this service. Sure, the company may be struggling with its vision for tablets and other next-generation products, but it’s still a major player, and Skype is pretty ubiquitous. Whether Skype Translate is a hit right away or, more likely, takes years to refine, Microsoft is being bold in bringing the technology to the mainstream now. It’s a good thing for tech, but maybe not for Microsoft. The company had better be sure that it can deliver, because Skype Translate’s debut will probably fuel competition, and someone else may end up making a better product that takes off. But so what? The end result for the consumer will be real-time video chat translation that works. And that’s all we want; we don’t care who we get it from.