Google Translate Adds More Languages, Covers 99% of Users
With its 10th anniversary approaching, Google Translate has expanded its repertoire of languages to 103, which means it can now provide machine learning-based translations to about 99 percent of the world's online population.
Over the next few days, the translation tool will add support for 13 additional languages, ranging from Ethiopia's Amharic to South Africa's Xhosa, Sveta Kelman, senior program manager for Google Translate, announced in a blog post yesterday. The update means an additional 120 million potential users for Google Translate.
Google Translate first launched in 2006 using an early form of computer-assisted translation based on information from dictionaries, grammar guides and other sources. It has since evolved into a more advanced approach based on machine learning that is refined through online language resources and human assistance from the Google Translate Community.
Helped by 3 Million Human Translators
When Google Translate was first made available online, it covered just a small number of language pairs, including English-Arabic, English-Chinese and English-Russian. By the summer of 2009, it had grown to support 51 languages.
In yesterday's blog post, Kelman noted that Google's first requirement for supporting a new language is that it be a written language. After that, "we also need a significant amount of translations in the new language to be available on the Web," she added. "From there, we use a combination of machine learning, licensed content and Translate Community."
To date, more than 3 million people in the Translate Community have helped correct and improve Google's machine-based translation capabilities, and their contributions cover some 200 million translated words, Kelman said.
"As we scan the Web for billions of already translated texts, we use machine learning to identify statistical patterns at enormous scale, so our machines can 'learn' the language," Kelman said. "For each new language, we make our translations better over time, both by improving our algorithms and systems and by learning from your translations with Translate Community."
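To give a rough sense of what identifying statistical patterns in already translated texts can mean, here is a minimal Python sketch. It is not Google's method, which the blog post does not detail; it is a toy illustration that assumes a tiny, made-up English-French corpus and scores word pairs with the Dice coefficient, a simple co-occurrence measure. The function name learn_word_pairs and the sample sentences are hypothetical.

```python
from collections import Counter
from itertools import product


def learn_word_pairs(parallel_sentences):
    """Toy illustration: for each source word, pick the target word it
    co-occurs with most consistently across translated sentence pairs."""
    src_counts, tgt_counts, pair_counts = Counter(), Counter(), Counter()
    for src, tgt in parallel_sentences:
        src_words = set(src.lower().split())
        tgt_words = set(tgt.lower().split())
        src_counts.update(src_words)
        tgt_counts.update(tgt_words)
        # Count every source/target word pair seen in the same sentence pair.
        pair_counts.update(product(src_words, tgt_words))

    best = {}
    for (s, t), n in pair_counts.items():
        # Dice coefficient: 1.0 means the two words always appear together.
        score = 2 * n / (src_counts[s] + tgt_counts[t])
        if s not in best or score > best[s][1]:
            best[s] = (t, score)
    return best


# Made-up sample data, for illustration only.
corpus = [
    ("the house", "la maison"),
    ("the car", "la voiture"),
    ("a house", "une maison"),
    ("a car", "une voiture"),
]
table = learn_word_pairs(corpus)
for word, (translation, score) in sorted(table.items()):
    print(word, "->", translation, round(score, 2))
```

With only four sentence pairs, the counts are already enough to pair "house" with "maison" and "car" with "voiture". Production systems work from vastly more text and far more sophisticated models, but the underlying idea of learning from existing translations is the same.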
Growing Support for Spoken Translations
Besides Amharic and Xhosa, the new languages coming to Google Translate are: Corsican, Napoleon's first language; Frisian (Netherlands and Germany); Kyrgyz (Kyrgyzstan); Hawaiian; Kurmanji Kurdish (Turkey, Iraq, Iran and Syria); Luxembourgish; Samoan; Scots Gaelic; Shona (Zimbabwe); Sindhi (Pakistan and India); and Pashto (Afghanistan and Pakistan).
In other machine-learning-based translation news, Microsoft today announced that it was rolling out two new features for its mobile Microsoft Translator apps: a new Android-based translation engine that uses artificial intelligence, and support on iOS for image-based translations using optical character recognition technology.
Both Microsoft and Google also provide conversation-based translation support, and last year Microsoft's Skype rolled out a preview version of near-real-time translations for spoken conversations in four languages: English, Spanish, Italian and Mandarin Chinese.