Filipino is among the four languages that have presented quite a challenge to the engineers behind Google's Translate app, the Internet giant admitted Friday.
Eugene Weinstein and Pedro Moreno of the Google Speech Team said Google needed creative solutions to crack the four tongues.
"Although we’ve been working on speech recognition for several years, every new language requires our engineers and scientists to tackle unique challenges. Our most recent additions - Croatian, Filipino, Ukrainian, and Vietnamese - required creative solutions to reflect how each language is used across devices and in everyday conversations," they said in a blog post.
In the case of Filipino, they said the tongue particularly presented "interesting challenges" since Filipinos often mix several languages in daily life.
Such a practice, called code switching, "complicates the design of pronunciation, language, and acoustic models," they said.
The engineers eventually decided to reflect the "reality of daily language use in our speech recognizer design."
"If users mix several languages, our recognizers should do their best in modeling this behavior. Hence our Filipino voice search system, while mainly focused on the Filipino language, also allows users to mix in English terms," they said.
Meanwhile, they said they had to take tones into consideration in Vietnamese.
One simple technique is to model the tone and vowel combinations directly in Google's lexicons.
"As a result we had to come up with special algorithms to handle the increased complexity. Additionally, Vietnamese is a heavily diacritized language, with tone markers on a majority of syllables," they said.
The solution was a special diacritic restoration algorithm "which enables us to present properly formatted text to our users in the majority of cases," they said.
Weinstein and Moreno said they use Google's distributed large-scale neural network learning infrastructure - the one that learned to spontaneously discover cats on YouTube.
"By partitioning the gigantic parameter set of the model, and by evaluating each partition on a separate computation server, we’re able to achieve unprecedented levels of parallelism in training acoustic models," they said.
However, they said there must also be more people using Google speech recognition products, so the technology will become more accurate.
"These new neural network technologies will help us bring you lots of improvements and many more languages in the future," they said. — TJD, GMA News