This multi-lingual speech-to-text app can identify accents and voice modulation even in noisy places

Businesses can use Liv.ai to create digital assistants in regional languages, transcribe audio/video files, set up voice-based IVR and do intelligent speech analytics

As a journalist, I do most of my interviews over the phone. This is not because I am too lazy to go out and meet people, but because, in a city like Bangalore, which is notorious for its heavy road traffic, it is the best way to save time and money.

However, phone interviews have their own challenges, a key one being the considerable amount of time you need to spend transcribing them. A half-hour interview typically takes two to three hours to transcribe, and it is a boring job, too. The same holds true for face-to-face interviews, which you need to record and transcribe later.

Not any more. A Bangalore-based Artificial Intelligence (AI) startup has come up with a speech recognition app that converts your speech to text as you speak. Unlike similar apps on the market, such as Speechnotes and ListNote, or even Google’s app, Liv.ai supports multiple Indian languages and can identify various accents, dialects and voice modulation.

According to the company's three co-founders, the app is designed to work amid background noise without loss of data, so that it captures as much content as possible with minimal errors.

“Liv.ai is about giving voice to a billion people using our deep technical expertise,” co-founder and CEO Subodh Kumar tells me. “As an alternative to typing, people can now use their voice-to-text chat on apps like Facebook and WhatsApp by speaking in their own language.”

Liv.ai is more than just a speech-to-text app

Liv.ai was founded in 2015 by three IIT Kharagpur batchmates: Subodh Kumar, Dr Sanjeev Kumar and Kishore Mundra. Subodh, an alumnus of IIM-Bangalore, had stints at Citi and Microsoft. Sanjeev, who has a PhD from the University of California, had worked with Qualcomm and Avaak, while Mundra worked with Broadcom and Samsung before starting the new venture.

In addition to its B2C app LivSpeech, Liv.ai provides speech APIs and SDKs that let developers convert speech to text using neural network models, accurately and with minimal latency. Businesses can use these to create digital assistants in regional languages, transcribe audio/video files, set up voice-based IVR and do intelligent speech analytics. The APIs can be integrated with applications across devices, including mobile phones, tablets, PCs, TVs, speakers, set-top boxes and cars.

Liv.ai recognises eight major Indian languages — Hindi, Bengali, Punjabi, Marathi, Gujarati, Kannada, Tamil and Telugu — in addition to English.

(L-R) Liv.ai co-founders Subodh Kumar, Sanjeev Kumar and Kishore Mundra

Potential business opportunities

  • Voice typing: Even though phones and computers now come equipped with keyboards in different languages, typing in those languages is still a big hassle. It also takes more time for people who are not very familiar with computer keyboards. Liv.ai's technology allows Android users to talk to their phones in their own languages and have that speech converted to text. Users can use voice typing to chat on various chat applications, write emails, search for products on different applications, etc.

  • Banking customer care automation: While almost every bank has its own apps and automated systems for online transactions and form filling, many people still find it difficult to operate these apps manually. Language barriers and an inability to operate smartphones are big issues, especially in rural regions. Liv.ai lets users fill in forms or carry out online transactions simply by giving commands to their phones in their native language. For example, if you have to transfer INR 5,000 to your friend, you can just command your phone and the money is transferred. If you have to fill in a banking form online, you can simply speak instead of typing and the job is done. More importantly, it cuts the whole process to one-third of the time it takes to type or to do the whole procedure manually.

  • Education: Liv.ai can be used for recording lectures, dictating notes, language evaluation and learning. It is also useful for blind users, who can simply speak and have their words converted to text when taking exams or submitting research work.

  • In-car entertainment: It is unsafe and difficult to touch the music system or device while driving to change the playlist, increase or decrease the volume, etc. Liv.ai’s technology helps car companies by providing them with intelligent devices that let drivers give commands (again, in their own language) to their music systems; the device then acts on the command and gets the job done without being touched.

  • Intelligent devices: Liv.ai helps develop intelligent devices that act on a person's spoken commands. For instance, if you want to switch on the fan at home, you no longer need to get out of bed and press the switch. With the device’s intelligence powered by Liv.ai, you can just give a command in your own language and the fan will switch on automatically.

“Liv.ai’s software is used across e-commerce enterprises and government utilities, besides companies with a consumer interface. The government has been using our platform to get feedback from citizens and call centre companies for voice search. The app helps reduce time and labour costs as the answers are transcribed and are ready for analysis,” says Subodh.

Liv.ai has already raised an undisclosed sum from investors in India and abroad. Subodh says the company is looking to onboard like-minded investors who can add value to it, in addition to helping it accelerate growth. According to a VCCircle report, Liv.ai is already raising a fresh round of investment from VC firm Astarc Ventures.

Subodh believes that there are massive opportunities in the AI domain. “There are so many tasks that we find mundane or machine interfaces that are unnatural or unintuitive. With recent advancements in AI, we can create significant impact in both areas. Our mission is to free up the human minds by automating the routine tasks and to increase the human productivity in complex tasks. We want to make machines more humane so that our communication with them is as natural and stress free as possible,” he concludes.

Image Credit: olgalebedeva / 123RF Stock Photo

The post This multi-lingual speech-to-text app can identify accents and voice modulation even in noisy places appeared first on e27.