We were challenged to create two applications: one that translates any sign when you point the phone camera at it, and one that provides speech-to-speech translation.
Imagine being able to speak another language without having to learn it. A VoIP translation app makes that possible. In a nutshell, you have a conversation as you normally would, and the app translates what you say into the other person’s language in “near real-time.” Then, when the other person says something, it is translated back into your language.
So, how does a VoIP translation app work? Get ready to dive in.
The first app is a crowdsourced translation application that covers more than 20 languages. It goes beyond the basic functions and supports three input methods: keyboard, voice and camera. Users can adjust the settings and save every searched word or phrase in a history panel so they can return to it later.
The app runs on iOS, Mac, Windows, Windows Phone and Android.
The app also includes a real-time chat translation with 4 modes:
The second application is a phone conversation translator that seamlessly translates foreign speech and interprets it into the respondent’s native language. If you are calling a person using a VoIP translator application, the respondent only hears the translated speech. The application is only required on the caller’s phone for it to operate. You don’t have to have it on both devices. You can call both mobile and landline numbers.
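The call flow described above can be sketched as a chain of three stages. This is only an illustration: the split into speech recognition, text translation and speech synthesis is an assumption about how such an app could be organised, and the class and callback names (CallTranslator, recognize, translate, synthesize) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CallTranslator:
    """Hypothetical relay that runs on the caller's side only."""

    # Assumed stage signatures; real engines would be plugged in here.
    recognize: Callable[[bytes, str], str]      # (audio, language) -> text
    translate: Callable[[str, str, str], str]   # (text, source, target) -> text
    synthesize: Callable[[str, str], bytes]     # (text, language) -> audio
    caller_lang: str
    respondent_lang: str

    def _relay(self, audio: bytes, source: str, target: str) -> bytes:
        # Speech -> text -> translated text -> speech.
        text = self.recognize(audio, source)
        translated = self.translate(text, source, target)
        return self.synthesize(translated, target)

    def to_respondent(self, audio: bytes) -> bytes:
        """Caller speaks; the respondent hears only the translated speech."""
        return self._relay(audio, self.caller_lang, self.respondent_lang)

    def to_caller(self, audio: bytes) -> bytes:
        """Respondent speaks; the caller hears the translation back."""
        return self._relay(audio, self.respondent_lang, self.caller_lang)
```

Because all three stages live in one object on the caller's device, the respondent needs nothing installed, which matches the behaviour described above.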
In the video below, you can see how the text-to-text translation app works.
In addition, the application can be installed on Amazon and Google smart speakers. Say “Hey Google, translate through ‘program name’” and the speaker will translate a phrase or word for you.
To recognize text for translation, we used the open-source Tesseract OCR engine. It worked well almost out of the box, so we did not spend much time on development. We did have issues with the recognition of some specific letters, however, so we had to train Tesseract to read these glyphs properly.
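As a rough illustration of the Tesseract step, here is a minimal sketch that assumes the pytesseract wrapper and Pillow are installed alongside the Tesseract binary. The function names and the "signs" model name are hypothetical; "signs" stands in for a custom-trained model saved as signs.traineddata, not the model we actually trained.

```python
from typing import Sequence


def build_lang_spec(languages: Sequence[str]) -> str:
    """Join Tesseract model names with '+', the syntax Tesseract uses
    to load several language models at once."""
    cleaned = [name.strip() for name in languages if name.strip()]
    if not cleaned:
        raise ValueError("at least one language model is required")
    return "+".join(cleaned)


def recognize_sign_text(image_path: str, languages: Sequence[str]) -> str:
    """Run OCR on a photo of a sign and return the recognized text."""
    # Imported lazily so build_lang_spec works without the OCR stack.
    from PIL import Image
    import pytesseract

    image = Image.open(image_path)
    text = pytesseract.image_to_string(image, lang=build_lang_spec(languages))
    return text.strip()
```

A call such as recognize_sign_text("sign.jpg", ["eng", "signs"]) would combine the stock English model with the hypothetical custom one, so the retrained glyphs are available without losing the standard alphabet.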
Neural machine translation, or NMT for short, is the use of neural network models to learn a statistical model for machine translation.
The key benefit of the approach is that a single system can be trained directly on the source and target text, no longer requiring the pipeline of specialized systems used in statistical machine translation.
For speech translation, we used OpenNMT, a full-featured, open-source (MIT-licensed) neural machine translation system. For training and numerical computation, OpenNMT relies on the free Torch mathematical toolkit, which is scripted in Lua. Torch lets you use the GPU to accelerate neural network training. An extension system makes it possible to implement additional functionality on top of OpenNMT.
We developed translation models by training a neural network on a reference set of translations. Two files were supplied to train the system: one with sentences in the source language and the other with a high-quality translation of those sentences into the target language.
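To make the two-file setup concrete, here is a minimal sketch of how such a corpus could be loaded and checked before training. It assumes the common convention that line N of the source file is translated by line N of the target file; the function name and the file names in the usage note are hypothetical.

```python
from pathlib import Path
from typing import List, Tuple


def load_parallel_corpus(src_path: str, tgt_path: str) -> List[Tuple[str, str]]:
    """Pair line i of the source file with line i of the target file."""
    src_lines = Path(src_path).read_text(encoding="utf-8").splitlines()
    tgt_lines = Path(tgt_path).read_text(encoding="utf-8").splitlines()

    # A length mismatch means the files are misaligned, so every pair
    # after the gap would teach the model a wrong translation.
    if len(src_lines) != len(tgt_lines):
        raise ValueError(
            f"misaligned corpus: {len(src_lines)} source lines "
            f"vs {len(tgt_lines)} target lines"
        )

    pairs = []
    for src, tgt in zip(src_lines, tgt_lines):
        src, tgt = src.strip(), tgt.strip()
        if src and tgt:  # drop pairs where either side is empty
            pairs.append((src, tgt))
    return pairs
```

For example, load_parallel_corpus("train.en", "train.de") would return a list of (English, German) sentence pairs, or fail early if the two files have drifted out of alignment.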
We also used OpenNMT’s optical text recognition system, which can recognize complex mathematical formulas and convert them into LaTeX.
Whether you already work in an industry or you’re a newcomer, you can increase your traction by expanding your sphere of operations with the help of ML.
Have a plan to develop a VoIP translation app? We at VironIT, a software development company, are here to help you launch superior quality apps that will take your business to the next level.