DeepMind, a company acquired by Google, uses deep learning techniques to program computers to learn from visual data, much like the human brain.

Scientists from Oxford University, working with Google’s DeepMind, have developed an artificial intelligence system that can lip-read better than humans. The system was trained on thousands of hours of BBC news programs, the BBC reported Friday.

The system, called "Watch, Attend and Spell," can correctly lip-read 50 percent of silent speech, while professional lip-readers got only 12 percent right, researchers found.


Some words that rhyme, such as mat and bat, have similar mouth shapes. However, context is what helps with lip-reading, Joon Son Chung from the university’s Department of Engineering said.

The system learns “things that come together, in this case the mouth shapes and the characters and what the likely upcoming characters are,” explained Chung.

After reviewing 118,000 sentences in BBC clips, the system built a vocabulary of 17,500 words. Since the system is trained on news programs, it is good at recognizing which words tend to follow others, such as “Prime” followed by “Minister” and “European” followed by “Union.” However, the system is not as good at recognizing words not spoken by newscasters.

The system needs more work before it can be used in real time, but it is still a good sign for people with hearing difficulties.

"AI lip-reading technology would be able to enhance the accuracy and speed of speech to text," Jesal Vishnuram from the charity Action on Hearing Loss told the BBC. "This would help people with subtitles on TV, and with hearing in noisy surroundings."


In the future, the system could help people dictate instructions to their smartphones in noisy places or dub old silent films.