AI speech recognition software reaches level of parity with human transcribers

Microsoft recently reached a new milestone in speech recognition: its system achieved a 5.1% word error rate. In other words, this new technology recognizes words in a conversation as well as professional human transcribers do. Reaching human parity has been a research goal for the last 25 years.

This achievement comes only one year after the Redmond giant passed another milestone: researchers built a speech recognition system that had a word error rate of 5.9%. A few months later, Microsoft’s Artificial Intelligence and Research department managed to further improve on these results, reducing the word error rate even more.
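For readers unfamiliar with the metric, word error rate is commonly defined as the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that standard definition with a word-level edit distance. The function name and the whitespace tokenization are assumptions made for illustration; this is not Microsoft's scoring tool, and real evaluations also normalize text before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative helper only: tokenization is a plain whitespace split.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One wrong word out of four reference words.
    print(word_error_rate("a b c d", "a x c d"))
```

Under this definition, a 5.1% word error rate means roughly one error for every twenty words in the reference transcript.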

Better neural net-based acoustic and language models

Researchers managed to achieve these results by improving the neural net-based acoustic and language models already used. More specifically, they introduced an additional convolutional neural network combined with a bidirectional long short-term memory (LSTM) model for improved acoustic modeling. At the same time, the new speech recognition system combines predictions from multiple acoustic models at both the frame level and the word level.
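The article does not say how the predictions are combined. As a rough illustration of what frame-level combination can look like, the sketch below takes a weighted average of the per-frame class probabilities produced by several models. The function name, the data layout, and the choice of simple averaging are all assumptions; Microsoft's actual combination scheme may differ.

```python
from typing import List, Optional, Sequence

# One model's output: for each audio frame, a probability per acoustic class.
Posteriors = Sequence[Sequence[float]]


def combine_frame_posteriors(
    per_model: Sequence[Posteriors],
    weights: Optional[Sequence[float]] = None,
) -> List[List[float]]:
    """Weighted average of per-frame posteriors from several models.

    Hypothetical helper: assumes every model scored the same frames
    over the same set of classes.
    """
    if not per_model:
        raise ValueError("need at least one model")
    if weights is None:
        weights = [1.0] * len(per_model)
    if len(weights) != len(per_model):
        raise ValueError("one weight per model is required")

    total = sum(weights)
    n_frames = len(per_model[0])
    combined: List[List[float]] = []
    for t in range(n_frames):
        n_classes = len(per_model[0][t])
        frame = [0.0] * n_classes
        for weight, model in zip(weights, per_model):
            for k in range(n_classes):
                # Normalizing by the weight total keeps each frame summing to 1.
                frame[k] += (weight / total) * model[t][k]
        combined.append(frame)
    return combined


if __name__ == "__main__":
    model_a = [[0.8, 0.2]]  # one frame, two classes
    model_b = [[0.4, 0.6]]
    print(combine_frame_posteriors([model_a, model_b]))
```

Averaging lets models that make different mistakes correct one another, which is the usual motivation for combining several acoustic models.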

Moreover, the researchers strengthened the recognizer’s language model by using the entire history of a dialog session to predict what is likely to come next, effectively allowing the model to adapt to the topic and local context of a conversation.
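The idea of adapting to a conversation can be illustrated with a much simpler technique than the neural language model described here: a cache model that mixes fixed word probabilities with counts of the words heard so far in the session. The class name, the example probabilities, and the mixing weight below are hypothetical, and the code is a toy sketch of the general idea rather than the approach Microsoft used.

```python
from collections import Counter
from typing import Dict


class SessionAdaptedUnigram:
    """Toy unigram model that leans toward words already used in a session."""

    def __init__(self, base_probs: Dict[str, float], cache_weight: float = 0.2):
        if not 0.0 <= cache_weight <= 1.0:
            raise ValueError("cache_weight must be between 0 and 1")
        self.base_probs = base_probs
        self.cache_weight = cache_weight
        self.history: Counter = Counter()

    def observe(self, utterance: str) -> None:
        # Add the words of a recognized utterance to the session history.
        self.history.update(utterance.lower().split())

    def prob(self, word: str) -> float:
        base = self.base_probs.get(word, 0.0)
        seen = sum(self.history.values())
        if seen == 0:
            return base  # nothing heard yet, so use the fixed model alone
        cache = self.history[word] / seen
        return (1.0 - self.cache_weight) * base + self.cache_weight * cache


if __name__ == "__main__":
    lm = SessionAdaptedUnigram({"bank": 0.01, "band": 0.01})
    lm.observe("the river bank was muddy")
    # "bank" has already come up in this session, so it is now favored.
    print(lm.prob("bank") > lm.prob("band"))
```

Two acoustically similar words start out equally likely, but once one of them has appeared in the conversation the model prefers it, which is the kind of topic adaptation the paragraph above describes.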

Microsoft is planning to incorporate this new technology into its products and services such as Cortana, Presentation Translator, and Microsoft Cognitive Services.

Better human-machine interaction

Of course, this achievement opens the door to new challenges. Speech recognition systems don’t work very well in noisy environments with distant microphones or in recognizing accented speech.

Researchers also still have a long way to go in teaching computers to understand what words mean and what speakers intend. This is the next major frontier for speech recognition software and artificial intelligence.

Once researchers reach this milestone, we will be able to dream of creating human-like robots. Until then, our robots will only have a human-like physical appearance, but they won’t be able to interact with us the way other humans do.



Mary Blaut

I strongly believe that Artificial Intelligence is the future of technology. AI research has yielded significant advancements in recent years and this is only the beginning.

Join me as I track the latest progress in AI research.