The Next Evolution in Voice Technology

January 23, 2021

With natural language processing, speech-to-speech technology becomes a reality, connecting people and computers in an almost conversational fashion. But current voice-based systems do not yet handle the subtleties of language.

Though Siri can recognize speech patterns, its voice only mimics real human conversation, as if it were a string of impromptu sentences tossed from one person to another. If it proves true and useful, this technology could change the way humans communicate, but as a human-computer interface, Siri is still far from realistic speech, even child-like speech.

The speech-to-speech technology of the future will follow the principles and predictions of artificial intelligence: it will learn from existing language, including its complexities, comprehend them, and base its own speech on them.

This technology resembles human language in its concepts and styles. When a person and a computer communicate through conversation in a human language, we intuitively figure out how to communicate and avoid conflict. In this case, human languages are the foundation of the technology. We interpret what is said in English or in other human languages, not the other way around. We listen to the communication and identify the correct grammar and vocabulary without having to practice for a long time.

Within the next ten years, as this technology advances, such communication may become human-like in its meaning and quality, even if its voice and grammar remain machine-like.