(Original title: Computer speech recognition rate comparable to humans for the first time)
Beijing, October 28 (Reporter Jiang Jing): According to a recent report on the MIT website, Microsoft's research institute in Redmond has developed a machine learning algorithm that raises the computer's speech recognition rate to 94.1% for conversations on a specified topic, matching the human level for the first time. For everyday conversations between friends and relatives, the recognition rate is 88.9%, even slightly better than humans.
The National Institute of Standards and Technology (NIST) published a database in 2000 to help solve the problem of speech recognition. Some of the telephone recordings in the database are conversations between individuals on assigned topics; the rest are casual conversations between friends and relatives.
The results show that human transcribers have an error rate of about 4%; that is, they transcribe about four of every hundred words incorrectly. In the past, machine performance fell far short of this figure. Now, when the computer transcribes conversations on a given topic into text, its error rate is 5.9%, and when it transcribes casual conversations between friends and relatives on any topic, its error rate is 11.3%. “This is even better than expected,” said Microsoft researcher Zweig.
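The report does not say how these error rates were scored. As a minimal sketch, assuming the common word error rate definition (word-level substitutions, insertions, and deletions divided by the number of words in the reference transcript), the calculation might look like the following. The function name and the sample sentences are illustrative only and do not come from the report.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: whitespace tokenization and equal cost (1) for each
    substitution, insertion, and deletion.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis: i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words gives 0.25.
    print(word_error_rate("we talked about music", "we talked about movies"))
```

Under this definition, a transcript with four wrong words per hundred reference words scores 0.04, which corresponds to the 4% human figure quoted above.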
Zweig and his colleagues then optimized their deep learning system, which is based on convolutional neural networks with different layers, so that each layer of the system recognizes a different aspect of speech. They then trained the machine on standard training data so that it could recognize common speech, and allowed it to adapt to the test database.
Overall, Microsoft’s speech recognition system has an error rate similar to that of humans, but the errors it makes are very different from human errors. The most common mistake the Microsoft machine makes is confusing feedback sounds, whereas humans rarely make such mistakes. Zweig thinks that, in principle, there is no reason the machine cannot be trained to identify feedback sounds, and that the errors may stem from the way such sounds are labeled in the training data set.
Microsoft researchers say that computer speech recognition surpassing the human level is “as important to the computer industry as the graphical user interface.” Its applications include consumer entertainment devices such as Xbox, accessibility tools such as real-time speech-to-text, and the personal digital assistant Cortana (“Xiaona”).
Original link: http://tech.163.com/16/1029/11/C4HSLV4A00097U7T.html