"In typical environments there is background noise and reverberations that make it hard to carry on a cellphone conversation," says lead researcher Professor Parham Aarabi of U of T's Edward S. Rogers Sr. Department of Electrical and Computer Engineering. "This system employs two microphones that, just like the two human ears, focus on the speaker's voice and filter out other noises."
The system uses time-frequency filters to determine the location of the speaker of interest, based on the length of time it takes for the most intense sound to arrive at each microphone. As the two microphones pick up the speaker's voice, a computer chip continuously decides which frequencies belong to the speaker and which belong to extraneous noise. The interference is then "damaged" and its volume is scaled back.
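The arrival-time idea can be illustrated with a small sketch. It is not the researchers' algorithm, which the release does not detail. It swaps in plain cross-correlation to estimate how many samples later a sound reaches the second microphone, then converts that delay to a rough bearing. The function names, the 16 kHz sample rate, the 10 cm microphone spacing and the 343 m/s speed of sound are all assumptions chosen for illustration, and the per-frequency filtering step is not shown.

```python
import math


def estimate_delay(left, right, max_lag):
    """Return the lag, in samples, by which `right` trails `left`.

    Brute-force cross-correlation: try every lag in [-max_lag, max_lag]
    and keep the one where the two signals line up best.
    """
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_angle(lag, sample_rate, mic_spacing, speed_of_sound=343.0):
    """Convert a sample delay to a bearing in degrees (0 = straight ahead).

    Assumes a far-field source and two microphones `mic_spacing` metres
    apart; all parameter values here are illustrative assumptions.
    """
    ratio = speed_of_sound * (lag / sample_rate) / mic_spacing
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding overshoot
    return math.degrees(math.asin(ratio))


# Synthetic test signal: a short windowed tone burst (hypothetical data).
left = [math.sin(0.7 * i) * math.exp(-((i - 30) / 6.0) ** 2) for i in range(80)]
# The second microphone hears the same burst three samples later.
right = [0.0] * 3 + left[:-3]

lag = estimate_delay(left, right, max_lag=10)
angle = delay_to_angle(lag, sample_rate=16000, mic_spacing=0.10)
print(lag, round(angle, 1))
```

A real system would make this estimate continuously and per frequency band, so that each time-frequency component can be attributed either to the speaker's direction or to interference.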
"Other speech recognition systems only reduce the background noise, but this technology also deconstructs other conversations into a slight hum so they don't confuse you," says Aarabi, who holds the Canada Research Chair in Multi-Sensor Information Systems. "By using this approach we've been able to get 30 per cent gains in recognition accuracy over alternative state-of-the-art, multi-microphone speech recognition systems."
While the dual-microphone system is currently too bulky to fit into cellphones, Aarabi predicts that a miniaturized version is only about two years away. A customized chip that enhances voice recognition software in PCs is only months away. The eventual miniaturized version will be a pen-sized device with two or four microphones, with all the batteries and electronics contained inside. The research appears in a study published in the August issue of IEEE Transactions on Systems, Man, and Cybernetics, Part B.
CONTACT: Professor Parham Aarabi, Edward S. Rogers Sr. Department of Electrical and Computer Engineering, 416-946-7893, firstname.lastname@example.org or Karen Kelly, U of T public affairs, 416-978-0260, email@example.com