Have you ever tried to book an airline reservation over the phone only to hang up in frustration because the automated voice recognition system couldn't recognize what you were saying? Well, researchers at Binghamton University are working to improve the software used in those applications. As our Janelle Burrell tells us, they just got a financial boost to expand their studies.
VESTAL, N.Y. -- It's a common frustration of navigating any automated voice recognition system: Talking to a computer that just doesn't understand you.
"I think the average three-year-old child is better at speech recognition than most computer programs," said Professor Stephen Zahorian, chair of Binghamton University's Electrical and Computer Engineering department.
But Zahorian and his team are on a mission to change that. The Binghamton University professor has spent the last 25 years studying speech recognition technology and will be furthering his research, thanks to a half-million-dollar grant from the Air Force Office of Scientific Research.
"One of the big difficulties with automatic speech recognition is it does not work particularly well with conversation, with conversation like this. So we look at speech in a so-called time frequency plane and then extract what we call features from that which have been shown with more studio quality," Zahorian said.
The research team downloads videos from YouTube, loads them into special software, transcribes them and then analyzes them as part of the research.
"We are collecting different videos in different languages like English, Mandarin and Russian," said Chandra Vootkuri, a graduate researcher in Zahorian's lab.
By converting the sounds into numerical features, they hope to use the grant not only to improve recognition software but also to predict speech.
"It's very exciting, interesting, useful to have a large chunk of funding to move this forward," Zahorian said.
Researchers say their dream is to one day create software that lets a person speak into a device in one language and have the device translate and repeat the phrase in another.