One of the main challenges for people with hearing loss is understanding speech in noisy surroundings. The problem is referred to as the cocktail party effect because situations where many people are talking at the same time often make it very hard to distinguish what is being said by the individual you are talking to.
Even though most modern hearing aids incorporate various forms of speech enhancement technology, engineers are still struggling to develop a system that makes a significant improvement.
PhD student Mathew Kavalekalam from the Audio Analysis Lab at Aalborg University is using machine learning to develop an algorithm that enables a computer to distinguish between spoken words and background noise. The project is carried out in collaboration with hearing aid researchers from GN Advanced Science and is supported by Innovation Fund Denmark.
Computer listens and learns
“The hearing centre inside our brains usually performs a string of wildly complicated calculations that enables us to focus on a single voice – even if there are many other people talking in the background,” explains Mathew Kavalekalam, Aalborg University. “But that ability is very difficult to recreate in a machine.”
Mathew Kavalekalam started out with a digital model that describes how speech is produced in the human body, from the lungs via the throat and larynx to the mouth and nasal cavities, teeth, lips, etc.
He used the model to describe the type of signal that a computer should ’listen’ for when trying to identify a talking voice. He then told the computer to start listening and learning.
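The article does not say which speech production model was used. As a rough illustration only, a common textbook choice is the source-filter model, in which pulses from the vocal folds are shaped by the resonances of the vocal tract. The sketch below assumes that model; the function names and the values for pitch, formant frequency and bandwidth are hypothetical and are not taken from Kavalekalam's work.

```python
import math

def resonator(freq_hz, bandwidth_hz, sample_rate):
    """Feedback coefficients of a two-pole resonator standing in for one vocal tract resonance."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, below 1 so the filter is stable
    theta = 2 * math.pi * freq_hz / sample_rate          # pole angle set by the resonance frequency
    return [2 * r * math.cos(theta), -r * r]

def synthesize_voiced(n_samples, pitch_period, ar_coeffs):
    """Pass an impulse train (the 'source') through an all-pole filter (the 'vocal tract')."""
    out = []
    for n in range(n_samples):
        # One pulse every pitch_period samples mimics the vocal folds opening and closing.
        excitation = 1.0 if n % pitch_period == 0 else 0.0
        # Each output sample is the excitation plus a weighted sum of earlier outputs.
        feedback = sum(a * out[n - k]
                       for k, a in enumerate(ar_coeffs, start=1) if n - k >= 0)
        out.append(excitation + feedback)
    return out

# Hypothetical values: 8 kHz sampling, 100 Hz pitch, one resonance at 700 Hz.
coeffs = resonator(700.0, 100.0, 8000)
signal = synthesize_voiced(400, 80, coeffs)
```

A signal built this way has a regular, predictable structure that most background noise lacks, which is the kind of property a model-based method can listen for.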
Noise isn’t just noise
“Background noise differs depending on the environment, from street or traffic noise if you are outside to the noise of people talking in a pub or a cafeteria,” Mathew Kavalekalam says. “That is one of the many reasons why it is so tricky to build a model for speech enhancement that filters the speech you want to hear from the babbling you are not interested in.”
At Aalborg University, Mathew Kavalekalam played various recordings of talking voices to the computer and gradually added different types of background noise at increasing levels.
Through this training, the software learned to recognise the sound patterns of speech and to calculate how to enhance the talking voices rather than the background noise.
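Adding noise "at an increasing level" is usually expressed as a falling signal-to-noise ratio (SNR) in decibels. The following is a minimal sketch of that mixing step, not the procedure used in the project: the helper names, the synthetic test signals and the SNR values are all assumptions made for illustration.

```python
import math
import random

def power(samples):
    """Average power of a signal."""
    return sum(v * v for v in samples) / len(samples)

def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so that speech power / noise power matches snr_db, then add it to the speech."""
    gain = math.sqrt(power(speech) / (power(noise) * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]

# Stand-in signals (hypothetical): a 200 Hz tone as "speech", Gaussian noise as "background".
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 200 * t / 8000) for t in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]

# Lower SNR means louder noise relative to the speech.
noisy_versions = {snr_db: mix_at_snr(speech, noise, snr_db) for snr_db in (10, 5, 0, -5)}
```

At 0 dB the speech and the noise carry equal power, and at negative values the noise is the stronger of the two.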
Fifteen percent improvement
The result of Kavalekalam’s work is a piece of software that can effectively help people with hearing loss better understand speech. It is able to identify and enhance spoken words even in very noisy surroundings.
So far the model has been tested on ten people, who listened to speech in background noise with and without Kavalekalam’s algorithm.
The test subjects were asked to perform simple tasks involving colour, numbers and letters that were described to them in noisy environments.
The results indicate that Kavalekalam may well have developed a promising solution. Test subjects’ speech perception improved by fifteen percent in very noisy surroundings.
Snappy signal processing
However, there is still some work to be done before Mathew Kavalekalam’s software finds its way into new hearing aids. The technology needs to be tweaked and tuned before it is practically applicable.
The algorithm needs to be optimised to use less processing power. Even though technology keeps getting faster and more powerful, there are hardware limitations in small, modern hearing aids.
“When it comes to speech enhancement, signal processing needs to be really snappy. If the sound is delayed in the hearing aid, it gets out of sync with the mouth movements and that will end up making you even more confused,” explains Mathew Kavalekalam.
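Part of that delay comes from the way audio is processed in blocks: the software has to wait for a full block of samples before it can start working on it. The arithmetic can be sketched as follows. The function name, block size and sampling rate are hypothetical examples and do not describe any particular hearing aid.

```python
def buffering_delay_ms(frame_length, sample_rate):
    """Time needed to collect one block of samples, in milliseconds."""
    return 1000.0 * frame_length / sample_rate

# Hypothetical example: blocks of 128 samples at a 16 kHz sampling rate.
delay = buffering_delay_ms(128, 16000)  # 8.0 ms, before any computation time is added
```

Shorter blocks reduce this waiting time, but they also leave the algorithm less signal to analyse at each step.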
- One in six Europeans experiences various degrees of hearing impairment. Almost everyone loses part of their hearing as they age.
- Hearing loss often manifests itself in problems when trying to participate in conversations with more than one person talking. This can lead to isolation as people with hearing loss often choose to withdraw from social gatherings where they have to spend a lot of energy trying to keep up with what is being said.
Mathew Kavalekalam, PhD student, Audio Analysis Lab, Department of Architecture, Design, and Media Technology, Aalborg University: email@example.com, +45 99 40 26 62
Mads Græsbøll Christensen, Professor, Audio Analysis Lab, Department of Architecture, Design, and Media Technology, Aalborg University: firstname.lastname@example.org, +45 99 40 97 93
Hiva Ahmadi, Press Contact, Aalborg University, email@example.com, +45 22 20 68 69