Making sense of a jumble of sounds
Tuomas Virtanen is involved in a project that explores acoustic pattern recognition in everyday environments. As the goal is to develop methods for identifying simple sounds, such as car sounds, the participants have taken their microphones outdoors.
Understanding speech, not to mention foreign speech, amid a cacophony of competing noises is challenging for us all but especially for the elderly and people with hearing loss. Researchers are developing solutions that facilitate communication in noisy environments.
Academy Research Fellow Tuomas Virtanen became interested in audio research as a teenager while composing and processing music with a computer. After enrolling at TUT to study information technology, he noticed that signal processing is an exciting field of study for students with a flair for mathematics. He eventually joined the Audio Research Team in the Department of Signal Processing and started investigating acoustic pattern recognition and the quality of speech signals.
The purpose of acoustic pattern recognition is to identify the content of speech and other sounds. The technology is used, for example, in the automatic speech recognition software built into mobile phones and in searching music and multimedia databases for files that contain a specific sound.
Researchers join forces with companies and hospitals
Virtanen leads the international, EU-funded INSPIRE (Investigating Speech Processing in Realistic Environments) project at TUT. The four-year project, due to end in 2015, explores speech communication in real-world conditions.
The INSPIRE network consists of ten European research institutes, five companies developing hearing aids, cochlear implants and speech recognition technology, and two university hospitals.
WHO: Tuomas Virtanen
- 38-year-old Doctor of Science in Technology.
- On 1 January 2015, Virtanen will take up an appointment as Associate Professor (tenure track).
- Currently Academy Research Fellow in the Department of Signal Processing at TUT. Head of the Audio Research Team.
- Majored in information technology and graduated with a doctorate from TUT in 2006.
- Has developed pioneering methods for single-channel sound source separation that have since become standard tools for the processing of audio signals. His research interests focus on music information retrieval and on technology that helps people, such as those with hearing problems.
- Family: common-law wife and two young sons.
- Hobbies: jogging, music, playing and singing for the children.
Factors such as hearing impairments, background noise and the language being spoken combine to affect the intelligibility of speech in everyday communication settings. The INSPIRE project brings together a number of partners and enables the simultaneous analysis of multiple factors affecting speech quality.
“Researchers at TUT are responsible for the development of speech separation methods based on machine learning and computational models for human hearing,” Virtanen says.
“We’re mimicking the capabilities of the human auditory system in an effort to develop devices that equal or even surpass the ability of the human brain to distinguish different voices. This ability is diminished in people who have a hearing impairment, which is why they need a hearing aid in noisy environments. However, hearing aids that are currently available on the market cannot identify speakers or discriminate between the sound of a voice and other noises. The methods we’re developing will improve the performance of hearing aids.”
Effective speech enhancement methods and prediction tools
The researchers under Virtanen’s supervision have already developed a new method that identifies and enhances speech signals more effectively than any other currently available system. The method is based on so-called blind sound source separation, which filters out extraneous sounds.
By generalizing from the data it has observed, their method can also decode “unseen” speech, that is, speech it has not yet encountered.
“We’ve also developed tools for predicting which elements of speech make it difficult to understand, what kinds of words are easily missed and how background noise reduces intelligibility.”
“These predictions can be used to develop better hearing aids and acoustically optimal spaces and to improve speech synthesis, which is used, among other things, in the automatic announcement systems in railway stations and trains. In addition, they help the developers of voice-controlled interfaces select unambiguous voice commands. Voice-controlled interfaces are popping up in mobile phones, in-car systems and automated telephone services.”
Researchers gain a first-hand glimpse into industry
INSPIRE is one of the EU-funded Marie Curie Initial Training Networks (ITN) that offer researchers the opportunity to visit other universities and companies within the network.
Marie Curie Early Stage Researcher Thomas Barker is part of the research group headed by Tuomas Virtanen but is currently visiting the hearing aid manufacturer Oticon's Eriksholm Research Centre in Denmark, where he is working to improve the sound quality of hearing aids in high-noise environments.
“I now have perspective on the aims and goals of academia compared to industry. For example, a statistical improvement in audio signal quality might be a great result in academia, but unless it actually improves the user experience for someone listening to the output, it may not be of such importance in an industrial setting,” Barker says.
Barker is convinced that researchers benefit from time spent in companies and other universities. It broadens their perspectives and enables them to experience life in a different organization. He is writing a dissertation in signal processing at TUT.
Partners in the INSPIRE project
- Universities and research institutions: Katholieke Universiteit Leuven (Belgium), Philips Research Eindhoven (the Netherlands), Radboud University Nijmegen (the Netherlands), Tampere University of Technology (Finland), Technical University of Denmark (Denmark), Universidad del País Vasco (Spain), University College London (UK), University of Sheffield (UK), University of Edinburgh (UK), University of York (UK).
- Companies: Cochlear Europe (UK), Nokia/Microsoft (Finland/USA), Nuance Communications International (Belgium), Oticon Research Centre Eriksholm (Denmark), Phonak (Switzerland).
- Hospitals: Royal National Throat, Nose and Ear Hospital (UK) and Radboud University Nijmegen Medical Centre (the Netherlands).
Text: Leena Koskenlaakso
Photo: Mika Kanerva