Ishikawa, Japan -- The ability to locate sounds in the surrounding environment is a remarkable feature of the human ear. Typically, people with good hearing use both ears to detect and interpret auditory cues. Differences in the loudness or arrival time of sounds at each ear provide us with vital information about the location and direction of the sound source. Interestingly, however, studies have suggested that while binaural cues are sufficient for sound localization, they are not necessary. People with monaural hearing (hearing loss in one ear) can perceive sound location as well.
Fortunately for engineers, this can help remove limitations on the design and positioning of audio recording devices and microphone arrays. Used for source localization and noise reduction, microphone arrays need to be placed at specific intervals and positions to effectively capture and analyze sound from different directions. To avoid poor sound quality resulting from inadequate microphone array design or positioning, the capability to estimate the sound direction using monaural cues is highly desirable as it can help simplify microphone array designs.
In a study made available online on 13 January 2023 and published in the journal Applied Acoustics on 28 February 2023, Prof. Masashi Unoki and his colleagues from the Japan Advanced Institute of Science and Technology (JAIST) and Toyama Prefectural University, Japan, proposed a method that uses monaural cues to estimate the direction-of-arrival (DOA) of sound signals in three dimensions. “In our work, we propose an estimation method based on monaural modulation spectrum (MMS), which relies on modulation in the frequency spectrum of the received signal to detect the signal DOA. This can help us develop monaural cues for single-channel signal processing,” explains Prof. Unoki.
To determine the monaural DOA, the team simulated sound signals from different directions using artificial amplitude-modulated noise and human speech signals while accounting for the way the ears, torso, and head filter sound. Next, they obtained the MMS of the signals, which describes their modulations, to identify key features that could be tied to the DOA of the signals. A known problem in monaural localization is front-back confusion, in which sound sources at the same angle in front of and behind the listener produce the same DOA estimate. To avoid it, the researchers considered the effect of head movement on the MMS, which allowed a more accurate DOA estimation.
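The idea of a modulation spectrum can be illustrated with a small sketch. It is not the authors' MMS pipeline: it skips the filtering by the ears, torso, and head, and it assumes a generic definition in which the modulation spectrum is the magnitude spectrum of the signal's temporal envelope. The function name `modulation_spectrum`, the sampling rate, and the 8 Hz modulation frequency are illustrative choices, not values from the paper.

```python
import numpy as np

def modulation_spectrum(signal, fs):
    """Magnitude spectrum of the temporal envelope (generic sketch, assumed definition)."""
    envelope = np.abs(signal)             # crude envelope: full-wave rectification
    envelope = envelope - envelope.mean() # remove the DC component
    spectrum = np.abs(np.fft.rfft(envelope)) / len(envelope)
    freqs = np.fft.rfftfreq(len(envelope), d=1.0 / fs)
    return freqs, spectrum

# Hypothetical test signal: white noise whose amplitude is modulated at 8 Hz.
fs = 8000          # sampling rate in Hz (assumption)
duration = 2.0     # seconds
fm = 8.0           # modulation frequency in Hz (assumption)
rng = np.random.default_rng(0)
t = np.arange(int(fs * duration)) / fs
carrier = rng.standard_normal(t.size)
am_noise = (1.0 + np.cos(2 * np.pi * fm * t)) * carrier

freqs, spec = modulation_spectrum(am_noise, fs)

# Look for the strongest modulation component below 64 Hz.
band = (freqs > 0) & (freqs <= 64)
peak_freq = freqs[band][np.argmax(spec[band])]
print(f"Strongest modulation component: {peak_freq:.1f} Hz")
```

In this toy case the envelope spectrum should show a clear component at the modulation rate. A direction-dependent filter would reshape such a spectrum, which is the kind of change a feature-based DOA estimator could exploit.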
Using the known DOA and the features of the MMS as training data, they then constructed a polynomial regression model that estimated the DOA from the MMS features of the sound signal in terms of the horizontal and the vertical direction of the listener. The model could accurately estimate the DOA of 829,440 speech signals, outperforming even human monaural hearing.
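As a rough illustration of the regression step, the sketch below fits a polynomial that maps a single feature to an angle and then checks it on held-out data. Everything in it is assumed for the example: the scalar `feature`, the quadratic relationship, the coefficients, and the noise level are synthetic stand-ins, not the MMS features or the model from the study.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for one MMS-derived feature (hypothetical, unitless).
feature = rng.uniform(-1.0, 1.0, 200)

# Hypothetical ground truth: azimuth in degrees as a quadratic of the feature.
true_coeffs = [10.0, 30.0, 5.0]   # highest power first, as np.polyval expects
azimuth = np.polyval(true_coeffs, feature) + rng.normal(0.0, 0.5, feature.size)

# Split into training data (known DOA) and held-out data.
train_x, test_x = feature[:150], feature[150:]
train_y, test_y = azimuth[:150], azimuth[150:]

# Fit a degree-2 polynomial regression model by least squares.
coeffs = np.polyfit(train_x, train_y, deg=2)

# Estimate the azimuth for unseen feature values and measure the error.
predicted = np.polyval(coeffs, test_x)
rmse = np.sqrt(np.mean((predicted - test_y) ** 2))
print(f"Fitted coefficients: {np.round(coeffs, 2)}")
print(f"Held-out RMSE: {rmse:.2f} degrees")
```

A full three-dimensional estimator would need several features and separate outputs for the horizontal and vertical directions; the one-dimensional case is used here only to keep the fitting step easy to follow.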
While the team qualifies their findings by suggesting that there is more work to be done to account for background noise and individual differences in ear shape when creating the model, the study demonstrates an impressive advancement in monaural sound localization. Speculating about its implications, the researchers envision their technology’s applications in sound surveillance techniques and hearing aid enhancements. “Our study will help reveal our ability to localize sounds based on monaural hearing, which, in turn, could stimulate various innovations in hearing aid techniques in the long-term,” concludes Prof. Unoki.
Title of original paper:
Method of estimating three-dimensional direction-of-arrival based on monaural modulation spectrum
Authors: Rui Wang, Nguyen Khanh Bui, Daisuke Morikawa, and Masashi Unoki*
About Japan Advanced Institute of Science and Technology, Japan
Founded in 1990 in Ishikawa prefecture, the Japan Advanced Institute of Science and Technology (JAIST) was the first independent national graduate school in Japan. Now, after 30 years of steady progress, JAIST has become one of Japan’s top-ranking universities. JAIST has multiple satellite campuses and strives to foster capable leaders with a state-of-the-art education system where diversity is key; about 40% of its alumni are international students. The university has a unique style of graduate education based on a carefully designed coursework-oriented curriculum to ensure that its students have a solid foundation on which to carry out cutting-edge research. JAIST also works closely with both local and overseas communities by promoting industry–academia collaborative research.
About Professor Masashi Unoki from Japan Advanced Institute of Science and Technology, Japan
Masashi Unoki is Dean of the School of Information Science at Japan Advanced Institute of Science and Technology (JAIST). He received his M.S. and Ph.D. in Information Science from JAIST in 1996 and 1999, respectively. His main research interests include auditory-motivated signal processing and the modeling of auditory systems. He was associated with the ATR Human Information Processing Laboratories as a visiting researcher from 1999 to 2000, and was a visiting research associate at the Centre for the Neural Basis of Hearing (CNBH) in the Department of Physiology at the University of Cambridge from 2000 to 2001. He has been a faculty member of the School of Information Science at JAIST since 2001 and is now a full professor.
This study was supported by the Fund for the Promotion of Joint International Research (Fostering Joint International Research (B)) (20KK0233) and JSPS-NSFC Bilateral Joint Research Projects/Seminars (JSJSBP120197416). This study was also supported by the SCOPE Program of the Ministry of Internal Affairs and Communications (Grant No.: 201605002).