| Title: |
BAST-Mamba: Binaural Audio Spectrogram Mamba Transformer for binaural sound localization |
| Authors: |
Kuang, Sheng; Shi, Jie; van der Heijden, Kiki; Mehrkanoon, Siamak; Sub Algorithmic Data Analysis |
| Publication Year: |
2025 |
| Subject Terms: |
Binaural integration; Sound localization; Transformer; Computer Science Applications; Cognitive Neuroscience; Artificial Intelligence |
| Description: |
Accurate sound localization in reverberant environments is essential for human auditory perception. Recently, Convolutional Neural Networks (CNNs) have been used to model the binaural human auditory pathway. However, CNNs face limitations in capturing global acoustic features. To address this issue, we propose a novel end-to-end Binaural Audio Spectrogram Mamba Transformer (BAST-Mamba) model to predict sound azimuth in both anechoic and reverberant conditions. We explore two implementation modes, BAST-Mamba-SP and BAST-Mamba-NSP, which correspond to shared and non-shared parameter configurations, respectively. Our best model, BAST-Mamba-SP, equipped with subtraction-based interaural integration and a hybrid loss function, achieves a state-of-the-art angular distance (AD) error of 0.89° and a mean squared error of 0.0004, significantly outperforming baseline models. The model demonstrates generalization across acoustic environments, robust hemifield symmetry, and highly accurate real-time localization performance ( |
| Document Type: |
article in journal/newspaper |
| File Description: |
application/pdf |
| Language: |
English |
| ISSN: |
0925-2312 |
| Relation: |
https://dspace.library.uu.nl/handle/1874/477755 |
| Availability: |
https://dspace.library.uu.nl/handle/1874/477755 |
| Rights: |
info:eu-repo/semantics/OpenAccess |
| Accession Number: |
edsbas.1EEC9E84 |
| Database: |
BASE |