Attention-Enhanced Deep Learning for Urban Environmental Sound Classification
Abstract
Acoustic environmental noise classification is an important task for reliable, intelligent acoustic systems. It remains challenging, however, because environmental noise is non-stationary, acoustic patterns overlap, background interference is common, and labeled data are limited. This paper presents a robust deep learning approach to urban environmental noise classification using the UrbanSound8K dataset. The proposed approach combines log-Mel spectrograms, data augmentation, and a Convolutional Neural Network (CNN) enhanced with attention mechanisms that capture discriminative time-frequency features. To improve computational efficiency and reproducibility, features were pre-computed and cached without altering the learning process. The official 10-fold cross-validation protocol is used to prevent data leakage and to ensure comparability with previous studies. Experimental results show an average classification accuracy of about 91.14%, outperforming traditional CNN baseline models, while the lightweight architecture remains suited to many practical deployments. Per-class analysis further shows high recognition accuracy for impulsive and harmonic sounds and highlights the difficulty of distinguishing acoustically similar categories. The results indicate that a CNN combined with efficient preprocessing provides a practical and scalable model for real-time urban sound monitoring. In future work, larger-scale datasets could further improve performance.
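The official 10-fold protocol mentioned in the abstract can be illustrated with a minimal sketch. It assumes each clip record carries the predefined `fold` value (1 to 10) from the UrbanSound8K metadata file, and it holds out one predefined fold at a time instead of reshuffling clips. The function names `official_fold_splits` and `cross_validate`, the `majority_baseline` evaluator, and the toy records are all hypothetical stand-ins for illustration; this is not the authors' training code, and the attention-enhanced CNN is replaced here by a trivial majority-class predictor.

```python
from collections import defaultdict
from statistics import mean


def official_fold_splits(records, n_folds=10):
    """Yield (test_fold, train, test) using the predefined fold ids.

    Clips are never reshuffled across folds, so no clip in the held-out
    fold can also appear in the training set.
    """
    by_fold = defaultdict(list)
    for rec in records:
        by_fold[rec["fold"]].append(rec)
    for test_fold in range(1, n_folds + 1):
        test = list(by_fold.get(test_fold, []))
        train = [r for f, recs in by_fold.items() if f != test_fold for r in recs]
        yield test_fold, train, test


def cross_validate(records, evaluate_fold, n_folds=10):
    """Run evaluate_fold(train, test) once per held-out fold.

    Returns the mean score and the list of per-fold scores.
    """
    scores = [
        evaluate_fold(train, test)
        for _, train, test in official_fold_splits(records, n_folds)
    ]
    return mean(scores), scores


def majority_baseline(train, test):
    """Hypothetical placeholder model: predict the most frequent training class."""
    counts = defaultdict(int)
    for rec in train:
        counts[rec["classID"]] += 1
    majority = max(counts, key=counts.get)
    return sum(rec["classID"] == majority for rec in test) / len(test)


# Toy records (assumption): 40 synthetic clips spread evenly over 10 folds.
toy_records = [
    {"slice_file_name": f"clip_{i}.wav", "fold": i % 10 + 1, "classID": i % 3}
    for i in range(40)
]

mean_acc, fold_accs = cross_validate(toy_records, majority_baseline)
print(f"mean accuracy over {len(fold_accs)} folds: {mean_acc:.3f}")
```

In a real run, `majority_baseline` would be replaced by a function that trains the model on `train` and returns its accuracy on `test`; the reported figure would then be the mean of the ten per-fold accuracies.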