ANALYSIS OF THE SOUND EVENT DETECTION METHODS AND SYSTEMS


Andriy Kovalenko
Anton Poroshenko

Abstract

Detection and recognition of loud sounds and characteristic noises can significantly increase the level of safety and ensure a timely response to various emergency situations. Audio event detection is the first step in recognizing audio signals in a continuous input audio stream. This article presents a number of problems associated with the development of sound event detection systems, such as variability across environments and sound categories, overlapping audio events, and unreliable training data. Both methods for detecting monophonic impulsive audio events and polyphonic sound event detection methods used in state-of-the-art sound event detection systems are presented. Such systems are submitted to the Detection and Classification of Acoustic Scenes and Events (DCASE) challenges and workshops, which take place every year. Besides the majority of works focusing on improving overall performance in terms of accuracy, many other aspects have also been studied. Several systems presented at DCASE 2021 Task 4 were considered, and based on their analysis a conclusion was drawn about the possible future of sound event detection systems. Current directions in the development of modern audio analytics systems are also presented, including the study and use of various neural network architectures and the use of data augmentation techniques such as universal sound separation.
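
To make the notion of polyphonic detection concrete, the Python sketch below (not taken from the article) shows how frame-wise class probabilities produced by a detector can be turned into overlapping (onset, offset, label) events; the class names, frame hop, and decision threshold are illustrative assumptions rather than values used by the reviewed systems.

# Minimal sketch (illustrative only): turning frame-wise class probabilities
# from a polyphonic sound event detector into (onset, offset, label) events.
# Class names, threshold, and hop size are assumptions, not from the article.
import numpy as np

CLASSES = ["speech", "dog_bark", "glass_break"]   # hypothetical label set
HOP_SECONDS = 0.02                                # assumed frame hop (20 ms)
THRESHOLD = 0.5                                   # assumed decision threshold

def decode_events(frame_probs: np.ndarray):
    """frame_probs: (num_frames, num_classes) sigmoid outputs in [0, 1].
    Returns a list of (onset_s, offset_s, class_name) tuples; events of
    different classes may overlap, which makes the output polyphonic."""
    events = []
    binarized = frame_probs >= THRESHOLD          # per-frame, per-class decisions
    for c, name in enumerate(CLASSES):
        # Pad with inactive frames so every active run has an onset and an offset.
        active = np.concatenate(([0], binarized[:, c].astype(int), [0]))
        edges = np.diff(active)                   # +1 at onsets, -1 at offsets
        onsets = np.where(edges == 1)[0]
        offsets = np.where(edges == -1)[0]
        for on, off in zip(onsets, offsets):
            events.append((on * HOP_SECONDS, off * HOP_SECONDS, name))
    return sorted(events)

# Toy usage: 200 frames (4 s) of random scores stand in for a model's output.
rng = np.random.default_rng(0)
print(decode_events(rng.random((200, len(CLASSES)))))

In practice, DCASE-style systems usually smooth the binarized decisions (for example with per-class median filtering) before extracting onsets and offsets, which suppresses spurious short activations.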

Article Details

How to Cite
Kovalenko, A., & Poroshenko, A. (2022). ANALYSIS OF THE SOUND EVENT DETECTION METHODS AND SYSTEMS. Advanced Information Systems, 6(1), 65–69. https://doi.org/10.20998/2522-9052.2022.1.11
Section
Information systems research
Author Biographies

Andriy Kovalenko, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine

Doctor of Technical Sciences, Professor, Head of the Department of Electronic Computers

Anton Poroshenko, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine

Postgraduate Student at the Department of Electronic Computers

References

Dufaux, A., Besacier, L., Ansorge, M. and Pellandini, F. (2000), "Automatic sound detection and recognition for noisy environment", 2000 10th European Signal Processing Conference, IEEE, pp. 1-4.

Phan, H., Koch, P., Katzberg, F., Maass, M., Mazur, R., McLoughlin, I. and Mertins, A. (2017), "What makes audio event detection harder than classification?", 2017 25th European Signal Processing Conference (EUSIPCO), pp. 2739-2743, doi: https://doi.org/10.23919/EUSIPCO.2017.8081709.

Rahman, S. U., Khan, A., Abbas, S., Alam, F. and Rashid, N. (2021), "Hybrid system for automatic detection of gunshots in indoor environment", Multimedia Tools and Applications, Vol. 80, No. 3, pp. 4143-4153, doi: https://doi.org/10.1007/s11042-020-09936-w.

Turpault, N. and Serizel, R. (2020), "Training Sound Event Detection on a Heterogeneous Dataset", Detection and Classification of Acoustic Scenes and Events 2020 Workshop (DCASE2020), Nov. 2020, Tokyo, Japan, available at: https://arxiv.org/abs/2007.03931.

Salamon, J., MacConnell, D., Cartwright, M., Li, P. and Bello, J. P. (2017), "Scaper: A library for soundscape synthesis and augmentation", Proc. WASPAA, pp. 344-348, doi: https://doi.org/10.1109/WASPAA.2017.8170052.

Shah, A., Kumar, A., Hauptmann, A. G. and Raj, B. (2018), "A closer look at weak label learning for audio events", arXiv preprint arXiv:1804.09288, available at: http://arxiv.org/abs/1804.09288.

Poroshenko, A. I. and Kovalenko, A. A. (2021), "Methods and approaches to detecting audio events of various types", Modern Directions of Development of Information and Communication Technologies and Control Facilities: Proceedings of the 11th International Scientific and Technical Conference, April 8-9, 2021, Baku, Kharkiv, Kyiv, Žilina, Vol. 2, p. 114 [in Ukrainian].

Poroshenko, A. I. and Kovalenko, A. A. (2021), "Methods of classifying audio signal features", Problems of Informatization: Abstracts of the 9th International Scientific and Technical Conference, November 18-19, 2021, Cherkasy, Kharkiv, Baku, Bielsko-Biala, Vol. 1, Cherkasy State Technological University et al., Kharkiv, p. 90 [in Ukrainian].

Kumar, K. and Chaturvedi, K. (2020), "An Audio Classification Approach using Feature Extraction Neural Network Classification Approach", 2nd International Conference on Data, Engineering and Applications (IDEA), pp. 1-6, doi: https://doi.org/10.1109/IDEA49133.2020.9170702.

Hirata, K., Kato, T. and Oshima, R. (2019), "Classification of Environmental Sounds Using Convolutional Neural Network with Bispectral Analysis", 2019 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), pp. 1-2, doi: https://doi.org/10.1109/ISPACS48206.2019.8986304.

Serizel, R., Turpault, N., Shah, A. and Salamon, J. (2020), "Sound event detection in synthetic domestic environments", ICASSP 2020 - 45th International Conference on Acoustics, Speech, and Signal Processing, May 2020, Barcelona, Spain.

Turpault, N., Wisdom, S., Erdogan, H., Hershey, J., Serizel, R. et al. (2020), "Improving Sound Event Detection in Domestic Environments Using Sound Separation", DCASE Workshop 2020 - Detection and Classification of Acoustic Scenes and Events, Nov. 2020, Tokyo / Virtual, Japan.

Tzinis, E., Wisdom, S., Hershey, J. R., Jansen, A. and Ellis, D. P. W. (2020), "Improving Universal Sound Separation Using Sound Classification", ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 96-100, doi: https://doi.org/10.1109/ICASSP40776.2020.9053921.

Sose, S., Mali, S. and Mahajan, S. P. (2019), "Sound Source Separation Using Neural Network", 2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT), pp. 1-5, doi: https://doi.org/10.1109/ICCCNT45670.2019.8944614.

Zheng, X., Chen, H. and Song, Y. (2021), "Zheng USTC team's submission for DCASE 2021 Task 4: semi-supervised sound event detection", DCASE2021 Challenge, Tech. Rep.

Kim, N. K. and Kim, H. K. (2021), "Self-training with noisy student model and semi-supervised loss function for DCASE 2021 Challenge Task 4", available at: https://arxiv.org/abs/2107.02569.

Lu, R., Hu, W., Duan, Z. and Liu, J. (2021), "Integrating advantages of recurrent and transformer structures for sound event detection in multiple scenarios", Detection and Classification of Acoustic Scenes and Events 2021 Challenge (DCASE2021), Tech. Rep.

Ebbers, J. and Haeb-Umbach, R. (2021), "Self-Trained Audio Tagging and Sound Event Detection in Domestic Environments", Proceedings of the Detection and Classification of Acoustic Scenes and Events 2021 Workshop (DCASE2021), Online, pp. 15-19.

Nam, H., Ko, B.-Y., Lee, G.-T., Kim, S.-H., Jung, W.-H., Choi, S.-M. and Park, Y.-H. (2021), "Heavily augmented sound event detection utilizing weak predictions", Detection and Classification of Acoustic Scenes and Events 2021 (DCASE2021), arXiv preprint arXiv:2107.03649.

Tian, G., Huang, Y., Ye, Z., Ma, S., Wang, X., Liu, H., Qian, Y., Tao, R., Yan, L., Ouchi, K., Ebbers, J. and Haeb-Umbach, R. (2021), "Sound event detection using metric learning and focal loss for DCASE 2021 Task 4", Detection and Classification of Acoustic Scenes and Events 2021 Challenge (DCASE2021), Tech. Rep.