AUDIO EVENT ANALYSIS METHOD IN NETWORK-BASED AUDIO ANALYTICS SYSTEMS
Abstract
Relevance. In the rapidly evolving field of network-based audio analytics systems, the detection and analysis of audio events play a crucial role in applications such as security, healthcare, and entertainment. Subject. This paper examines a method for recognizing audio events in network-based audio analytics systems, covering preprocessing, sound separation, and the creation of machine learning models for analyzing audio signals. Objective. To develop and improve integrated methods for analyzing audio signals in network-based audio analytics systems in order to enhance the accuracy, speed, and reliability of data analysis. Methods. The proposed approach uses a modified ResNet architecture for multi-event classification and a convolutional neural network for separating sound sources in multi-channel recordings. Results. The method achieves competitive results, comparable to the baselines of contemporary challenges such as DCASE, and demonstrates robust performance in noisy environments. Conclusions. The proposed method shows potential for improving the accuracy and reliability of audio event recognition in real-world scenarios, particularly in complex acoustic environments.