AUDIO EVENT ANALYSIS METHOD IN NETWORK-BASED AUDIO ANALYTICS SYSTEMS

Anton Poroshenko
Andriy Kovalenko

Abstract

Relevance. In the rapidly evolving field of network-based audio analytics systems, the detection and analysis of audio events play a crucial role across various applications, including security, healthcare, and entertainment.

Subject. This paper examines a method for recognizing audio events in network-based audio analytics systems, including preprocessing, sound separation, and the creation of machine learning models for analyzing audio signals.

Objective. The objective is to develop and improve integrated methods for analyzing audio signals in network-based audio analytics systems to enhance the accuracy, speed, and reliability of data analysis.

Methods. The proposed approach uses a modified ResNet architecture for multi-event classification and a convolutional neural network for separating sound sources in multi-channel recordings.

Results. The method achieves competitive results, comparable to the baselines of contemporary challenges such as DCASE, and demonstrates robust performance in noisy environments.

Conclusions. The proposed method shows potential for improving the accuracy and reliability of audio event recognition in real-world scenarios, particularly in complex acoustic environments.
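
As a rough illustration of the pipeline outlined in the Methods (log-mel preprocessing followed by a ResNet-based multi-event classifier), the Python sketch below shows one possible realization in PyTorch. The number of event classes, the mel-spectrogram parameters, and the choice of a ResNet-18 backbone are illustrative assumptions for the sketch, not the configuration reported by the authors.

import torch
import torch.nn as nn
import torchaudio
from torchvision.models import resnet18

NUM_EVENT_CLASSES = 10   # assumed number of target audio event classes
SAMPLE_RATE = 16_000     # assumed input sampling rate

# Preprocessing: waveform -> log-mel spectrogram treated as a one-channel image.
mel = torchaudio.transforms.MelSpectrogram(
    sample_rate=SAMPLE_RATE, n_fft=1024, hop_length=512, n_mels=64)
to_db = torchaudio.transforms.AmplitudeToDB()

def preprocess(waveform: torch.Tensor) -> torch.Tensor:
    # (batch, samples) -> (batch, 1, n_mels, frames)
    return to_db(mel(waveform)).unsqueeze(1)

# Modified ResNet: single-channel input stem and a multi-label output head.
model = resnet18(weights=None)
model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
model.fc = nn.Linear(model.fc.in_features, NUM_EVENT_CLASSES)

criterion = nn.BCEWithLogitsLoss()   # one independent sigmoid per event class

# Toy forward/backward pass on random data to show the shapes involved.
waveform = torch.randn(4, SAMPLE_RATE * 2)                     # 4 clips, 2 s each
targets = torch.randint(0, 2, (4, NUM_EVENT_CLASSES)).float()  # multi-hot labels
logits = model(preprocess(waveform))
loss = criterion(logits, targets)
loss.backward()
print(logits.shape, loss.item())

The sound-separation front end for multi-channel recordings mentioned in the Methods would precede this classifier; it is omitted here to keep the sketch self-contained.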

Article Details

How to Cite
Poroshenko, A., & Kovalenko, A. (2024). AUDIO EVENT ANALYSIS METHOD IN NETWORK-BASED AUDIO ANALYTICS SYSTEMS. Advanced Information Systems, 8(4), 60–64. https://doi.org/10.20998/2522-9052.2024.4.08
Section
Information systems research
Author Biographies

Anton Poroshenko, Kharkiv National University of Radio Electronics, Kharkiv

Ph.D. student at the Department of Electronic Computers

Andriy Kovalenko, Kharkiv National University of Radio Electronics, Kharkiv

Doctor of Technical Sciences, Professor, Head of the Department of Electronic Computers

References

Mesaros, A., Heittola, T., Virtanen, T. and Plumbley, M. D. (2021), “Sound Event Detection: A tutorial”, IEEE Signal Processing Magazine, vol. 38, no. 5, pp. 67–83, doi: https://doi.org/10.1109/MSP.2021.3090678

Xin, Y., Yang, D. and Zou, Y. (2023), “Background-aware Modeling for Weakly Supervised Sound Event Detection”, Proc. INTERSPEECH 2023, pp. 1199–1203, doi: https://doi.org/10.21437/Interspeech.2023-330

Barkovska, O. and Havrashenko, A. (2023), “Analysis of the influence of selected audio pre-processing stages on accuracy of speaker language recognition”, Innovative technologies and scientific solutions for industries, vol. 4 (26), pp. 16–23, doi: https://doi.org/10.30837/ITSSI.2023.26.016

Poroshenko, A. (2022), “Mathematical model of the passage of audio signals in network-based audio analytics systems”, Advanced Information Systems, vol. 6, no. 4, pp. 25–29, doi: https://doi.org/10.20998/2522-9052.2022.4.04

Poroshenko, A. and Kovalenko, A. (2023), “Audio signal transmission method in network-based audio analytics system”, Innovative technologies and scientific solutions for industries, vol. 4 (26), pp. 58–67, doi: https://doi.org/10.30837/ITSSI.2023.26.058

Sharifani, K. and Amini, M. (2023), “Machine Learning and Deep Learning: A Review of Methods and Applications”, World Information Technology and Engineering Journal, vol. 10, is. 07, pp. 3897–3904, available at: https://ssrn.com/abstract=4458723

Grumiaux, P.-A., Kitić, S., Girin, L. and Guérin, A. (2022), “A survey of sound source localization with deep learning methods”, J. Acoust. Soc. Am., vol. 152 (1), pp. 107–151, doi: https://doi.org/10.1121/10.0011809

Zaman, K., Sah, M., Direkoglu, C. and Unoki, M. (2023), “A Survey of Audio Classification Using Deep Learning”, IEEE Access, vol. 11, pp. 106620–106649, doi: https://doi.org/10.1109/ACCESS.2023.3318015

Fonseca, E., Favory, X., Pons, J., Font, F. and Serra, X. (2021), “FSD50K: An Open Dataset of Human-Labeled Sound Events”, IEEE/ACM Trans. on Audio, Speech, and Language Proc., vol. 30, pp. 829–852, doi: https://doi.org/10.1109/TASLP.2021.3133208

Piczak, K. J. (2015), “ESC: Dataset for Environmental Sound Classification”, Proceedings of the 23rd ACM International Conference on Multimedia, MM'15, Association for Computing Machinery, New York, NY, USA, pp. 1015–1018, doi: https://doi.org/10.1145/2733373.2806390

Foster, P., Sigtia, S., Krstulovic, S., Barker, J. and Plumbley, M. D. (2015), “Chime-home: A dataset for sound source recognition in a domestic environment”, 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA, New Paltz, NY, USA, 2015, pp. 1–5, doi: https://doi.org/10.1109/WASPAA.2015.7336899

Kavalerov, I., Wisdom, S., Erdogan, H., Patton, B., Wilson, K., Le Roux, J. and Hershey, J. R. (2019), “Universal Sound Separation”, 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA, New Paltz, NY, USA, pp. 175–179, doi: https://doi.org/10.1109/WASPAA.2019.8937253

Tzinis, E., Wisdom, S., Hershey, J. R., Jansen, A. and Ellis, D. P. W. (2020), “Improving Universal Sound Separation Using Sound Classification”, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP, Barcelona, Spain, pp. 96–100, doi: https://doi.org/10.1109/ICASSP40776.2020.9053921

(2023), “Sound Event Localization and Detection Evaluated in Real Spatial Sound Scenes”, DCASE Challenge 2023, available at: https://dcase.community/challenge2023/task-sound-event-localization-and-detection-evaluated-in-real-spatial-sound-scenes

Kovalenko, A. and Poroshenko, A. (2022), “Analysis of the sound event detection methods and systems”, Advanced Information Systems, vol. 6, no. 1, pp. 65–69, doi: https://doi.org/10.20998/2522-9052.2022.1.11