CONTEXT-ADAPTIVE METHOD FOR OBJECT DETECTION IN VIDEO STREAMS
Abstract
This work develops a context-adaptive method for object detection in video streams that responds dynamically to environmental conditions. The topic is relevant because assistive systems for visually impaired people, along with other real-world applications, must remain reliable under variable weather and lighting conditions that significantly reduce detection accuracy. The subject of the article is the multimodal fusion of acoustic, video, and LiDAR data for object recognition tasks. The goal is to propose and experimentally validate a method in which preprocessing is activated adaptively, triggered by acoustic artifact classification. The tasks of the work are to analyze state-of-the-art preprocessing approaches (deraining, defogging, low-light enhancement), select appropriate acoustic classification models (e.g., PANNs, YAMNet), integrate LiDAR for spatial complementarity, and evaluate the impact of different preprocessing chains on detection metrics. The methods applied include comparative analysis, experimental benchmarking of YOLO and DETR models, acoustic signal classification, and multimodal data fusion. The results confirm an increase in accuracy (mAP, Precision, Recall, IoU) and in detection stability under adverse conditions when adaptive preprocessing pipelines are used, with the YOLOv9m and YOLOv10m models showing the most balanced performance. Further research will focus on full LiDAR integration, optimizing computational efficiency for mobile and embedded platforms, and scaling the approach to broader classes of environmental challenges such as fog, snow, and urban noise.
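The adaptive activation idea described in the abstract (an acoustic scene label selecting which preprocessing chain runs before detection) can be sketched as a simple dispatch table. This is a minimal illustration, not the paper's actual implementation; all labels, chain names, and placeholder functions are assumptions introduced for the example.

```python
# Sketch: acoustic-label-driven selection of a preprocessing chain,
# followed by detection. Label set and chain contents are illustrative.

PREPROCESSING_CHAINS = {
    "rain": ["derain"],
    "fog": ["defog"],
    "night": ["low_light_enhance"],
    "clear": [],  # favorable conditions: skip preprocessing entirely
}

def select_chain(acoustic_label: str) -> list:
    """Map an acoustic classifier output to a preprocessing chain.

    Unknown labels fall back to no preprocessing, so a misclassified
    acoustic scene degrades gracefully to the baseline detector.
    """
    return PREPROCESSING_CHAINS.get(acoustic_label, [])

def run_pipeline(frame, acoustic_label: str, detector):
    """Apply the selected preprocessing steps in order, then detect."""
    # Placeholder identity transforms stand in for real derain/defog/
    # enhancement models (e.g., the networks surveyed in the references).
    steps = {
        "derain": lambda f: f,
        "defog": lambda f: f,
        "low_light_enhance": lambda f: f,
    }
    for name in select_chain(acoustic_label):
        frame = steps[name](frame)
    return detector(frame)
```

In a real system the placeholder lambdas would be replaced by trained enhancement networks and the detector by a YOLO or DETR model; the dispatch structure, however, is what makes the pipeline context-adaptive.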
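Among the metrics listed above, IoU (Intersection over Union) is the one most specific to detection. A minimal sketch of the standard box-IoU computation, assuming axis-aligned boxes in `(x1, y1, x2, y2)` corner format (the representation is an assumption for illustration):

```python
def iou(box_a, box_b):
    """Intersection over Union for axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A predicted box is typically counted as a true positive when its IoU with a ground-truth box exceeds a threshold (0.5 is the common default), which is how IoU feeds into the Precision, Recall, and mAP figures reported in the experiments.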
References
Yao, Y., Shi, Z., Hu, H., Li, J., Wang, G. and Liu, L. (2023), “GSDerainNet: A Deep Network Architecture Based on a Gaussian Shannon Filter for Single Image Deraining”, Remote Sensing, vol. 15, doi: https://doi.org/10.3390/rs15194825
Pourali, A., Boukani, A. and Khazaei, H. (2025), “PreNeT: Leveraging Computational Features to Predict Deep Neural Network Training Time”, Proceedings of the 16th ACM/SPEC International Conference on Performance Engineering, pp. 81–91, doi: https://doi.org/10.1145/3676151.3719373
Yu, Y., Yang, W., Tan, Y.-P. and Kot, A. C. (2022), “Towards Robust Rain Removal Against Adversarial Attacks: A Comprehensive Benchmark Analysis and Beyond”, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, pp. 6003–6012, doi: https://doi.org/10.1109/CVPR52688.2022.00592
Guo, Q., Sun, J., Juefei-Xu, F., Ma, L., Xie, X., Feng, W., Liu, Y. and Zhao, J. (2021), “EfficientDeRain: Learning Pixel-Wise Dilation Filtering for High-Efficiency Single-Image Deraining”, Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 2, pp. 1487–1495, doi: https://doi.org/10.1609/aaai.v35i2.16239
Mu, W., Liu, H., Chen, W. and Wang, Y. (2022), “A More Effective Zero-DCE Variant: Zero-DCE Tiny”, Electronics, vol. 11, no. 17, doi: https://doi.org/10.3390/electronics11172750
Zhang, S., Zhao, S., An, D., Li, D. and Zhao, R. (2024), “LiteEnhanceNet: A Lightweight Network for Real-Time Single Underwater Image Enhancement”, Expert Systems with Applications, vol. 240, article number 122546, doi: https://doi.org/10.1016/j.eswa.2023.122546
Jiang, Y., Gong, X., Liu, D., Cheng, Y., Fang, C. and Shen, X. (2021), “EnlightenGAN: Deep Light Enhancement Without Paired Supervision”, IEEE Transactions on Image Processing, vol. 30, pp. 2340–2349, doi: https://doi.org/10.1109/TIP.2021.3051462
Ullah, H., Muhammad, K., Irfan, M., Anwar, S., Sajjad, M. and Imran, A. S. (2021), “Light-DehazeNet: A Novel Lightweight CNN Architecture for Single Image Dehazing”, IEEE Transactions on Image Processing, vol. 30, pp. 8968–8982, doi: https://doi.org/10.1109/TIP.2021.3116790
Zhang, L., Zhao, J., Lang, Z. and Fang, L. (2024), “Vehicle detection algorithm for foggy based on improved AOD-Net”, Transactions of the Institute of Measurement and Control, vol. 46, issue 14, pp. 2696–2705, doi: https://doi.org/10.1177/01423312241248490
Guo, Z., Zhang, X. and Yu, S. (2024), “Image Defogging Based on Improved AOD-Net Network”, Image Processing, Electronics and Computers, IOS Press, pp. 211–222, doi: https://doi.org/10.3233/ATDE240472
Liu, X., Shi, Z., Wu, Z., Chen, J. and Zhai, G. (2023), “GridDehazeNet+: An Enhanced Multi-Scale Network With Intra-Task Knowledge Transfer for Single Image Dehazing”, IEEE Transactions on Intelligent Transportation Systems, vol. 24, no. 1, pp. 870–884, doi: https://doi.org/10.1109/TITS.2022.3210455
Kholiev, V. and Barkovska, O. (2023), “Comparative analysis of neural network models for the problem of speaker recognition”, Innovative Technologies and Scientific Solutions for Industries, vol. 2 (24), pp. 172–178, doi: https://doi.org/10.30837/ITSSI.2023.24.172
Barkovska, O., Holovchenko, O., Storchai, D., Kostin, A. and Lehezin, N. (2025), “Investigation of computer vision techniques for indoor navigation systems”, Innovative Technologies and Scientific Solutions for Industries, vol. 2 (32), pp. 5–15, doi: https://doi.org/10.30837/2522-9818.2025.2.005
Tsalera, E., Papadakis, A. and Samarakou, M. (2021), “Comparison of Pre-Trained CNNs for Audio Classification Using Transfer Learning”, Journal of Sensor and Actuator Networks, vol. 10, no. 4, doi: https://doi.org/10.3390/jsan10040072
Singh, A., Liu, H. and Plumbley, M. D. (2023), “E-PANNs: Sound Recognition Using Efficient Pre-Trained Audio Neural Networks”, INTER-NOISE and NOISE-CON Congress and Conference Proceedings, vol. 268, no. 1, pp. 7220–7228, doi: https://doi.org/10.3397/IN_2023_1083
Valliappan, N. H., Pande, S. D. and Vinta, S. R. (2024), “Enhancing Gun Detection with Transfer Learning and YAMNet Audio Classification”, IEEE Access, vol. 12, pp. 58940–58949, doi: https://doi.org/10.1109/ACCESS.2024.3392649
Turskis, T., Teleiša, M., Buckiūnaitė, R. and Čalnerytė, D. (2023), “Mixed-type data augmentations for environmental sound classification”, IVUS 2023: Information society and university studies 2023, CEUR workshop proc. of the 28th int. conf. on information society and university studies (IVUS 2023), Kaunas, Lithuania, May 12, 2023, CEUR-WS, 3575, pp. 184–194, available at: https://ceur-ws.org/Vol-3575/Paper20.pdf
Barkovska, O. and Serdechnyi, V. (2024), “Intelligent Assistance System for People with Visual Impairments”, Innovative Technologies and Scientific Solutions for Industries, vol. 2 (28), pp. 6–16, doi: https://doi.org/10.30837/2522-9818.2024.28.006