ADVANCED METHODS FOR CLASSIFICATION QUALITY ASSESSMENT LEVERAGING ROC ANALYSIS AND MULTIDIMENSIONAL CONFUSION MATRIX
Abstract
The object of the study is the process of classifying objects in scientific problems. The subject of the study is methods for assessing the effectiveness of multiclass classification. The goal of the study is to analyze the classification process and develop a classifier evaluation module that speeds up such evaluation and reduces the time needed to build complex machine learning classifiers. Methods used: methods for evaluating machine learning classifiers, methods for constructing ROC curves, and principles of parallel and distributed computing. Results obtained: an analytical review was conducted of the applications of the classification quality assessment module in the humanities, technical sciences, and economic sciences. Existing classification quality assessment metrics were considered, and mathematical descriptions of the metrics were formulated for the multiclass case. Software was developed that implements the proposed mathematical descriptions using parallel computation and optimization of identical operations. The developed module was tested for reliability. Conclusions: according to the results of the study, methods for effective classification quality assessment are proposed that reduce the time needed to assess the quality of multiclass classifiers by 40% compared with classical methods. The development of this module opens up prospects for further research on improving classification quality, which will contribute to the development of various spheres of human activity and increase the efficiency of solving data analysis tasks.
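As an illustration of what a multiclass metric derived from a multidimensional confusion matrix can look like, the following is a minimal sketch in Python. It is not the module described in the abstract. The function names (`confusion_matrix`, `per_class_metrics`, `macro_average`) and the one-vs-rest definitions of precision, recall, and F1 are assumptions chosen for the example. Row and column sums are computed once and reused for every class, which is one simple way to avoid repeating identical operations.

```python
from typing import Dict, List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """Build an n_classes x n_classes matrix: rows = true class, columns = predicted class."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_metrics(matrix: List[List[int]]) -> List[Dict[str, float]]:
    """One-vs-rest precision, recall, and F1 for every class (illustrative definitions)."""
    n = len(matrix)
    # Row and column sums are shared by all classes, so compute them once.
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    metrics = []
    for k in range(n):
        tp = matrix[k][k]
        fn = row_sums[k] - tp  # true class k, predicted as another class
        fp = col_sums[k] - tp  # predicted as k, true class is another class
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return metrics


def macro_average(metrics: List[Dict[str, float]], key: str) -> float:
    """Unweighted mean of one metric over all classes."""
    return sum(m[key] for m in metrics) / len(metrics)
```

For example, with true labels `[0, 0, 1, 1, 2, 2]` and predictions `[0, 1, 1, 1, 2, 0]`, calling `per_class_metrics(confusion_matrix(y_true, y_pred, 3))` gives one dictionary per class, and `macro_average(..., "recall")` averages the per-class recall values.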