THE ENSEMBLE METHOD DEVELOPMENT OF CLASSIFICATION OF THE COMPUTER SYSTEM STATE BASED ON DECISION TREES

Svitlana Gavrylenko
http://orcid.org/0000-0002-6919-0055
Illia Sheverdin
http://orcid.org/0000-0002-7881-0658
Michael Kazarinov
http://orcid.org/0000-0002-0790-5262

Abstract

The subject of this article is the study of methods for identifying the state of a computer system. The purpose of the article is to develop a method for classifying the anomalous state of a computer system based on ensemble methods. Task: to investigate the use of decision tree algorithms (REPTree, Random Tree, J48, HoeffdingTree, DecisionStump) and of bagging and boosting decision tree ensembles for identifying the anomalous state of a computer system by analyzing operating system events. The methods used are artificial intelligence, machine learning, and ensemble classification methods. The following results were obtained: ensemble methods (bagging and boosting) and the classifiers REPTree, Random Tree, J48, HoeffdingTree, and DecisionStump were investigated for identifying the anomalous state of a computer system. Sets of individual classifiers and classifier ensembles were developed. Each algorithm was trained and cross-validated, and the performance of the resulting classifiers was evaluated. Based on this research, an ensemble method for classifying the state of a computer system built on the J48 decision tree algorithm is proposed. Conclusions. The scientific novelty of the obtained results lies in the creation of an ensemble method for classifying the state of a computer system based on decision trees, which makes it possible to increase the reliability and speed of classification.
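As an illustration of the bagging idea named in the abstract, the following is a minimal sketch only, not the authors' method: it bags one-level decision trees (in the spirit of DecisionStump) with a majority vote, using only the Python standard library. The two "event rate" features, the synthetic data generator, and all function names are assumptions introduced for this example; the article itself works with Weka classifiers and real operating system events.

```python
import random
from collections import Counter


def train_stump(rows, labels):
    """Fit a one-level decision tree: choose the (feature, threshold) split with the fewest errors."""
    fallback = Counter(labels).most_common(1)[0][0]  # used when one side of a split is empty
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            left_label = Counter(left).most_common(1)[0][0] if left else fallback
            right_label = Counter(right).most_common(1)[0][0] if right else fallback
            errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    return best[1:]  # (feature index, threshold, left label, right label)


def predict_stump(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label


def train_bagging(rows, labels, n_estimators=25, seed=0):
    """Bagging: train each stump on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(rows)
    ensemble = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return ensemble


def predict_bagging(ensemble, row):
    """Combine the stumps by majority vote."""
    votes = Counter(predict_stump(s, row) for s in ensemble)
    return votes.most_common(1)[0][0]


def make_synthetic_events(n, seed):
    """Hypothetical data: two event-rate features; 'anomalous' rows have higher rates."""
    rng = random.Random(seed)
    rows, labels = [], []
    for i in range(n):
        anomalous = i % 2 == 1
        shift = 3.0 if anomalous else 0.0
        rows.append([rng.gauss(5.0 + shift, 1.0), rng.gauss(2.0 + shift, 1.0)])
        labels.append("anomalous" if anomalous else "normal")
    return rows, labels


train_rows, train_labels = make_synthetic_events(200, seed=1)
test_rows, test_labels = make_synthetic_events(100, seed=2)
ensemble = train_bagging(train_rows, train_labels)
predictions = [predict_bagging(ensemble, r) for r in test_rows]
accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
print(f"bagged stumps accuracy: {accuracy:.2f}")
```

An odd number of estimators is used so that a two-class vote cannot tie. Replacing the stump with a deeper tree learner would bring the sketch closer to a bagged J48-style ensemble, at the cost of more code.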

Article Details

Section
Identification problems in information systems
Author Biographies

Svitlana Gavrylenko, National Technical University «Kharkiv Polytechnic Institute», Kharkiv

Doctor of Technical Sciences, Associate Professor, Professor of the Department of Computer Engineering and Programming

Illia Sheverdin, National Technical University «Kharkiv Polytechnic Institute», Kharkiv

PhD Student, Department of Computer Engineering and Programming

Michael Kazarinov, Northeastern Illinois University, Chicago, IL

Teacher, Computer Science Department
