A COMPARISON OF CLASSIFIERS APPLIED TO THE PROBLEM OF BIOPSY IMAGES ANALYSIS

Daria Hlavcheva
https://orcid.org/0000-0001-6990-6845
Vladyslav Yaloveha
https://orcid.org/0000-0001-7109-9405
Andrii Podorozhniak
https://orcid.org/0000-0002-6688-8407
Nataliia Lukova-Chuiko
https://orcid.org/0000-0003-3224-4061

Abstract

The purpose of the research is to compare classification algorithms for histopathological image analysis and to optimize their parameters to improve classification accuracy. The article addresses the following tasks: preprocessing the BreCaHAD dataset images; implementing and training a CNN; applying the K-nearest neighbours, SVM, Random Forest, XGBoost, and perceptron algorithms to classify the features extracted by the CNN; and comparing the results. The object of the research is the process of classifying tumor cells in microscopic biopsy images. The subject of the research is the use of ML algorithms to classify the features extracted by a CNN from an input biopsy image. The scientific novelty of the research is a comparative analysis of classifiers on the task of classifying images of “tumor” and “healthy” cells from the processed BreCaHAD dataset. Among the chosen classifiers, SVM reached the highest accuracy on the test data, 0.972, and is the only algorithm that outperforms the perceptron, which reaches a classification accuracy of 0.966. The K-nearest neighbours, Random Forest, and XGBoost algorithms achieved lower accuracy. The hyperparameters of the algorithms were optimized, and the results were compared with related work. The following research methods are used: the theory of deep learning, mathematical statistics, and parameter optimization.
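The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn is available and uses synthetic vectors from `make_classification` as a stand-in for the CNN-extracted features. The parameter grids, the `compare` helper, and the use of `MLPClassifier` as the perceptron are all hypothetical choices. XGBoost is omitted because it requires a separate package.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: synthetic two-class vectors stand in for features that a trained
# CNN would extract from "tumor" and "healthy" cell images.
X, y = make_classification(
    n_samples=600, n_features=32, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical estimators and parameter grids; the article's actual grids
# are not given in the abstract.
candidates = {
    "knn": (
        make_pipeline(StandardScaler(), KNeighborsClassifier()),
        {"kneighborsclassifier__n_neighbors": [3, 5, 9]},
    ),
    "svm": (
        make_pipeline(StandardScaler(), SVC()),
        {"svc__C": [0.1, 1, 10], "svc__kernel": ["linear", "rbf"]},
    ),
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [None, 8]},
    ),
    "perceptron": (
        make_pipeline(
            StandardScaler(), MLPClassifier(max_iter=500, random_state=0)
        ),
        {"mlpclassifier__hidden_layer_sizes": [(32,), (64,)]},
    ),
}


def compare(candidates, X_train, y_train, X_test, y_test):
    """Grid-search each classifier on the training split, then score on the test split."""
    results = {}
    for name, (estimator, grid) in candidates.items():
        search = GridSearchCV(estimator, grid, cv=3, scoring="accuracy")
        search.fit(X_train, y_train)
        results[name] = {
            "best_params": search.best_params_,
            "test_accuracy": search.score(X_test, y_test),
        }
    return results


results = compare(candidates, X_train, y_train, X_test, y_test)
for name, info in sorted(results.items(), key=lambda kv: -kv[1]["test_accuracy"]):
    print(f"{name}: {info['test_accuracy']:.3f} with {info['best_params']}")
```

Hyperparameters are selected by cross-validation on the training split only, so the held-out test accuracy is not used for tuning. The accuracies printed for the synthetic data say nothing about the values reported in the article.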

Article Details

How to Cite
Hlavcheva, D., Yaloveha, V., Podorozhniak, A., & Lukova-Chuiko, N. (2020). A COMPARISON OF CLASSIFIERS APPLIED TO THE PROBLEM OF BIOPSY IMAGES ANALYSIS. Advanced Information Systems, 4(2), 12–16. https://doi.org/10.20998/2522-9052.2020.2.03
Section
Identification problems in information systems
Author Biographies

Daria Hlavcheva, National Technical University "Kharkiv Polytechnic Institute", Kharkiv

Student of the Computer Science and Programming Department

Vladyslav Yaloveha, National Technical University "Kharkiv Polytechnic Institute", Kharkiv

Assistant Lecturer of Computer Science and Programming Department

Andrii Podorozhniak, National Technical University "Kharkiv Polytechnic Institute", Kharkiv

Candidate of Technical Sciences, Associate Professor, Associate Professor of Computer Science and Programming Department

Nataliia Lukova-Chuiko, Taras Shevchenko National University of Kyiv, Kyiv

Doctor of Technical Sciences, Associate Professor, Associate Professor of Cyber Security and Information Protection Department
