A COMPARISON OF CLASSIFIERS APPLIED TO THE PROBLEM OF BIOPSY IMAGES ANALYSIS
Abstract
The purpose of the research is to compare classification algorithms for histopathological image analysis and to optimize their parameters to improve classification accuracy. The article addresses the following tasks: preprocessing the BreCaHAD dataset images; implementing and training a CNN; applying the K-nearest neighbours, SVM, Random Forest, XGBoost, and perceptron algorithms to classify the features extracted by the CNN; and comparing the results. The object of the research is the process of classifying tumor cells in microscopic biopsy images. The subject of the research is the use of ML algorithms to classify features extracted by a CNN from an input biopsy image. The scientific novelty of the research is a comparative analysis of classifiers on the task of classifying “tumor” and “healthy” cell images from the processed BreCaHAD dataset. Among the chosen classifiers, SVM reached the highest accuracy on the test data, 0.972, and was the only algorithm to outperform the perceptron, which reached 0.966. The K-nearest neighbours, Random Forest, and XGBoost algorithms achieved lower accuracy. The hyperparameters of the algorithms were optimized, and the results were compared with related works. The following research methods are used: deep learning theory, mathematical statistics, and parameter optimization.
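The comparison described in the abstract can be illustrated with a minimal sketch, under several loud assumptions. The CNN feature extractor is not reproduced here: synthetic vectors from scikit-learn's `make_classification` stand in for the CNN features, and the function name `compare_classifiers`, the parameter grids, and the split sizes are hypothetical choices, not the authors' settings. XGBoost is left out to keep the example dependent on scikit-learn only, and `MLPClassifier` is used as a stand-in for the perceptron.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_classifiers(features, labels, seed=0):
    """Grid-search each classifier on a training split and
    return its accuracy on a held-out test split."""
    X_train, X_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.25, stratify=labels, random_state=seed
    )
    # Hypothetical, deliberately small grids; keys follow the step names
    # that make_pipeline derives from the lowercase class names.
    candidates = {
        "knn": (KNeighborsClassifier(),
                {"kneighborsclassifier__n_neighbors": [3, 5, 7]}),
        "svm": (SVC(),
                {"svc__C": [0.1, 1.0, 10.0], "svc__kernel": ["linear", "rbf"]}),
        "random_forest": (RandomForestClassifier(random_state=seed),
                          {"randomforestclassifier__n_estimators": [50, 100]}),
        "perceptron": (MLPClassifier(max_iter=500, random_state=seed),
                       {"mlpclassifier__hidden_layer_sizes": [(16,), (32,)]}),
    }
    results = {}
    for name, (estimator, grid) in candidates.items():
        # Scale features first; KNN, SVM, and the MLP are scale-sensitive.
        pipeline = make_pipeline(StandardScaler(), estimator)
        search = GridSearchCV(pipeline, grid, cv=3, scoring="accuracy")
        search.fit(X_train, y_train)
        # score() evaluates the refitted best estimator on the test split.
        results[name] = search.score(X_test, y_test)
    return results


# Synthetic stand-in for CNN-extracted features (an assumption, not BreCaHAD).
features, labels = make_classification(
    n_samples=300, n_features=32, n_informative=8,
    class_sep=1.5, random_state=0,
)
results = compare_classifiers(features, labels)
for name, accuracy in sorted(results.items(), key=lambda item: -item[1]):
    print(f"{name}: {accuracy:.3f}")
```

With real data, `features` would be the activations taken from the trained CNN and `labels` the “tumor”/“healthy” annotations. The accuracies printed for the synthetic data say nothing about the values reported in the abstract.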