MEDOIDS AS A PACKING OF ORB IMAGE DESCRIPTORS
Abstract
The aim of the research. The paper investigates whether matching medoids computed from a set of ORB descriptors can replace matching the full set of binary descriptors in the image classification problem.
Research results. Several methods were proposed: direct brute-force matching of medoids, grouping of medoids for separate classes, and grouping of descriptors followed by calculation of medoids among them. Numerical experiments were performed for all these methods to compare classification accuracy and inference time. Using medoids makes it possible to redistribute processing time so that more calculations are performed during preprocessing rather than during classification. According to modelling performed on the Leeds Butterfly dataset, matching images based on medoids can reach the same accuracy as matching descriptors (0.69–0.88 for different numbers of features). Medoids require additional calculation time at the preprocessing stage, but classification becomes faster: in our experiments, classification was about 9–10 times faster and preprocessing took about 9–10 times longer for models with comparable accuracy. Finally, the efficiency of the proposed ideas was compared with a CNN trained and evaluated on the same data. As expected, the CNN required much more preprocessing (training) time, but it provided the best classification accuracy and inference time.
Conclusion. Medoid matching can reach the same accuracy as direct descriptor matching, while redistributing the overall modelling time: preprocessing takes longer and inference becomes faster.
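To illustrate the idea of packing binary descriptors into medoids, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the function names (`hamming_matrix`, `medoid`, `classify`) and the decision rule (mean distance from each query descriptor to the nearest class medoid) are assumptions made for illustration, and the toy data uses 1-byte descriptors instead of real ORB output.

```python
import numpy as np


def hamming_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise Hamming distances between rows of two packed-bit uint8 arrays."""
    xor = a[:, None, :] ^ b[None, :, :]  # shape: (len(a), len(b), n_bytes)
    return np.unpackbits(xor, axis=2).sum(axis=2)


def medoid(descriptors: np.ndarray) -> np.ndarray:
    """Return the descriptor with the smallest total distance to all others."""
    totals = hamming_matrix(descriptors, descriptors).sum(axis=1)
    return descriptors[int(totals.argmin())]


def classify(query: np.ndarray, class_medoids: dict) -> str:
    """Pick the class whose medoids are, on average, closest to the query.

    Hypothetical decision rule (an assumption, not taken from the paper):
    for each class, average over the query descriptors the distance to the
    nearest medoid of that class, then choose the smallest average.
    """
    scores = {
        label: hamming_matrix(query, medoids).min(axis=1).mean()
        for label, medoids in class_medoids.items()
    }
    return min(scores, key=scores.get)


# Toy 1-byte "descriptors"; ORB descriptors in OpenCV are 32 bytes (256 bits).
toy = np.array([[0b00000000], [0b00000001], [0b11111111]], dtype=np.uint8)
toy_medoid = medoid(toy)

# One medoid per class here; in practice each class could keep several.
class_medoids = {
    "a": np.array([[0b00000000]], dtype=np.uint8),
    "b": np.array([[0b11111111]], dtype=np.uint8),
}
query = np.array([[0b00000001], [0b00000011]], dtype=np.uint8)
predicted = classify(query, class_medoids)
```

In this sketch the medoids are computed once, ahead of time, and the query is compared only against them. The distance matrix built at classification time is therefore much smaller than one built against every stored descriptor, which is consistent with the shift of work from classification to preprocessing described in the abstract.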