ANALYSIS OF THE TEXT PREPROCESSING METHODS INFLUENCE ON THE DESTRUCTIVE MESSAGES CLASSIFIER
References
(2020), Social Network Ranking, available at: https://www.statista.com/statistics/272014/global-social-networks-ranked-by-number-of-users/.
Dadvar, M., Trieschnigg, D., Ordelman, R. and de Jong, F. (2013), “Improving Cyberbullying Detection with User Context”, Serdyukov P. et al. (eds), Advances in Information Retrieval. ECIR 2013, Lecture Notes in Computer Science, vol. 7814, Springer, Berlin, Heidelberg.
Salminen, J., Almerekhi, H., Milenkovic, M., Jung, S., An, J., Kwak, H., & Jansen, B.J. (2017), “Anatomy of Online Hate: Developing a Taxonomy and Machine Learning Models for Identifying and Classifying Hate in Online News Media”, ICWSM.
Shtovba, S. D., Shtovba, O. V., Yakhymovych, O. V. and Petrychko, M. V. (2019), “Vplyv syntaksychnykh zviazkiv u rechenniakh na yakist identyfikatsii toksychnykh komentariv v sotsialnii merezhi” [The influence of syntactic relations in sentences on the quality of toxic comment identification in a social network], Informatsiini tekhnolohii ta kompiuterna tekhnika, VNTU, Vinnytsia, No. 4, DOI: https://doi.org/10.31649/2307-5376-2019-4-35-42.
Pavlopoulos, J., Sorensen, J., Dixon, L., Thain, N., & Androutsopoulos, I. (2020), “Toxicity Detection: Does Context Really Matter?”, arXiv preprint, arXiv:2006.00998.
Noever, D. (2018), “Machine learning suites for online toxicity detection”, arXiv preprint, arXiv:1810.01869.
van Aken, B., Risch, J., Krestel, R., & Löser, A. (2018), “Challenges for toxic comment classification: An in-depth error analysis”, arXiv preprint, arXiv:1809.07572.
Mohammad, F. (2018), “Is preprocessing of text really worth your time for toxic comment classification?”, Proceedings of the International Conference on Artificial Intelligence (ICAI), The Steering Committee of The World Congress in Computer Science, Computer Engineering and Applied Computing (WorldComp), pp. 447-453.
(2020), Toxic Comment Classification Challenge, available at: https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge/data.
(2020), Russian Language Toxic Comments. Small dataset with labeled comments from 2ch.hk and pikabu.ru, available at: https://www.kaggle.com/blackmoon/russian-language-toxic-comments.
(2020), Tackling Toxic Using Keras, available at: https://www.kaggle.com/sbongo/for-beginners-tackling-toxic-using-keras.
(2020), An Intuitive Understanding of Word Embeddings: From Count Vectors to Word2Vec, available at: https://www.analyticsvidhya.com/blog/2017/06/word-embeddings-count-word2veec/.