DETECTION OF SPAM MESSAGES USING MACHINE LEARNING ALGORITHMS

Authors

  • A.B. Aben, Akhmet Yassawi International Kazakh-Turkish University, Turkistan, Kazakhstan
  • N.M. Zhunissov, Akhmet Yassawi International Kazakh-Turkish University, Turkistan, Kazakhstan
  • Zh.B. Myrzatayev, Akhmet Yassawi International Kazakh-Turkish University, Turkistan, Kazakhstan

DOI:

https://doi.org/10.55956/NXZX2857

Keywords:

machine learning, spam filtering, text classification, Naive Bayes, logistic regression, SVM model, data preprocessing, gradient boosting

Abstract

This study analyzes methods for detecting spam through the automatic classification of text messages. The aim was to evaluate the effectiveness of machine learning models and to explore their applicability in spam filtering systems. Preprocessing consisted of converting messages to lowercase, tokenization, removing special characters and punctuation, removing common words (stop words), and stemming. These steps helped isolate the informative content of each message and enhance the quality of classification. Six models were tested: Naive Bayes, Logistic Regression, SVM, Decision Tree, Random Forest, and Gradient Boosting. The SVM model achieved the highest scores on all metrics, while the Random Forest and Logistic Regression models also performed well. These models attained over 95% in accuracy, precision, recall, and F1-score. The findings demonstrate the effectiveness of machine learning techniques for classifying textual data and indicate their potential for use in real-world systems, such as filtering messages received via email or SMS. Future work will optimize hyperparameters and apply more advanced text-processing methods to improve model performance.
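The preprocessing steps and evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The stop-word list, the suffix-stripping `naive_stem` helper (a stand-in for a real stemmer such as Porter's), and the function names `preprocess` and `binary_metrics` are all assumptions made for illustration.

```python
import re

# Assumption: a tiny illustrative stop-word list; the study does not state which list was used.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "for", "you", "your"}

# Assumption: crude suffix stripping stands in for a real stemming algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(message):
    """Lowercase, drop punctuation and special characters, tokenize, remove stop words, stem."""
    text = message.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # keep only letters, digits, whitespace
    tokens = text.split()                      # whitespace tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return [naive_stem(t) for t in tokens]


def binary_metrics(y_true, y_pred, positive="spam"):
    """Accuracy, precision, recall, and F1-score with `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    print(preprocess("Claim your FREE prizes now!!!"))
    print(binary_metrics(["spam", "ham", "spam", "ham"], ["spam", "spam", "ham", "ham"]))
```

In practice the token lists would be converted to numeric features (for example, word counts or TF-IDF weights) before being passed to a classifier such as SVM; that step is omitted here to keep the sketch free of external dependencies.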

Published online

2024-12-30

Section

Information and communication technologies