Keywords: | SMS, Spam Detection, Machine Learning, Natural Language Processing, NB, SVM, KNN, RF, AdaBoost |
Abstract: | Mobile phones have become deeply integrated into modern life. Short Message
Service (SMS), a prevalent and cost-effective mode of telecommunication, is currently
among the most widely used means of communication. However, this
ease of use has also led to the growth of SMS spam, which seriously jeopardizes the
dependability and integrity of mobile communication. To address this issue, we propose a
machine learning-based solution for distinguishing genuine "ham"
messages from malicious "spam" messages in SMS communication. The
approach uses the SMS Spam Collection dataset and five machine learning classifiers,
Multinomial Naive Bayes (M-NB), Support Vector Machine (SVM), K-Nearest Neighbors (KNN),
Random Forest (RF), and AdaBoost (AB), to categorize short messages as ham or spam.
The approach performed strongly at detecting spam in
mobile messaging. Careful data preprocessing and feature engineering
were instrumental in building a robust and accurate spam detection model.
Thoroughly cleaning and transforming the SMS data through stop-word removal,
punctuation removal, text normalization, and feature selection was crucial
for preparing the dataset for the machine learning
algorithms. These steps were essential for
overcoming the unique challenges of SMS data and producing a detector
that can recognize unsolicited SMS messages on mobile devices. In
our evaluation of the proposed models,
the SVM model emerged as the top performer in the ML-based spam detection system,
with 98.3% accuracy, 100% precision, 96% recall, and a 91% F1-score. |
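
The cleaning steps named in the abstract (text normalization, punctuation removal, stop-word removal) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `preprocess` and the tiny `STOP_WORDS` set are hypothetical, chosen only for illustration, and lowercasing is assumed as the normalization step.

```python
import string

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch);
# a real system would use a fuller list.
STOP_WORDS = {"a", "an", "the", "to", "is", "you", "your", "for", "of", "and", "in", "on"}

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(message):
    """Lowercase one SMS message, strip punctuation, and drop stop words."""
    lowered = message.lower()                   # text normalization (assumed: lowercasing)
    no_punct = lowered.translate(_PUNCT_TABLE)  # punctuation removal
    # Split on whitespace and keep only tokens that are not stop words.
    return [tok for tok in no_punct.split() if tok not in STOP_WORDS]


tokens = preprocess("WINNER!! You have won a FREE prize. Call now to claim.")
print(tokens)
```

The resulting token lists would then be turned into numeric features (for example, word counts) before being passed to a classifier; that step is omitted here because the abstract does not specify the feature representation.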