SHORT MESSAGE SERVICE SPAM DETECTION USING MACHINE LEARNING

DINO, FETIYA

St. Mary's University Institutional Repository

Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/8202

Title:	SHORT MESSAGE SERVICE SPAM DETECTION USING MACHINE LEARNING
Authors:	DINO, FETIYA
Keywords:	SMS, Spam Detection, Machine Learning, Natural Language Processing, NB, SVM, KNN, RF, AdaBoost
Issue Date:	Jul-2024
Publisher:	St. Mary’s University
Abstract:	Mobile phones have become deeply integrated into modern life. Short Message Service (SMS), a prevalent and cost-effective mode of telecommunication, is among the most widely used means of communication. This ease of use has also led to the growth of SMS spam, which seriously jeopardizes the dependability and integrity of mobile communication. To address this issue, we propose a machine learning-based solution for distinguishing genuine "ham" messages from malicious "spam" messages. The approach uses the SMS Spam Collection dataset and the machine learning classifiers Multinomial Naive Bayes (M-NB), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Random Forest (RF), and AdaBoost (AB) to categorize short messages as ham or spam. Careful data preprocessing and feature engineering were instrumental in building a robust and accurate spam detection model: cleaning and transforming the SMS data by removing stop words and punctuation, normalizing text, and selecting features prepared the dataset for the classifiers and helped overcome the particular challenges of SMS data. In our evaluation, the approach detected unsolicited SMS messages effectively, and the SVM model emerged as the top performer, with 98.3% accuracy, 100% precision, 96% recall, and 91% F1-score.
URI:	http://hdl.handle.net/123456789/8202
Appears in Collections:	Master of computer science
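
The abstract describes cleaning messages (stop-word and punctuation removal, text normalization) and then classifying them with, among others, Multinomial Naive Bayes. The following is a minimal sketch of that general idea using only the Python standard library. It is not the thesis's implementation: the stop-word list, the `preprocess` helper, the `MultinomialNB` class, and the toy messages are all assumptions made for illustration, and the SMS Spam Collection dataset is not used here.

```python
import math
import string
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the thesis's list).
STOP_WORDS = {"a", "an", "the", "to", "is", "you", "your",
              "for", "at", "on", "in", "and", "are", "we"}


def preprocess(message):
    """Lowercase, strip punctuation, and drop stop words."""
    text = message.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in text.split() if tok not in STOP_WORDS]


class MultinomialNB:
    """Multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, messages, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for msg, label in zip(messages, labels):
            self.word_counts[label].update(preprocess(msg))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, message):
        # Words never seen in training are ignored.
        tokens = [t for t in preprocess(message) if t in self.vocab]
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[c] / total_docs)
            for tok in tokens:
                score += math.log((self.word_counts[c][tok] + 1)
                                  / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = c, score
        return best_label


# Hypothetical toy messages, invented for illustration only.
TRAIN = [
    ("Win a free prize now, claim your cash reward", "spam"),
    ("Free entry: text WIN to claim a prize", "spam"),
    ("Urgent! Your account won cash, call now", "spam"),
    ("Are we still meeting for lunch today?", "ham"),
    ("I will call you when I get home", "ham"),
    ("Can you send me the lecture notes?", "ham"),
]

model = MultinomialNB().fit([m for m, _ in TRAIN], [y for _, y in TRAIN])
print(model.predict("Claim your free cash prize"))  # spam
print(model.predict("See you at lunch"))            # ham
```

Six hand-written messages say nothing about real-world accuracy; the sketch only shows how preprocessing feeds a word-count classifier. A realistic experiment would need a held-out test split and the accuracy, precision, recall, and F1 measures named in the abstract.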

Files in This Item:

File	Description	Size	Format
FETIYA_DINO_SHORT_MESSAGE_SERVICE_SPAM_DETECTION_USING_MACHINE_LEARNING.pdf		1.44 MB	Adobe PDF
