Machine Learning-Based Detection of SMS Phishing in the Philippine Context
- DOI
- 10.2991/978-94-6239-638-8_28
- Keywords
- SMS; Smishing; Machine Learning; Information Security; Phishing
- Abstract
Short Message Service (SMS) phishing, or smishing, is an escalating cybersecurity threat in the Philippines, where widespread mobile usage intersects with informal language and frequent code-switching between Filipino and English. Prior studies have focused primarily on English-language datasets, and little research directly addresses the linguistic complexities and cultural nuances unique to the Filipino context. This study aims to bridge this gap by developing a machine learning-based detection system optimized for smishing in the Philippines. A labeled dataset of phishing and legitimate messages was collected from public sources and surveys, then preprocessed using natural language processing techniques such as tokenization, lemmatization, and TF-IDF vectorization. The study implemented and evaluated five classical machine learning classifiers: Support Vector Machine (SVM), Logistic Regression, Random Forest, K-Nearest Neighbors (KNN), and Multinomial Naive Bayes (MNB). Among these, the SVM model with TF-IDF features achieved the highest performance, with an F-score of 99.20%, indicating high precision and recall. These findings affirm the effectiveness of language-aware, content-based smishing detection tailored to the Filipino language. They offer a foundational contribution to the development of inclusive, culturally adaptive cybersecurity systems and highlight the importance of extending protection beyond Anglocentric models.
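To make two terms from the abstract concrete, the following is a minimal sketch of TF-IDF weighting and of the F-score as the harmonic mean of precision and recall. It is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the function names `tokenize`, `tfidf`, and `f_score`, and the two toy messages are all assumptions made for illustration, and no lemmatization is performed.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a simplified stand-in for real NLP preprocessing)."""
    return text.lower().split()


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: relative term frequency times a smoothed IDF,
    idf(t) = ln((1 + n) / (1 + df(t))) + 1, where n is the number of documents
    and df(t) is the number of documents containing term t.
    """
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: count each term once per document.
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (count / len(toks)) * idf[t] for t, count in tf.items()})
    return vectors


def f_score(tp, fp, fn):
    """F1 score from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical code-switched messages, invented for this example only.
    messages = [
        "congrats nanalo ka ng premyo click the link now",
        "hi ma pauwi na ako see you later",
    ]
    for vec in tfidf(messages):
        top = sorted(vec.items(), key=lambda kv: kv[1], reverse=True)[:3]
        print(top)
    print(f_score(tp=8, fp=2, fn=2))
```

In a full pipeline, vectors like these would be passed to a classifier such as an SVM, and the F-score would be computed on held-out messages. Because the F-score is a harmonic mean, it is high only when precision and recall are both high.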
- Copyright
- © 2026 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - John Exequiel A. Corpuz
AU - Sebastian Q. Diaz
AU - Luis Benedict M. Palafox
AU - Mark Christian T. Tan
AU - Katrina Ysabel C. Solomon
PY - 2026
DA - 2026/04/30
TI - Machine Learning-Based Detection of SMS Phishing in the Philippine Context
BT - Proceedings of the Workshop on Computation: Theory and Practice (WCTP 2025)
PB - Atlantis Press
SP - 553
EP - 562
SN - 2589-4900
UR - https://doi.org/10.2991/978-94-6239-638-8_28
DO - 10.2991/978-94-6239-638-8_28
ID - Corpuz2026
ER -