Machine Learning-Based Detection of SMS Phishing in the Philippine Context

John Exequiel A. Corpuz; Sebastian Q. Diaz; Luis Benedict M. Palafox; Mark Christian T. Tan; Katrina Ysabel C. Solomon

doi:10.2991/978-94-6239-638-8_28

Authors

John Exequiel A. Corpuz¹, Sebastian Q. Diaz¹, Luis Benedict M. Palafox¹, Mark Christian T. Tan¹, Katrina Ysabel C. Solomon¹,*

¹Advanced Research Institute for Informatics, Computing, and Networking, De La Salle University, 2401 Taft Avenue, Manila, 1004, Philippines

*Corresponding author. Email: katrina.solomon@dlsu.edu.ph

Available Online 30 April 2026.

DOI: 10.2991/978-94-6239-638-8_28
Keywords: SMS; Smishing; Machine Learning; Information Security; Phishing
Abstract: Short Message Service (SMS) phishing, or smishing, is an escalating cybersecurity threat in the Philippines, where widespread mobile usage intersects with informal language and frequent code-switching between Filipino and English. While prior studies have primarily focused on English-language datasets, limited research exists that directly addresses the linguistic complexities and cultural nuances unique to the Filipino context. This study aims to bridge this gap by developing a machine learning-based detection system optimized for smishing in the Philippines. A labeled dataset comprising both phishing and legitimate messages was collected from public sources and surveys, then preprocessed using natural language processing techniques such as tokenization, lemmatization, and TF-IDF vectorization. The study implemented and evaluated five classical machine learning classifiers: Support Vector Machines (SVM), Logistic Regression, Random Forest, K-Nearest Neighbors (KNN), and Multinomial Naive Bayes (MNB). Among these, the SVM model combined with the TF-IDF features achieved the highest performance, recording an F-score of 99.20%, indicating robust precision and recall. These findings affirm the effectiveness of language-aware, content-based smishing detection tailored to the Filipino language, offering a foundational contribution to the development of inclusive, culturally adaptive cybersecurity systems and highlighting the importance of extending protection beyond Anglocentric models.
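The abstract names TF-IDF vectorization as the feature step and the F-score as the headline metric. The following is a minimal, standard-library-only sketch of both ideas, not the authors' implementation: the function names `tokenize`, `tfidf`, and `f_score`, the whitespace tokenizer, the smoothed IDF formula, and the two sample messages are all assumptions made for illustration. The abstract does not state which tooling or TF-IDF variant the study used.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for the
    # tokenization and lemmatization steps described in the abstract.
    return text.lower().split()


def tfidf(docs):
    """Return one {term: weight} dict per document (tf * smoothed idf)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            # Assumed smoothing: idf = ln((1 + N) / (1 + df)) + 1.
            term: (count / len(toks))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def f_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical code-switched messages, invented for this sketch.
    messages = [
        "congrats nanalo ka ng prize click the link now",
        "nasa bahay na ako see you later",
    ]
    for vector in tfidf(messages):
        print(sorted(vector.items(), key=lambda kv: -kv[1])[:3])
```

In this weighting, terms that appear in fewer messages receive a larger weight than terms shared across all messages, which is the property a content-based classifier such as an SVM would rely on when separating phishing from legitimate text.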
Copyright: © 2026 The Author(s)
Open Access: This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title: Proceedings of the Workshop on Computation: Theory and Practice (WCTP 2025)
Series: Atlantis Highlights in Computer Sciences
Publication Date: 30 April 2026
ISBN: 978-94-6239-638-8
ISSN: 2589-4900

Cite this article

TY  - CONF
AU  - John Exequiel A. Corpuz
AU  - Sebastian Q. Diaz
AU  - Luis Benedict M. Palafox
AU  - Mark Christian T. Tan
AU  - Katrina Ysabel C. Solomon
PY  - 2026
DA  - 2026/04/30
TI  - Machine Learning-Based Detection of SMS Phishing in the Philippine Context
BT  - Proceedings of the Workshop on Computation: Theory and Practice (WCTP 2025)
PB  - Atlantis Press
SP  - 553
EP  - 562
SN  - 2589-4900
UR  - https://doi.org/10.2991/978-94-6239-638-8_28
DO  - 10.2991/978-94-6239-638-8_28
ID  - Corpuz2026
ER  -
