Exploring Emoji-based Synthetic Annotations for Filipino-English Sentiment Analysis
- DOI
- 10.2991/978-94-6239-638-8_26
- Keywords
- Emoji-based Sentiment Analysis; Synthetic Sentiment Labels; Philippine Natural Language Processing
- Abstract
Sentiment analysis studies on Filipino or mixed Filipino-English text suffer from a lack of readily available language resources. Many studies either lose linguistic information by translating to a high-resource language or undergo the painstaking and expensive process of manual annotation. As an alternative, we propose an emoji-based sentiment annotation scheme as a language-independent means of automatically assigning sentiment labels. The scheme uses an existing emoji sentiment lexicon and labels a text document as either positive or negative based on the mean sentiment score of the emojis present. It can then be applied to documents containing emojis to produce an initial set of labeled data points from which to learn sentiment. To evaluate the scheme’s effectiveness, we applied it to tweets collected from the Philippines that contain emojis. We then conducted experiments in building a sentiment classification model centered on a convolutional neural network. We also trained our own Word2Vec model on tweets from the same domain. Our experiments showed that while a model trained with emojis achieved a higher kappa score, it suffered from overfitting. Hence, our best-performing model for generalizing to text data, which was trained with emojis removed, achieved a kappa score of 0.5331 and an F1 score of 0.7665. While there is still much room for improvement, our initial findings suggest that the emoji-based sentiment annotation scheme is a potential option for addressing the limited resources available for modeling sentiment in Filipino and Filipino-English text data.
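The labeling rule described in the abstract (the mean sentiment score of the emojis present decides the label) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the lexicon `TOY_LEXICON` and its scores are hypothetical stand-ins for an existing emoji sentiment lexicon, the decision threshold of 0 is an assumption, and returning no label for texts without scored emojis or with a mean of exactly 0 is also an assumption. The helper `strip_emojis` only illustrates the "emojis removed" training condition.

```python
from statistics import mean
from typing import Dict, Optional

# Hypothetical emoji scores in [-1, 1]. These values are illustrative only
# and are NOT taken from any published emoji sentiment lexicon.
TOY_LEXICON: Dict[str, float] = {
    "😍": 0.7,
    "😂": 0.2,
    "😭": -0.1,
    "😡": -0.6,
}


def label_by_emoji(text: str, lexicon: Dict[str, float] = TOY_LEXICON) -> Optional[str]:
    """Label a text from the mean score of the lexicon emojis it contains.

    Assumptions (not stated in the source): every occurrence of an emoji
    counts once, the threshold is 0, and a text with no scored emojis or a
    mean of exactly 0 receives no label.
    """
    scores = [lexicon[ch] for ch in text if ch in lexicon]
    if not scores:
        return None  # no emoji evidence, so the scheme cannot label this text
    avg = mean(scores)
    if avg > 0:
        return "positive"
    if avg < 0:
        return "negative"
    return None


def strip_emojis(text: str, lexicon: Dict[str, float] = TOY_LEXICON) -> str:
    """Remove lexicon emojis so a model can be trained on the text alone."""
    return "".join(ch for ch in text if ch not in lexicon).strip()


if __name__ == "__main__":
    for tweet in ["Ang saya ko today 😍😂", "Grabe ang traffic 😡", "Walang emoji dito"]:
        print(repr(strip_emojis(tweet)), "->", label_by_emoji(tweet))
```

Looking up one character at a time keeps the sketch short, but it ignores multi-code-point emoji sequences; a real pipeline would likely need a proper emoji tokenizer.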
- Copyright
- © 2026 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - Sean Timothy S. Co
AU - Nicholas Rupert E. Custodio
AU - Alexis Louis L. Dela Cruz
AU - Martin Christopher B. Sanchez
AU - Jason Jan C. Jabanes
AU - Edward P. Tighe
PY - 2026
DA - 2026/04/30
TI - Exploring Emoji-based Synthetic Annotations for Filipino-English Sentiment Analysis
BT - Proceedings of the Workshop on Computation: Theory and Practice (WCTP 2025)
PB - Atlantis Press
SP - 520
EP - 534
SN - 2589-4900
UR - https://doi.org/10.2991/978-94-6239-638-8_26
DO - 10.2991/978-94-6239-638-8_26
ID - Co2026
ER -