Evaluating Sparse and Transformer-based Representations for Chinese Weibo Sentiment Analysis Across Data Scales and Noise Conditions
- DOI
- 10.2991/978-94-6239-648-7_99
- Keywords
- Natural Language Processing; Sentiment Analysis; Chinese Weibo; TF-IDF; BERT
- Abstract
This study presents a systematic comparison between traditional sparse representations and contextualised Transformer models for Chinese Weibo sentiment classification. Using a publicly available dataset of 10,500 annotated microblog posts, the analysis examines four key dimensions: text representation, data scale, noise robustness, and fine-tuning strategy. A character-level TF-IDF + Logistic Regression baseline is evaluated alongside a pretrained BERT model under controlled experimental conditions. Results show that BERT substantially outperforms TF-IDF when trained on the full dataset, achieving higher accuracy and macro-F1 through its ability to capture contextual and semantic nuances in noisy social-media text. However, in low-resource settings with only 1,000 training samples, TF-IDF remains competitive, narrowing the performance gap and demonstrating strong efficiency under data scarcity. Noise-robustness experiments further reveal that BERT maintains stable or improved performance under mild perturbations, while TF-IDF exhibits gradual degradation. Fine-tuning analysis confirms that full parameter updates are essential for BERT’s effectiveness, as freezing the encoder leads to significant performance declines. Overall, the findings provide a reproducible benchmark and practical guidance for selecting sentiment-analysis models under varying resource constraints, highlighting trade-offs between expressive power, robustness, and computational cost.
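To make the sparse baseline concrete, the following is a minimal sketch of character-level TF-IDF features in plain Python. It is not the paper's implementation: the abstract does not state the n-gram range, IDF smoothing, or normalisation, so the unigram-plus-bigram range, the smoothed IDF formula ln((1 + N) / (1 + df)) + 1, and the L2 normalisation below are all assumptions, as are the function names `char_ngrams`, `fit_idf`, and `tfidf_vector`. Working at the character level avoids the need for a Chinese word segmenter, which is one plausible reason such a baseline suits short, noisy microblog text.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=1, n_max=2):
    """Return character n-grams of a post, ignoring whitespace.

    The 1-2 range is an assumed setting, not one taken from the paper.
    """
    chars = [c for c in text if not c.isspace()]
    return [
        "".join(chars[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(chars) - n + 1)
    ]


def fit_idf(docs):
    """Compute a smoothed IDF weight for every n-gram seen in the training posts."""
    doc_freq = Counter()
    for doc in docs:
        # Count each n-gram at most once per document.
        doc_freq.update(set(char_ngrams(doc)))
    n_docs = len(docs)
    # Assumed smoothing: ln((1 + N) / (1 + df)) + 1.
    return {
        term: math.log((1 + n_docs) / (1 + df)) + 1.0
        for term, df in doc_freq.items()
    }


def tfidf_vector(doc, idf):
    """Map a post to a sparse, L2-normalised TF-IDF vector (dict of term -> weight)."""
    # N-grams unseen during fitting are dropped.
    term_freq = Counter(t for t in char_ngrams(doc) if t in idf)
    weights = {t: count * idf[t] for t, count in term_freq.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}


# Two illustrative posts (made up for this sketch, not from the dataset).
posts = ["今天很开心", "今天很难过"]
idf = fit_idf(posts)
vec = tfidf_vector(posts[0], idf)
```

In this sketch, n-grams shared by both posts (such as "今天") receive a lower IDF weight than n-grams unique to one post (such as "开心"), so the distinguishing characters dominate the vector. The resulting sparse vectors would then be passed to a linear classifier such as logistic regression.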
- Copyright
- © 2026 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - Yuying Zhao
PY - 2026
DA - 2026/04/24
TI - Evaluating Sparse and Transformer-based Representations for Chinese Weibo Sentiment Analysis Across Data Scales and Noise Conditions
BT - Proceedings of the International Workshop on Advances in Deep Learning for Image Analysis and Computer Vision (IWADIC 2025)
PB - Atlantis Press
SP - 923
EP - 933
SN - 2352-538X
UR - https://doi.org/10.2991/978-94-6239-648-7_99
DO - 10.2991/978-94-6239-648-7_99
ID - Zhao2026
ER -