Evaluating Sparse and Transformer-based Representations for Chinese Weibo Sentiment Analysis Across Data Scales and Noise Conditions

Yuying Zhao

doi:10.2991/978-94-6239-648-7_99


Authors

Yuying Zhao¹*

¹Institute of Education, University College London, London, WC1E 6BT, UK

*Corresponding author. Email: Yuying.zhao.21@ucl.ac.uk


Available Online 24 April 2026.

DOI: 10.2991/978-94-6239-648-7_99
Keywords: Natural Language Processing; Sentiment Analysis; Chinese Weibo; TF-IDF; BERT
Abstract: This study presents a systematic comparison between traditional sparse representations and contextualised Transformer models for Chinese Weibo sentiment classification. Using a publicly available dataset of 10,500 annotated microblog posts, the analysis examines four key dimensions: text representation, data scale, noise robustness, and fine-tuning strategy. A character-level TF-IDF + Logistic Regression baseline is evaluated alongside a pretrained BERT model under controlled experimental conditions. Results show that BERT substantially outperforms TF-IDF when trained on the full dataset, achieving higher accuracy and macro-F1 through its ability to capture contextual and semantic nuances in noisy social-media text. However, in low-resource settings with only 1,000 training samples, TF-IDF remains competitive, narrowing the performance gap and demonstrating strong efficiency under data scarcity. Noise-robustness experiments further reveal that BERT maintains stable or improved performance under mild perturbations, while TF-IDF exhibits gradual degradation. Fine-tuning analysis confirms that full parameter updates are essential for BERT’s effectiveness, as freezing the encoder leads to significant performance declines. Overall, the findings provide a reproducible benchmark and practical guidance for selecting sentiment-analysis models under varying resource constraints, highlighting trade-offs between expressive power, robustness, and computational cost.
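To illustrate the character-level TF-IDF representation mentioned in the abstract, the following is a minimal Python sketch using only the standard library. The n-gram range (1 to 2 characters), the smoothed IDF formula, the L2 normalisation, all function names, and the toy posts are assumptions made for illustration; the abstract does not specify these settings, and this is not the author's implementation.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=1, n_max=2):
    """Return character n-grams of length n_min..n_max, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    return [
        "".join(chars[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(chars) - n + 1)
    ]


def fit_idf(docs):
    """Smoothed IDF, ln((1 + N) / (1 + df)) + 1 (an assumed weighting)."""
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        # Count each n-gram at most once per document.
        df.update(set(char_ngrams(doc)))
    return {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in df.items()}


def tfidf_vector(doc, idf):
    """Sparse, L2-normalised TF-IDF vector as a dict; unseen n-grams are dropped."""
    tf = Counter(t for t in char_ngrams(doc) if t in idf)
    weights = {t: count * idf[t] for t, count in tf.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}


# Toy posts invented for illustration; not drawn from the paper's dataset.
posts = ["今天很开心", "今天很难过", "开心的一天"]
idf = fit_idf(posts)
vec = tfidf_vector("今天开心", idf)
```

Because the features are character n-grams, no Chinese word segmentation is needed, which is one reason such a baseline is cheap to build. Vectors of this kind would then be passed to a linear classifier such as logistic regression.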
Copyright: © 2026 The Author(s)
Open Access: This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.


Volume Title: Proceedings of the International Workshop on Advances in Deep Learning for Image Analysis and Computer Vision (IWADIC 2025)
Series: Advances in Computer Science Research
Publication Date: 24 April 2026
ISBN: 978-94-6239-648-7
ISSN: 2352-538X

Cite this article

TY  - CONF
AU  - Yuying Zhao
PY  - 2026
DA  - 2026/04/24
TI  - Evaluating Sparse and Transformer-based Representations for Chinese Weibo Sentiment Analysis Across Data Scales and Noise Conditions
BT  - Proceedings of the International Workshop on Advances in Deep Learning for Image Analysis and Computer Vision (IWADIC 2025)
PB  - Atlantis Press
SP  - 923
EP  - 933
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6239-648-7_99
DO  - 10.2991/978-94-6239-648-7_99
ID  - Zhao2026
ER  -
