Proceeding of the 1st International Conference on Lifespan Innovation (ICLI 2025)

A Multilingual Web-Based Approach to Speech Emotion Recognition: Challenges and Solutions

Authors
Arundhati Wani [1, *], Rujuta Kulkarni [1], Ananthan Nair [1], Vishvjita Savkare [1], Punam Chavan [1]
[1] Marathwada Mitra Mandal’s College of Engineering, Pune, India
*Corresponding author. Email: arundhatim.wani@gmail.com
Available Online 31 August 2025.
DOI
10.2991/978-94-6463-831-8_40
Keywords
speech emotion recognition; HuBERT; RAVDESS; EMO-DB
Abstract

Speech Emotion Recognition (SER) has attracted significant attention in recent years because of its applications in human-computer interaction, mental health monitoring, virtual assistants, and customer service. Accurately recognizing emotions from speech signals can improve user experience and system responsiveness. However, several challenges continue to hinder the development of robust and generalizable SER models: the lack of multilingual integration, limited support for real-time processing, cultural and dialectal variation in emotional expression, the absence of standardized evaluation metrics, the scarcity and imbalance of publicly available datasets, and the difficulty of recognizing emotions from multiple users simultaneously. To address some of these issues, this paper presents a web-based SER application that uses HuBERT, a self-supervised speech representation model, for multilingual emotion classification. The system identifies a range of emotional states, including happiness, sadness, anger, neutrality, fear, disgust, boredom, anxiety, surprise, and calmness. The frontend is built with Angular to provide a responsive, user-friendly interface, while the backend runs on FastAPI, which handles API communication and the integration of user feedback. The system is trained and evaluated on two well-established emotional speech datasets, RAVDESS and the German EMO-DB, which together enhance its generalizability across languages and cultural contexts. By combining modern deep learning techniques with a practical web-based deployment, this work aims to bridge the gap between SER research and real-world applications, offering a scalable and interactive solution for multilingual emotion recognition.
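
The abstract does not say how the two corpora are merged into one label set. As a rough illustration only, the sketch below reads the emotion code from each corpus's file-naming convention (the third hyphen-separated field in RAVDESS, the sixth character in EMO-DB) and maps it to the emotion names listed in the abstract. The function name `unified_label`, the constant names, and the choice to map EMO-DB's "Angst" to "anxiety" rather than "fear" are assumptions made for this example, not the authors' method.

```python
from pathlib import PurePath

# RAVDESS: the third hyphen-separated field of the file name is the emotion code.
RAVDESS_CODES = {
    "01": "neutrality", "02": "calmness", "03": "happiness", "04": "sadness",
    "05": "anger", "06": "fear", "07": "disgust", "08": "surprise",
}

# EMO-DB: the sixth character of the file name is a German emotion initial.
EMODB_CODES = {
    "W": "anger",       # Wut
    "L": "boredom",     # Langeweile
    "E": "disgust",     # Ekel
    "A": "anxiety",     # Angst (assumption: kept separate from RAVDESS "fear")
    "F": "happiness",   # Freude
    "T": "sadness",     # Trauer
    "N": "neutrality",  # neutral
}

# Union of both label sets, sorted for a stable class index.
ALL_LABELS = sorted(set(RAVDESS_CODES.values()) | set(EMODB_CODES.values()))


def unified_label(filename: str) -> str:
    """Return the shared emotion label for a RAVDESS or EMO-DB file name."""
    stem = PurePath(filename).stem
    parts = stem.split("-")
    if len(parts) == 7:  # RAVDESS, e.g. 03-01-06-01-02-01-12.wav
        return RAVDESS_CODES[parts[2]]
    if len(stem) == 7 and stem[5] in EMODB_CODES:  # EMO-DB, e.g. 03a01Fa.wav
        return EMODB_CODES[stem[5]]
    raise ValueError(f"Unrecognized file name: {filename}")


if __name__ == "__main__":
    for name in ("03-01-06-01-02-01-12.wav", "03a01Fa.wav"):
        print(name, "->", unified_label(name))
    print(len(ALL_LABELS), "labels:", ALL_LABELS)
```

Under this mapping the union of the two corpora yields ten labels, matching the emotions named in the abstract; a different merge policy (for example, folding "anxiety" into "fear") would yield fewer classes.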

Copyright
© 2025 The Author(s)
Open Access
This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceeding of the 1st International Conference on Lifespan Innovation (ICLI 2025)
Series
Advances in Health Sciences Research
Publication Date
31 August 2025
ISBN
978-94-6463-831-8
ISSN
2468-5739

Cite this article

TY  - CONF
AU  - Arundhati Wani
AU  - Rujuta Kulkarni
AU  - Ananthan Nair
AU  - Vishvjita Savkare
AU  - Punam Chavan
PY  - 2025
DA  - 2025/08/31
TI  - A Multilingual Web-Based Approach to Speech Emotion Recognition: Challenges and Solutions
BT  - Proceeding of the 1st International Conference on Lifespan Innovation (ICLI 2025)
PB  - Atlantis Press
SP  - 331
EP  - 338
SN  - 2468-5739
UR  - https://doi.org/10.2991/978-94-6463-831-8_40
DO  - 10.2991/978-94-6463-831-8_40
ID  - Wani2025
ER  -