Proceedings of the International Conference on Intelligent Information Systems Design and Indian Knowledge System Applications (ICISDIKSA 2026)

Speech Emotion Recognition using Constant-Q Transform and Deep Learning

Authors
S. Revathi1,*, V. MuthuPriya2, R. Akila1, Salman Faris Syed2, K. Mohammed Zahaan Dhande2
1Professor, Department of CSE, B.S.A. Crescent Institute of Science and Technology, Chennai, India
2UG, Department of CSE, B.S.A. Crescent Institute of Science and Technology, Chennai, India
*Corresponding author. Email: revathi@crescent.education
Available Online 29 December 2025.
DOI
10.2991/978-94-6463-976-6_12
Keywords
Speech Emotion Recognition; Constant-Q Transform; Recurrent Neural Networks; Deep Learning; Emotion Classification
Abstract

In today’s digital world, machines are evolving beyond basic speech-to-text systems. This project develops an emotionally aware solution that recognizes both the spoken words and the underlying emotions. The system combines the Constant-Q Transform (CQT), which captures time-frequency audio features, with Recurrent Neural Networks (RNNs) that analyze and classify emotional tone in real time. By integrating emotional audio analysis into traditional speech recognition, it detects emotions including happiness, sadness, anger, fear, and neutrality directly from raw voice signals.

The model is trained on diverse emotional datasets and fine-tuned to recognize subtle variations in pitch, tone, and energy. Audio signals are transformed into spectrogram-like representations using the CQT and then processed through RNN layers that capture the temporal dynamics of emotional speech. A Django-based user interface provides an interactive platform where users can upload audio files or record their voice directly in the browser. The system outputs the transcribed text together with the identified emotion, visual cues, and probability scores.

This integration of audio signal processing, deep learning, and web technology creates a powerful tool with applications in mental health monitoring, therapy support, virtual assistants, call centers, and education.

Copyright
© 2025 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the International Conference on Intelligent Information Systems Design and Indian Knowledge System Applications (ICISDIKSA 2026)
Series
Advances in Intelligent Systems Research
Publication Date
29 December 2025
ISBN
978-94-6463-976-6
ISSN
1951-6851
DOI
10.2991/978-94-6463-976-6_12

Cite this article

TY  - CONF
AU  - S. Revathi 
AU  - V. MuthuPriya
AU  - R. Akila 
AU  - Salman Faris Syed
AU  - K. Mohammed Zahaan Dhande
PY  - 2025
DA  - 2025/12/29
TI  - Speech Emotion Recognition using Constant-Q Transform and Deep Learning
BT  - Proceedings of the International Conference on Intelligent Information Systems Design and Indian Knowledge System Applications (ICISDIKSA 2026)
PB  - Atlantis Press
SP  - 173
EP  - 183
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6463-976-6_12
DO  - 10.2991/978-94-6463-976-6_12
ID  - Revathi2025
ER  -