Fusing Fixed and Adaptive Multi-resolution Features: A DWT-EWT Approach for Improved Speech Emotion Classification
- DOI
- 10.2991/978-94-6239-707-1_18
- Keywords
- DWT; EWT; SEC; Feature Fusion; Multi-resolution Analysis
- Abstract
Speech emotion classification (SEC) is the automatic identification of the emotional states inherent in an utterance. It has potential applications in medicine, security, surveillance, digital marketing, e-learning, internet search, personal communication, customer relationship management, and human-computer interaction. Recent improvements in emotion classification performance (ECP) have come from a variety of acoustic and non-acoustic features combined with machine learning and deep learning algorithms. This paper introduces a hybrid approach that combines the Discrete Wavelet Transform (DWT) and the Empirical Wavelet Transform (EWT), with the aim of increasing classification accuracy through hybrid signal decomposition and feature fusion. Speech signals are split into frames, and each frame is decomposed into four modes using DWT and EWT. Two feature sets are then extracted from the modes. The first consists of five entropy-based features: Approximate Entropy (ApE), Permutation Entropy (PrE), Increment Entropy (InE), Sample Entropy (SaE), and Spectral Entropy (SpE), collectively termed Hybrid-Entropy (HEn) features. The second consists of Hybrid Mel-Frequency Cepstral Coefficient (HMFCC) features. Experimental evaluation with a deep neural network (DNN) classifier shows that combining HEn with HMFCC features derived from both the DWT- and EWT-decomposed modes performs best, reaching an accuracy of 89.76% on the EMODB dataset.
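The frame, decompose, and extract-entropy steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the frame length and hop, the single-level Haar wavelet (standing in for the paper's unspecified DWT configuration), and the permutation order are all assumptions chosen for illustration. The EWT branch, the other four entropies, and the HMFCC features are omitted.

```python
import math
from collections import Counter


def frame_signal(x, frame_len, hop):
    """Split a sequence into overlapping frames (a trailing partial frame is dropped)."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]


def haar_dwt(x):
    """One level of the Haar DWT: returns (approximation, detail) coefficients.

    Assumption: Haar is used only because it is the simplest wavelet;
    the abstract does not name the wavelet family or the number of levels.
    """
    s = math.sqrt(2.0)
    pairs = range(0, len(x) - 1, 2)
    approx = [(x[i] + x[i + 1]) / s for i in pairs]
    detail = [(x[i] - x[i + 1]) / s for i in pairs]
    return approx, detail


def permutation_entropy(x, order=3):
    """Normalized permutation entropy in [0, 1], based on ordinal patterns."""
    patterns = Counter(
        tuple(sorted(range(order), key=lambda k: x[i + k]))
        for i in range(len(x) - order + 1)
    )
    total = sum(patterns.values())
    h = -sum((c / total) * math.log(c / total) for c in patterns.values())
    return h / math.log(math.factorial(order))


if __name__ == "__main__":
    # Synthetic stand-in for a speech signal (hypothetical data).
    signal = [math.sin(0.3 * n) + 0.5 * math.sin(1.7 * n) for n in range(256)]
    for frame in frame_signal(signal, frame_len=64, hop=32):
        approx, detail = haar_dwt(frame)
        # One entropy value per mode; a full system would concatenate
        # several entropies across all modes into one feature vector.
        features = [permutation_entropy(approx), permutation_entropy(detail)]
        print(features)
```

In a fuller pipeline, each frame would yield one feature vector covering every mode and every entropy measure, and those vectors would be fused with the cepstral features before classification.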
- Copyright
- © 2026 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - Devi Prasad Pattnaik
AU - Bala Sai Srilatha Indira Dutt Vemuri
PY - 2026
DA - 2026/06/18
TI - Fusing Fixed and Adaptive Multi-resolution Features: A DWT-EWT Approach for Improved Speech Emotion Classification
BT - Proceedings of the International Conference on Recent Advances in Intelligent and Sustainable Technologies (RAIST 2026)
PB - Atlantis Press
SP - 211
EP - 221
SN - 2589-4919
UR - https://doi.org/10.2991/978-94-6239-707-1_18
DO - 10.2991/978-94-6239-707-1_18
ID - Pattnaik2026
ER -