A Novel Approach for Enhancing Child Speech Synthesis Using LIESS Algorithm
- DOI
- 10.2991/978-94-6239-616-6_76
- Keywords
- LIESS; ASR; HiFi-GAN; WER; LLM
- Abstract
Voice assistants have become an integral part of everyday life, yet child-specific voice assistants with natural and expressive child-like speech remain limited. Existing solutions often fall short of the expected naturalness, clarity, and efficiency, making interactions less engaging and familiar for young users. This project introduces a novel LLM-Infused Expressive Speech Synthesis (LIESS) Algorithm, which integrates Large Language Models (LLMs) and diffusion-based Text-to-Speech (TTS) techniques to enhance child speech synthesis. By leveraging diffusion models for parallel spectrogram generation and HiFi-GAN + FastPitch for refined waveform synthesis, the proposed system generates highly natural and expressive child-like voices. The workflow includes real-time speech capture via a Gradio UI, speech-to-text conversion using Deepgram ASR, and emotion-aware response generation through the Gemini API. Collectively, these three processes are called the Voice-to-Language Engine (VTLE). To ensure high-quality output in terms of clarity and intelligibility, the synthesized speech undergoes Word Error Rate (WER) evaluation, which assesses speech recognition accuracy and linguistic precision. This project aims to advance child-specific AI voice assistants by creating a more engaging, interactive, and accessible system. The approach supports speech applications in education, entertainment, and assistive technologies, bridging the gap between young users and AI-driven voice interactions.
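The abstract does not say how WER is computed, so the following is only a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, and insertions) between a reference text and a transcript, divided by the number of reference words. The function name `word_error_rate`, the lowercase and whitespace tokenization, and the idea that the transcript comes from running ASR on the synthesized audio are assumptions made for illustration, not details taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: simple lowercase + whitespace tokenization; the paper
    may normalize text differently.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical example pair: one reference word is dropped.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

In this example one of six reference words is missing from the transcript, so the score is 1/6. Lower values indicate that the synthesized speech was transcribed closer to the intended text; because insertions are counted, WER can exceed 1.0.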
- Copyright
- © 2026 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY  - CONF
AU  - N. Danapaquiame
AU  - M. Shanmugam
AU  - Arokiaraj Christian St. Hubert
AU  - M. Aishwariya Lakshmi
PY  - 2026
DA  - 2026/03/31
TI  - A Novel Approach for Enhancing Child Speech Synthesis Using LIESS Algorithm
BT  - Proceedings of the International Conference on Artificial Intelligence and Secure Data Analytics (ICAISDA 2025)
PB  - Atlantis Press
SP  - 1037
EP  - 1056
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6239-616-6_76
DO  - 10.2991/978-94-6239-616-6_76
ID  - Danapaquiame2026
ER  -