Proceedings of the International Conference on Artificial Intelligence and Secure Data Analytics (ICAISDA 2025)

A Novel Approach for Enhancing Child Speech Synthesis Using LIESS Algorithm

Authors
N. Danapaquiame1, M. Shanmugam2, *, Arokiaraj Christian St. Hubert3, M. Aishwariya Lakshmi4
1Head of Department, Department of Computer Science and Engineering, Sri Manakula Vinayagar Engineering College, Puducherry, India
2Associate Professor, Department of Computer Science and Engineering, Sri Manakula Vinayagar Engineering College, Puducherry, India
3Assistant Professor, Department of Computer Science and Engineering, Sri Manakula Vinayagar Engineering College, Puducherry, India
4UG Student, Department of Computer Science and Engineering, Sri Manakula Vinayagar Engineering College, Puducherry, India
*Corresponding author. Email: shanmugam.muthalu@gmail.com
Corresponding Author
M. Shanmugam
Available Online 31 March 2026.
DOI
10.2991/978-94-6239-616-6_76
Keywords
LIESS; ASR; HiFi-GAN; WER; LLM
Abstract

Voice assistants have become an integral part of everyday life, yet child-specific voice assistants with natural and expressive child-like speech remain limited. Existing solutions often fall short of the expected naturalness, clarity, and efficiency, making interactions less engaging and less familiar for young users. This project introduces a novel LLM-Infused Expressive Speech Synthesis (LIESS) Algorithm, which integrates Large Language Models (LLMs) and diffusion-based Text-to-Speech (TTS) techniques to enhance child speech synthesis. By leveraging diffusion models for parallel spectrogram generation and HiFi-GAN and FastPitch for refined waveform synthesis, the proposed system generates highly natural and expressive child-like voices. The workflow includes real-time speech capture via a Gradio UI, speech-to-text conversion using Deepgram ASR, and emotion-aware response generation through the Gemini API. Collectively, these three stages are called the Voice-to-Language Engine (VTLE). To ensure high-quality output in terms of clarity and intelligibility, the synthesized speech is evaluated using Word Error Rate (WER), which assesses accuracy in speech recognition and linguistic precision. This project aims to make child-specific AI voice assistants more engaging, interactive, and accessible. The innovation enhances speech applications in education, entertainment, and assistive technologies, bridging the gap between young users and AI-driven voice interactions.
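
The abstract names Word Error Rate (WER) as the intelligibility metric but does not spell out how it is computed. As a minimal sketch, assuming the standard definition (word-level edit distance between a reference text and an ASR transcript of the synthesized speech, divided by the number of reference words), WER could be calculated as follows. The function name `word_error_rate` and the lowercasing and whitespace tokenization are illustrative assumptions, not details taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Hypothetical helper for illustration; the normalization used here
    (lowercasing, whitespace split) is an assumption.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far (minus the current one) and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of three reference words.
    print(word_error_rate("the cat sat", "the cat sit"))
```

A lower value indicates that the transcript is closer to the reference; because insertions are counted, WER can exceed 1.0 when the transcript is much longer than the reference.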

Copyright
© 2026 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the International Conference on Artificial Intelligence and Secure Data Analytics (ICAISDA 2025)
Series
Advances in Intelligent Systems Research
Publication Date
31 March 2026
ISBN
978-94-6239-616-6
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - N. Danapaquiame
AU  - M. Shanmugam
AU  - Arokiaraj Christian St. Hubert
AU  - M. Aishwariya Lakshmi
PY  - 2026
DA  - 2026/03/31
TI  - A Novel Approach for Enhancing Child Speech Synthesis Using LIESS Algorithm
BT  - Proceedings of the International Conference on Artificial Intelligence and Secure Data Analytics (ICAISDA 2025)
PB  - Atlantis Press
SP  - 1037
EP  - 1056
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6239-616-6_76
DO  - 10.2991/978-94-6239-616-6_76
ID  - Danapaquiame2026
ER  -