American Sign Language Text to Multi-lingual Speech Conversion Using Convolutional Neural Network

Bhramanand Sethi; Sarvednya Mhatre; Sachin Yadav; Sumedh Kudav; Dhanashri Bhosale

doi:10.2991/978-94-6463-852-3_26


Authors

Bhramanand Sethi¹, Sarvednya Mhatre¹,*, Sachin Yadav¹, Sumedh Kudav¹, Dhanashri Bhosale¹

¹Department of Computer Engineering, Ramrao Adik Institute of Technology, D. Y. Deemed University, Nerul, Navi Mumbai, India

*Corresponding author. Email: sarvednya2@gmail.com

Available Online 7 October 2025.

DOI: 10.2991/978-94-6463-852-3_26
Keywords: Sign Language; Machine Learning; American Sign Language (ASL); Convolutional Neural Network (CNN); Gesture Recognition; OpenCV; MediaPipe; Google Text-to-Speech (gTTS); Googletrans; Multilingual Support
Abstract: Sign language is a means of communication for people with hearing and speech impairments, but without suitable translation tools, smooth interaction with non-signers remains difficult. To address this, we use machine learning and natural language processing to convert American Sign Language (ASL) to text and speech in real time. The system uses a Convolutional Neural Network (CNN) trained on a custom dataset of hand-sign images covering the letters A-Z plus space and delete commands. Preprocessing comprises resizing, normalization, cropping, data augmentation, and hand landmark detection using OpenCV and MediaPipe to ensure high-quality inputs. Recognized gestures are combined into sentences and converted to speech with the Google Text-to-Speech (gTTS) library, while the Googletrans library provides multilingual support. Experimental results show that the CNN model recognizes sign language gestures with an accuracy of 97%. A user-friendly interface offers real-time output, sentence editing, and audio output, providing a structured solution for bridging the communication gap. The system shows how technology can help people with hearing impairments communicate.
Copyright: © 2025 The Author(s)
Open Access: This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
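
The abstract states that recognized gestures (A-Z, space, and delete) are combined into sentences before translation and speech synthesis. The following is a minimal sketch of how that assembly step might look, not the authors' implementation. The label strings `"space"` and `"delete"`, the function names, and the frame-level debouncing rule (`min_run`) are all assumptions introduced for illustration; the abstract does not describe how repeated per-frame predictions are handled.

```python
from typing import Iterable, List

# Hypothetical label names: the abstract only says the classes are
# A-Z plus "space" and "delete" commands.
SPACE_LABEL = "space"
DELETE_LABEL = "delete"


def debounce(frame_labels: Iterable[str], min_run: int = 5) -> List[str]:
    """Emit a label once it has been predicted for min_run consecutive frames.

    Assumption: a classifier run on a video stream yields one label per
    frame, so a held gesture must be collapsed into a single event.
    """
    events: List[str] = []
    previous = None
    run_length = 0
    for label in frame_labels:
        if label == previous:
            run_length += 1
        else:
            previous, run_length = label, 1
        if run_length == min_run:  # fires exactly once per held gesture
            events.append(label)
    return events


def assemble_sentence(labels: Iterable[str]) -> str:
    """Fold a sequence of gesture labels into an editable sentence."""
    chars: List[str] = []
    for label in labels:
        if label == SPACE_LABEL:
            chars.append(" ")
        elif label == DELETE_LABEL:
            if chars:  # deleting from an empty buffer is a no-op
                chars.pop()
        elif len(label) == 1 and "A" <= label <= "Z":
            chars.append(label)
        else:
            raise ValueError(f"unknown gesture label: {label!r}")
    return "".join(chars)
```

In a pipeline like the one the abstract describes, the string returned by `assemble_sentence` would then be passed to the translation and text-to-speech stages; those calls are omitted here because they depend on network services.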

Volume Title: Proceedings of the MULTINOVA: First International Conference on Artificial Intelligence in Engineering, Healthcare and Sciences (ICAIEHS- 2025)
Series: Advances in Intelligent Systems Research
Publication Date: 7 October 2025
ISBN: 978-94-6463-852-3
ISSN: 1951-6851

Cite this article

TY  - CONF
AU  - Bhramanand Sethi
AU  - Sarvednya Mhatre
AU  - Sachin Yadav
AU  - Sumedh Kudav
AU  - Dhanashri Bhosale
PY  - 2025
DA  - 2025/10/07
TI  - American Sign Language Text to Multi-lingual Speech Conversion Using Convolutional Neural Network
BT  - Proceedings of the MULTINOVA: First International Conference on Artificial Intelligence in Engineering, Healthcare and Sciences (ICAIEHS- 2025)
PB  - Atlantis Press
SP  - 417
EP  - 426
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6463-852-3_26
DO  - 10.2991/978-94-6463-852-3_26
ID  - Sethi2025
ER  -
