Proceedings of the International Conference on Advances and Applications in Artificial Intelligence (ICAAAI 2025)

Hybrid conditional VAE-CLSTM model for generating synchronized music from humming

Authors
N. Sandeep Chaitanya1, *, Chinthoju Rohith1, Dunna Shiva Prasad1, Golla Durgesh Yadav1, Natta Rishitha1
1Vallurupalli Nageswara Rao Vignana Jyothi Institute of Engineering & Technology, Hyderabad, Telangana, India
*Corresponding author. Email: sandeepchaitanya_n@vnrvjiet.in
Corresponding Author
N. Sandeep Chaitanya
Available Online 22 June 2025.
DOI
10.2991/978-94-6463-738-0_3
Abstract

Many aspiring musicians face challenges in creating their own music due to the high cost of equipment and the complexity of learning music theory. These barriers make it difficult for beginners to express their creative ideas, leading to frustration and disappointment. To address this issue, we propose a system that transforms simple humming into complete musical compositions, bridging the gap between artistic intention and the means to create music.

The system uses the CREPE model to extract key features (frequencies, timestamps, and confidence levels) from a user’s hum recorded in .wav format. This allows the system to capture the essence of the user’s musical ideas. To further enhance the composition, two deep learning models are applied: a CVAE (Conditional Variational Autoencoder) and a CLSTM (Convolutional Long Short-Term Memory). The CVAE adds style to the music by inferring musical characteristics from the user’s input, while the CLSTM extends the music’s duration and improves its flow, producing a full musical sequence.
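To make the feature-extraction step concrete, the following is a minimal sketch of how per-frame pitch estimates might be turned into discrete notes. It is not the authors' implementation. It assumes the three arrays (timestamps, frequencies in Hz, confidences) have already been obtained, for example from the open-source `crepe` package, whose `crepe.predict(audio, sr)` returns time, frequency, confidence, and activation. The helper names `hz_to_midi` and `frames_to_notes` and the `min_conf` threshold of 0.5 are hypothetical choices made for illustration.

```python
import math

def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to the nearest MIDI note number (A4 = 440 Hz = 69)."""
    return int(round(69 + 12 * math.log2(freq_hz / 440.0)))

def frames_to_notes(times, freqs, confs, min_conf=0.5):
    """Group consecutive confident frames sharing a MIDI pitch into notes.

    Returns a list of (start_time, end_time, midi_pitch) tuples, where
    end_time is the timestamp of the last frame in the group.
    Frames below min_conf (an assumed threshold) are treated as silence.
    """
    notes = []
    current = None  # [start_time, end_time, midi_pitch] of the note being built
    for t, f, c in zip(times, freqs, confs):
        pitch = hz_to_midi(f) if (c >= min_conf and f > 0) else None
        if current is not None and pitch == current[2]:
            current[1] = t  # same pitch: extend the current note
            continue
        if current is not None:
            notes.append(tuple(current))  # pitch changed or silence: close the note
        current = [t, t, pitch] if pitch is not None else None
    if current is not None:
        notes.append(tuple(current))
    return notes

# Synthetic frames standing in for pitch-tracker output (not real data).
times = [0.00, 0.01, 0.02, 0.03, 0.04]
freqs = [440.0, 441.0, 100.0, 523.25, 523.25]
confs = [0.9, 0.8, 0.1, 0.9, 0.95]
notes = frames_to_notes(times, freqs, confs)
```

A note list of this kind is one plausible intermediate representation to pass on to generative models such as the CVAE and CLSTM; the abstract does not specify the representation actually used.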

This system enables users, regardless of their musical background or access to expensive equipment, to turn their hums into polished musical pieces. It democratizes the music creation process, making it easier, more affordable, and accessible to anyone with a passion for music. By simplifying the process of composition, this system opens up creative possibilities for people who otherwise might not have had the tools or knowledge to make music.

Copyright
© 2025 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.


Volume Title
Proceedings of the International Conference on Advances and Applications in Artificial Intelligence (ICAAAI 2025)
Series
Advances in Intelligent Systems Research
Publication Date
22 June 2025
ISBN
978-94-6463-738-0
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - N. Sandeep Chaitanya
AU  - Chinthoju Rohith
AU  - Dunna Shiva Prasad
AU  - Golla Durgesh Yadav
AU  - Natta Rishitha
PY  - 2025
DA  - 2025/06/22
TI  - Hybrid conditional VAE-CLSTM model for generating synchronized music from humming
BT  - Proceedings of the International Conference on Advances and Applications in Artificial Intelligence (ICAAAI 2025)
PB  - Atlantis Press
SP  - 17
EP  - 31
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6463-738-0_3
DO  - 10.2991/978-94-6463-738-0_3
ID  - Chaitanya2025
ER  -