Hybrid conditional VAE-CLSTM model for generating synchronized music from humming

N. Sandeep Chaitanya; Chinthoju Rohith; Dunna Shiva Prasad; Golla Durgesh Yadav; Natta Rishitha

doi:10.2991/978-94-6463-738-0_3

Authors

N. Sandeep Chaitanya¹,*, Chinthoju Rohith¹, Dunna Shiva Prasad¹, Golla Durgesh Yadav¹, Natta Rishitha¹

¹Vallurupalli Nageswara Rao Vignana Jyothi Institute of Engineering & Technology, Hyderabad, Telangana, India

*Corresponding author. Email: sandeepchaitanya_n@vnrvjiet.in

Available Online 22 June 2025.

DOI: 10.2991/978-94-6463-738-0_3
Abstract: Many aspiring musicians face challenges in creating their own music due to the high cost of equipment and the complexity of learning music theory. These barriers make it difficult for beginners to express their creative ideas, leading to frustration and disappointment. To address this issue, we propose a system that transforms simple humming into complete musical compositions, bridging the gap between artistic intention and the means to create music.

The system uses the CREPE model to extract key features, including frequencies, timestamps, and confidence levels, from a user’s hum recorded in .wav format. This allows the system to capture the essence of the user’s musical ideas. To further enhance the composition, deep learning algorithms like CVAE (Conditional Variational Autoencoder) and CLSTM (Convolutional Long Short-Term Memory) are applied. The CVAE adds style to the music by inferring musical characteristics based on the user’s input, while the CLSTM enhances the music’s duration and flow, ensuring a full musical sequence.
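
As an illustration of how the per-frame features named above (timestamps, frequencies, confidence levels) might be turned into discrete notes, here is a minimal sketch. It is not the authors' pipeline: the function names `hz_to_midi` and `frames_to_notes`, the 0.5 confidence threshold, and the grouping rule are all assumptions made for this example, and the input arrays are hand-written stand-ins for a pitch tracker's output.

```python
import math


def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to the nearest MIDI note number (A4 = 440 Hz = 69)."""
    return int(round(69 + 12 * math.log2(freq_hz / 440.0)))


def frames_to_notes(times, freqs, confs, min_conf=0.5):
    """Group consecutive confident frames of equal pitch into (start, end, midi) notes.

    min_conf is a hypothetical threshold; frames below it are treated as silence.
    """
    notes = []
    current = None  # [start_time, end_time, midi_pitch] of the note being built
    for t, f, c in zip(times, freqs, confs):
        pitch = hz_to_midi(f) if (c >= min_conf and f > 0) else None
        if current is not None and pitch == current[2]:
            current[1] = t  # same pitch continues: extend the note
            continue
        if current is not None:
            notes.append(tuple(current))  # pitch changed or silence: close the note
        current = [t, t, pitch] if pitch is not None else None
    if current is not None:
        notes.append(tuple(current))
    return notes


# Hand-written frames at a 10 ms step (illustrative values, not real tracker output).
times = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05]
freqs = [440.0, 441.0, 300.0, 523.25, 523.0, 524.0]
confs = [0.9, 0.8, 0.1, 0.9, 0.95, 0.9]

print(frames_to_notes(times, freqs, confs))
# [(0.0, 0.01, 69), (0.03, 0.05, 72)]
```

The low-confidence frame at 0.02 s is dropped, so the hum is read as an A4 followed by a C5. A note sequence of this kind is one plausible form for the input that downstream generative models consume, though the abstract does not specify the representation actually used.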

This system enables users, regardless of their musical background or access to expensive equipment, to turn their hums into polished musical pieces. It democratizes the music creation process, making it easier, more affordable, and accessible to anyone with a passion for music. By simplifying the process of composition, this system opens up creative possibilities for people who otherwise might not have had the tools or knowledge to make music.
Copyright: © 2025 The Author(s)
Open Access: Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title: Proceedings of the International Conference on Advances and Applications in Artificial Intelligence (ICAAAI 2025)
Series: Advances in Intelligent Systems Research
Publication Date: 22 June 2025
ISBN: 978-94-6463-738-0
ISSN: 1951-6851

Cite this article

TY  - CONF
AU  - N. Sandeep Chaitanya
AU  - Chinthoju Rohith
AU  - Dunna Shiva Prasad
AU  - Golla Durgesh Yadav
AU  - Natta Rishitha
PY  - 2025
DA  - 2025/06/22
TI  - Hybrid conditional VAE-CLSTM model for generating synchronized music from humming
BT  - Proceedings of the International Conference on Advances and Applications in Artificial Intelligence (ICAAAI 2025)
PB  - Atlantis Press
SP  - 17
EP  - 31
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6463-738-0_3
DO  - 10.2991/978-94-6463-738-0_3
ID  - Chaitanya2025
ER  -
