VOCALOIDS: Re-imagining AI in Music

Illustration by Tejasvi Birdh, UG 25, Plaksha University

Zoya Ghoshal

Vocaloid. An interesting term, right? It is probably a word you have never heard before, but the ‘vocal’ in it should ring a few bells.

A Vocaloid, in essence, is a voice synthesizing technology that uses the principles and concepts of Artificial Intelligence (AI) to create a whole new voice, one so realistic that it sounds as if it belongs to a real human being. The software lets the user, or to use a more fitting term, the artist, set the general mood of a song in accordance with the lyrics they have written. While the software does a lot of the work, the outcome and quality of the final song rest with the user. It is a medium through which artists can express themselves in a way that is profound, relatable and sometimes poignant. But how exactly is software like this useful to the general population? Can it really be something significant to us as a community?

Since the release of the most famous Vocaloid to date, Hatsune Miku, the popularity of AI-generated music has skyrocketed to previously unreachable heights in Japan. A large number of extremely gifted artists have emerged and flourished in the music industry thanks to this software. That said, creating a song is no small feat. Every artist has a distinct style and emotion that they want to express, so the final song depends on the person creating it. The AI essentially helps artists explore their creativity. Some of the particulars of a song that the software takes as inputs are the pitch, the vibrato, the depth of the voice, the rhythm and, of course, the lyrics. But how can we work on these factors without a voice? The fascinating part of this predominantly automated process is that the foundation voice on which the software bases its modifications is provided by a real person.

While we do have a voice and the software to modify it, the modifications to the voice and, finally, the creation of our ‘artificial singer’ need to be done by people as well. These people are the producers of the songs for the Vocaloid. They handle and oversee everything related to the production of a song. Phenomenal artists like Yonezu Kenshi and YOASOBI emerged from the Vocaloid scene and now regularly top Japan’s charts. Creators like these came to be called ‘Vocalo-Ps’.

Hatsune Miku was the first great Vocaloid, and she changed the lives of artists in Japan for generations to come. Vocaloids, in essence, were created to be perceived as a whole new entity, a singer in their own right. Miku’s instantaneous popularity was due, firstly, to her voice being much smoother than those of her predecessors and, secondly, to the fact that her personality could be altered so that her aura fit the music. After her revolutionary creation, the popularity of Vocaloids rose so much that even concerts with AI holograms as the performers were held, often with thousands of people in attendance.

While the music scene in India is diverse and rich, you don’t find many people who venture into listening to music in foreign languages. Even so, the introduction of Vocaloids has the potential to be an enormous success. Indians nowadays tend to listen predominantly to dance numbers rather than songs that require deep introspection, and meaningful lyrics tend to be found mostly in love songs. So, even though we have such a rich culture, the variety of music we listen to is a little restricted.

Music is an integral part of life for most people in India. According to the Digital Music Study 2019 conducted by the Indian Music Industry (IMI), the average Indian listens to about 19 hours of music a week. That works out to about 2.7 hours a day, or roughly 54 three-minute songs daily. It is not surprising, then, that 80 percent of people in India identified themselves as ‘music fanatics’.
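For readers who want to check the arithmetic, the conversion from weekly hours to daily songs can be sketched in a few lines of Python. The 19-hour figure comes from the study cited above; the three-minute song length is a simplifying assumption, and the function name is purely illustrative.

```python
# Figures from the paragraph above. The three-minute song length is a
# simplifying assumption, not a measured average.
HOURS_PER_WEEK = 19
DAYS_PER_WEEK = 7
SONG_LENGTH_MINUTES = 3


def daily_listening(hours_per_week, song_minutes):
    """Return (hours per day, songs per day) for a weekly listening total."""
    hours_per_day = hours_per_week / DAYS_PER_WEEK
    songs_per_day = hours_per_day * 60 / song_minutes
    return hours_per_day, songs_per_day


hours, songs = daily_listening(HOURS_PER_WEEK, SONG_LENGTH_MINUTES)
print(f"{hours:.1f} hours a day, about {songs:.0f} songs")
# prints: 2.7 hours a day, about 54 songs
```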

As for the types of music being listened to, Bollywood music was, predictably, the most preferred genre. Even this genre has its own forms, namely new Bollywood and vintage Bollywood, the nuances of which are self-explanatory. Another popular genre was ‘Oldies’, which primarily consists of music from older generations in various Indian languages.

With that said, while some do have their preferences, the music scene in India is an ever-changing canvas. Our music never conforms to a standard the world may have set. While being radically different from music circulated globally, Indian music has the potential to be universal. The catchiness of our music can make the listener lose themselves in dance, while also giving them a glimpse into our culture.

By and large, Indians tend to stick to their roots and don’t usually deviate from listening to music made in India, which is understandable considering the variety available in our country itself. Vocaloids are a different concept altogether. A majority of Indians are probably not even aware of their existence, in stark contrast to their popularity in Japan.

Music has always been a medium for the expression of thoughts and feelings. Artists use the Vocaloid software as a means to let out their emotions, whether negative, positive or a mix of both. Vocalo-Ps are able to create songs that tell a plethora of elaborate and meaningful stories. These songs can touch the hearts of many, while simultaneously easing the burden on the ones creating them. Artists in Japan use this music to broach sensitive topics like domestic violence, suicide, mental illness and the failings of the governmental system. They create music that focuses more on the lyrics and leaves people with a topic to contemplate long after the song is over. Indian music has the potential to be just as powerful. Raising awareness of the world’s issues among the general population can change how people view a changing world. Judging by the way we write love songs, we would be exemplary at writing lyrics that can revolutionize people’s thinking.

The concept of an artificial being creating music may sound strange, but it can open the minds of many through the exploration of a new type of music. Vocaloids can be a way for amateurs to create their own music without needing to sing themselves. More practically, the success of this concept could create jobs for many and unlock a world of creation and innovation that nobody would have thought of before. By creating music that connects people across different cultures, we can build a community that expresses its deepest emotions through a medium that doesn’t require speaking, only listening.

Our constant suspicion of foreign concepts and our inability to embrace them can lead to precious opportunities passing us by. AI is and always will be an asset to us. Putting AI to use will profit us and future generations immensely, and using it to create Vocaloids might change the music scene in India forever.

The writer is a student at Plaksha University, UG 2026