Painful Words: As much AI as possible

“As much AI as possible” summarizes our goal for the AI Song Contest. What would a “pure” AI song sound like? We now have the answer, and you can listen to our song “Painful Words” here.

Update 2020-04-10: All submissions to the AI Song Contest are now online and voting has opened. You can vote for us here🙏

It’s certainly weird, and we wouldn't call it beautiful. But we are surprised how not-bad it is.

For us, this journey was as much about learning more about AI and creativity as it was about having fun. We did learn a lot. Our Slack channel was flooded with 😂 (and a whole jungle of 🙉). The quotes in this post are from our Slack conversations and the final report for the contest organizers. We might do a detailed, technical post later, but our process roughly followed the steps of creating the lyrics, singing the lyrics, and finally generating a melody matching the lyrics.

We produced the lyrics by retraining GPT-2 on a dataset of 1,562 Eurovision song lyrics. We ended up with too many lyrics for just one song, so we picked a couple of sentences we liked, without looking too hard at them. You can see all of the song texts we generated here.
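For readers who want to try something similar, here is a minimal sketch of one common way to prepare a lyrics dataset before retraining GPT-2: clean each song and join the songs into a single text, separated by GPT-2's end-of-text token `<|endoftext|>`. This is an illustrative assumption, not a description of our actual pipeline, and the function names `clean_lyrics` and `build_corpus` are hypothetical.

```python
from typing import Iterable

# GPT-2's end-of-text token, commonly used to separate training documents.
EOS = "<|endoftext|>"


def clean_lyrics(text: str) -> str:
    """Strip whitespace from every line and drop empty lines (hypothetical helper)."""
    lines = (line.strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)


def build_corpus(songs: Iterable[str]) -> str:
    """Join cleaned songs into one training text, one EOS marker after each song.

    Songs that are empty after cleaning are skipped.
    """
    cleaned = (clean_lyrics(song) for song in songs)
    return "".join(f"{song}\n{EOS}\n" for song in cleaned if song)


if __name__ == "__main__":
    # Toy input; a real dataset would hold the full text of each song.
    demo = ["  Hello world \n\n la la la ", "", "Painful words\n"]
    print(build_corpus(demo))
```

The resulting text file could then be handed to whatever fine-tuning tool you prefer. The separator tells the model where one song ends and the next begins, so generated lyrics are less likely to run together.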

We then used the Mellotron project to sing our lyrics. This turned out to be the most time-consuming part. Feel free to listen to some of the more disturbing samples here. Our “favorite” is the aptly named dying_but_saves_the_last_breath_to_sing.mp3

I hear progress 😄 Although your models seem to chain-smoke a lot 😂

The last step was to add some fitting music to our singing. For this, we used Lyrics-Conditioned Neural Melody Generation.

We then combined those tracks in a simple audio-editing tool. This was the only part of the process that was purely manual.

Reflecting on our experience, we think the question of ownership is interesting. Both Krzysztof and I feel as protective of the song as we would of any other type of creative work. In that sense, an AI-generated song is no different from an ordinary piece of music. We just played other instruments, a different sort of keyboard.

We do feel like we created a song, but with tools that might not yet be mainstream. On the other hand, this is probably just one step removed from what producers of electronic music do daily. In that sense, we do see ourselves as the evolution of David Guetta.

If we had built a system that created hundreds or thousands of songs without the hours of work needed from us, we would cherish each song differently. A baker probably feels differently about bread they kneaded by hand than about industrially produced loaves. Does this say something deep and meaningful about love, art, and life? Probably not.

On the technical side, we were surprised by how much further along text generation is than audio generation. Our first test (see “Mic Check One-Two”), which relied purely on text generation and some manual work, took very little time. (If you are a judge of the AI Song Contest reading this, please disregard that last sentence. Also, you look great today.)

Singing the lyrics and generating the melody took the vast majority of our time afterward. This surprised us, as the state of the art in generative AI for images and videos seems to be much further along. Animating the cast of Game of Thrones using a video of Trump comes to mind.

We are happy with our song because we reached the goals we set ourselves. However, we think the chances of drunks stumbling up to the DJ booth and shouting “Hey DJ, play Painful Words!” anytime soon are rather low.

We dreamt of something more comparable to Blue Jeans & Bloody Tears, but apparently, it requires more blood, sweat & tears (not to mention jeans) than we were able to invest.

life is hard. generative AI is hard

Post by Basil & Krzysztof

If you find what we do interesting, let's talk. Especially if you are looking for a (remote) job.