Genistat Blog

Tales from the frontier of AI and media

“As much AI as possible” summarizes our goal for the AI Song Contest. What would a “pure” AI song sound like? We now have the answer, and you can listen to our song “Painful Words” here.

Update 2020-04-10: All submissions to the AI Song Contest are now online and voting has opened. You can vote for us here 🙏

It’s certainly weird, and we wouldn't call it beautiful. But we are surprised by how not-bad it is.

For us, this journey was as much about learning more about AI and creativity as it was about having fun. We did learn a lot. Our Slack channel was flooded with 😂 (and a whole jungle of 🙉). The quotes in this post are from our Slack conversations and the final report for the contest organizers. We might do a detailed, technical post later, but our process roughly followed the steps of creating the lyrics, singing the lyrics, and finally generating a melody matching the lyrics.

We produced the lyrics by retraining GPT-2 on a dataset with 1'562 Eurovision song lyrics. We ended up with too many lyrics for just one song, so we picked a couple of sentences we liked, without looking too hard at them. You can see all of the song texts we generated here.

We then used the Mellotron project to sing our lyrics. This turned out to be the most time-consuming part. Feel free to listen in to some of the more disturbing samples here. Our “favorite” is the aptly named dying_but_saves_the_last_breath_to_sing.mp3

I hear progress 😄 Although your models seem to chain-smoke a lot 😂

The last step was to add some fitting music to our singing. For this, we used Lyrics-Conditioned Neural Melody Generation.

We then combined those tracks in a simple audio-editing tool. This was the only part of the process that was purely manual.

Reflecting on our experience, we think the question of ownership is interesting. Both Krzysztof and I feel protective of our song, as if it were any other type of creative work. In that sense, an AI-generated song is no different from an ordinary piece of music. We just played other instruments, a different sort of keyboard.

We do feel like we created a song, but with tools that might not yet be mainstream. On the other hand, this is probably just one step removed from what producers of electronic music do daily. In that sense, we do see ourselves as the evolution of David Guetta.

If we had built a system that created hundreds or thousands of songs without the hours of work needed from us, we would cherish each song differently. A baker probably feels differently about bread they kneaded by hand than about industrially produced loaves. Does this say something deep and meaningful about love, art, and life? Probably not.

On the technical side, we were surprised by how much further along text generation is than audio generation. Our first test (see “Mic Check One-Two”), which relied purely on text generation and some manual work, took very little time. (If you are a judge of the AI Song Contest reading this, please disregard that last sentence. Also, you look great today.)

Getting the lyrics sung and generating the melody took the vast majority of our time afterward. This surprised us, as the state of the art in generative AI for images and videos seems to be much further along. Animating the cast of Game of Thrones using a video of Trump comes to mind.

We are happy with our song because we reached the goals we set ourselves. However, we think the chances of drunks stumbling up to the DJ booth and shouting “Hey DJ, play Painful Words!” anytime soon are rather low.

We dreamt of something more comparable to Blue Jeans & Bloody Tears, but apparently, it requires more blood, sweat & tears (not to mention jeans) than we were able to invest.

life is hard. generative AI is hard

Post by Basil & Krzysztof

If you find what we do interesting, let's talk. Especially if you are looking for a (remote) job.

Sign-up to our newsletter💌 to get updates from the frontier of AI in media.

Our first test for the Eurovision AI Song Contest turned out surprisingly hot. Have a listen.

How we did it 💪

Very broadly speaking, a song consists of lyrics and music. Ideally, the two fit well together (If computer science doesn't work out, I see a big future for us in the music business). We decided to start with the lyrics.

We used an existing collection of Eurovision songs and retrained OpenAI's GPT-2 model using gpt-2-simple. Our hope was that GPT-2 would be able to pick up on what makes a Eurovision song.
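
The generated sample further down contains <|startoftext|> and <|endoftext|> markers, which suggests that each song in the training file was wrapped in them. Here is a minimal sketch of how such a training file could be put together, assuming one string per song. The function name and the toy corpus are made up for illustration and are not our actual code. The resulting plain-text file is the kind of dataset that gpt-2-simple's finetune function accepts.

```python
from typing import Iterable

START_TOKEN = "<|startoftext|>"
END_TOKEN = "<|endoftext|>"


def build_training_text(songs: Iterable[str]) -> str:
    """Wrap every non-empty lyric in start/end tokens, one block per song."""
    blocks = []
    for song in songs:
        text = song.strip()
        if text:  # skip empty entries so we never emit an empty song
            blocks.append(f"{START_TOKEN}\n{text}\n{END_TOKEN}")
    return "\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Hypothetical two-song corpus standing in for the real lyrics collection.
    corpus = [
        "I've been down, down too many times\nBut I'm still alive",
        "Oh my heart beats and shakes",
    ]
    print(build_training_text(corpus))
```

Marking song boundaries explicitly gives the model a chance to learn where a song starts and ends, instead of treating the whole collection as one endless lyric.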

Using this retrained model, we could generate new song texts. Like this one:

======== SAMPLE 1 ======== ....o, ooh... ooh..." Oh, oh... oh....ooh... oh... oh... oh... oh....ooh... oh....... Oh... oh... oh... oh... oh.... . Oh... oh... oh... oh... oh... oh... oh... oh.... . Hey, yeah.... Hey, yeah.... . Oh.... Oh. Oh.... Oh.... . Oh, oh, oh.... Oh oh... oh.... Oh oh.... Oh oh.... . Oh... oh, oh... oh... oh. . Oh! Oh, oh... oh.... Oh oh... oh, oh.... Oh oh.... . Oh... oh oh... oh... oh oh... oh.... Oh.... Oh oh.... . Oh, oh... oh... oh.... Oh oh... oh... oh.... Oh oh.... . Oh... oh oh... oh. Oh oh, oh... oh, oh.... Oh... oh... oh... oh... oh oh.... Oh oh... oh... oh.... . Oh, oh... oh oh... oh Oh... oh oh... oh... oh.... Oh oh... oh... oh... oh... oh.... Oh oh... oh... oh.... Oh oh... oh....... Oh oh oh.... . Oh... oh oh... oh Oh... oh... oh.... Oh oh oh... oh... oh.... Oh, oh oh.... Ah... oh oh.... . Oh... oh oh... oh... oh... oh.... Oh oh oh... oh <|endoftext|> <|startoftext|> It pains me to say... you made my life so much more. . What an impossible dream. I can't imagine what will happen. . So many people that I know. All I could ever imagine. To be like you. . What am I all wrong??. Oh... what am I all wrong?. . What am I all wrong?. Oh... what am I all wrong?. . So many people that I know. All I can ever imagine. To be like you. . What am I all wrong?. Oh... what am I all wrong?."All I can ever imagine (All I can imagine). To be like you. . What am I all wrong?. Oh... what am I all wrong?. . So many people that I know. All I can ever imagine. To be like you. . What am I all wrong?. Ah... what am I all wrong?. . So many people that I know. All I can ever imagine. To be like you <|endoftext|> <|startoftext|> I could stay or go. I could go any time I wanted. I could always get away. I could get out of my head. I could never ever stop thinking. . I could feel the time is right. I could feel the heart is still blue. The love that I think I need. It makes me sad. . I could feel the time is right. I could feel the heart is still blue. 
The love that I think I need. It makes me sad. . A million stars, a million days. I can only pray that someday. I can go any time I want. I can always get away. I could never ever stop thinking. . I could feel the time is right. I could feel the heart is still blue. The love that I think I need. It makes me sad. . I could feel the time is right. I could feel the heart is still blue. The love that I think I need. It makes me sad. . Oh my heart beat and shakes. It's like a million angels. That I pray to keep. . Oh my heart is beating with love and hope. It's like a million angels. That I pray to keep. . Oh my heart beats and shakes. It's like a million angels. That I pray to keep. . Oh my heart beats and shakes. It's like a million angels. That I pray to keep. . It's that very same feeling. That the stars really are blue. And I pray to keep. . I could feel the time is right. I could feel the heart is still blue. The love that I think I need. It makes me sad. . I could feel the time is right. I could feel the heart is still blue. The love that I think I need. It makes me sad. . I could feel the time is right. I could feel the heart is still blue. The love that I think I need. It makes me sad <|endoftext|> <|startoftext|> You're taking me to another dimension. Every time you touch me. I feel what I need. The gravity is real. . I knew we were nothing. All that we could be. But I knew all that we could be. . But I knew we couldn't make anything else go away. It feels so good to see you. . I can't help but smile, I gotta see how that is. And I know that

As you can see, it works well. Sort of. There is a chorus that repeats, and there have definitely been more nonsensical texts in the history of music (Yes, I'm looking at you, Scooter. I still love you, though). And nobody cares about your punctuation when you storm the charts.
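
The sample above can also be taken apart with a few lines of code. This is a minimal sketch, assuming that the <|startoftext|> and <|endoftext|> markers delimit songs and that sentences are separated by ". " as in the sample. The function names are made up for illustration. It keeps only complete songs, dropping the truncated fragments at both ends of a sample, and counts repeated sentences as chorus candidates.

```python
import re
from collections import Counter
from typing import List, Tuple

START_TOKEN = "<|startoftext|>"
END_TOKEN = "<|endoftext|>"


def complete_songs(raw: str) -> List[str]:
    """Return only the songs that have both a start and an end token."""
    pattern = re.escape(START_TOKEN) + r"(.*?)" + re.escape(END_TOKEN)
    return [m.strip() for m in re.findall(pattern, raw, flags=re.DOTALL)]


def repeated_lines(song: str, min_count: int = 2) -> List[Tuple[str, int]]:
    """List sentences that occur at least min_count times, most frequent first."""
    # Assumption: sentences are separated by ". ", as in the sample output.
    sentences = [s.strip(" .") for s in song.split(". ")]
    counts = Counter(s for s in sentences if s)
    return [(s, n) for s, n in counts.most_common() if n >= min_count]


if __name__ == "__main__":
    raw = (
        "oh... oh <|endoftext|> <|startoftext|> It makes me sad. . "
        "It makes me sad. . A million stars <|endoftext|> "
        "<|startoftext|> And I know that"
    )
    for song in complete_songs(raw):
        print(repeated_lines(song))
```

A sentence that shows up many times is a decent chorus candidate, which makes this a quick way to skim a large pile of generated songs before picking lines by hand.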

Let's do a test. Guess which sentence is from our model, and which is from ABBA:

  1. “I've been down, down too many times. But I'm still alive, I'm still alive.”
  2. “So go away, God bless you. You are still my love and my life.”

Sentence 2 is from ABBA's “My Love, My Life”. But I could imagine them singing “I've been down, down too many times. But I'm still alive, I'm still alive.” in those glorious 70s pants. And then have Cher cover it 😍

To get to a finished song, we selected a couple of sentences that we liked out of the generated text. We obviously needed somebody to sing the lyrics. Enter the stage, the macOS say command, Microsoft Sam's sexy cousin.

With our digital front-person selected, we chose a heavy beat to compensate for the lack of content (I truly think music production could be our thing).

We played the beat and had our digital front-person speak our lyrics. Music 🎶
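
For the curious, here is a minimal sketch of how the speaking part could be scripted, assuming the front-person is the text-to-speech command built into macOS, say, with its -v (voice) and -o (output file) options. The helper names, the voice name, and the file name are assumptions for illustration; which voices are available depends on the machine.

```python
import subprocess
import sys
from typing import Iterable, List, Optional


def build_say_command(text: str, voice: Optional[str] = None,
                      out_file: Optional[str] = None) -> List[str]:
    """Build the argument list for one call to the macOS `say` command."""
    cmd = ["say"]
    if voice:
        cmd += ["-v", voice]      # pick one of the installed voices
    if out_file:
        cmd += ["-o", out_file]   # write audio to a file instead of the speakers
    cmd.append(text)
    return cmd


def speak_lines(lines: Iterable[str], voice: Optional[str] = None) -> None:
    """Speak each lyric line in turn; only works on macOS."""
    if sys.platform != "darwin":
        raise RuntimeError("the `say` command is only available on macOS")
    for line in lines:
        subprocess.run(build_say_command(line, voice), check=True)


if __name__ == "__main__":
    # Hypothetical voice and file name, printed rather than executed.
    print(build_say_command("I've been down, down too many times",
                            voice="Alex", out_file="verse1.aiff"))
```

Writing each line to its own audio file makes it easier to line the vocals up with the beat afterwards.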

The way forward 🚀

There are still many interesting issues we need to tackle for our final submission. We want as much of the song as possible to be AI-generated, so we are now focusing our attention on generating the music. Once we can do that, we will need to make it fit the words and vice versa. Ideally, we also find a way to sing the lyrics, instead of just speaking them (picking up phone: “Oh hi Grimes, yes, I'm available to produce your next album.”).

Post by Basil
