
Google’s MusicLM is rather good at creating music from text descriptions

MusicLM is Google’s latest generative AI, and it can turn text descriptions of varying complexity into high-fidelity music.

MusicLM casts the process of conditional music generation as a hierarchical sequence-to-sequence modeling task, and it generates music at 24 kHz that remains consistent over several minutes.
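For a sense of scale, the 24 kHz figure can be turned into raw sample counts. The Python sketch below is only illustrative arithmetic, not anything from MusicLM itself: the helper name `num_samples` is hypothetical, and it assumes a single (mono) channel, which the article does not specify.

```python
SAMPLE_RATE_HZ = 24_000  # from the article: MusicLM generates audio at 24 kHz


def num_samples(duration_seconds: float, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Return how many samples one channel of audio of this length contains.

    Hypothetical helper for illustration; assumes a single (mono) channel.
    """
    return int(round(duration_seconds * sample_rate_hz))


clip_30s = num_samples(30)       # a 30-second clip: 720,000 samples
clip_5min = num_samples(5 * 60)  # a five-minute piece: 7,200,000 samples
```

Under those assumptions, a five-minute piece works out to over seven million samples, which gives some idea of why staying consistent over several minutes is hard.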

Text-to-music models are not new, but Google says (via TechCrunch) “MusicLM outperforms previous systems both in audio quality and adherence to the text description.” The rich caption examples below generated 30-second audio pieces:

  • “The main soundtrack of an arcade game. It is fast-paced and upbeat, with a catchy electric guitar riff. The music is repetitive and easy to remember, but with unexpected sounds, like cymbal crashes or drum rolls.”
  • “Epic soundtrack using orchestral instruments. The piece builds tension, creates a sense of urgency. An a cappella chorus sing in unison, it creates a sense of power and strength.”
  • “This is an r&b/hip-hop music piece. There is a male vocal rapping and a female vocal singing in a rap-like manner. The beat is comprised of a piano playing the chords of the tune with an electronic drum backing. The atmosphere of the piece is playful and energetic. This piece could be used in the soundtrack of a high school drama movie/TV show. It could also be played at birthday parties or beach parties.”

One particularly fun demo takes a description of a painting and sets it loose:

There are also five-minute generations for “melodic techno” (below) and “swing”:

MusicLM is capable of generating various genres and even replicating “musician experience level” (e.g., beginner, intermediate, professional). Going forward, Google might explore generating lyrics, improving vocal quality, and moving to higher sample rates.

Google has “no plans to release models at this point,” citing the need for more work. More generated music examples can be found here. MusicLM joins the company’s work on text-to-image and text-to-video.


Author

Abner Li

Editor-in-chief. Interested in the minutiae of Google and Alphabet. Tips/talk: abner@9to5g.com