Have you heard of MusicGen? It's an AI-powered music generator from Meta that converts a text description into 12 seconds of audio. In a world where artistic activities are increasingly influenced by Artificial Intelligence, the music industry is now firmly in its sphere of influence. Just as ChatGPT and other large language models can generate lyrics, Meta has now released an open-source AI model that generates music.
What is MusicGen?
MusicGen is based on a Transformer model, just like most of the language models in use today. It predicts the next segment of a piece of music in a similar way to how a language model predicts the next words in a sentence. An impressive 20,000 hours of licensed music was used for training, including 10,000 high-quality audio recordings from an internal dataset as well as music data from Shutterstock and Pond5.
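To make the autoregressive idea concrete, here is a deliberately simplified sketch: the "model" below is just a hypothetical lookup table standing in for the real Transformer, but the generation loop has the same shape — each new token is predicted from everything generated so far.

```python
# Toy illustration of next-segment prediction, as MusicGen does with audio
# tokens. NEXT_TOKEN is a made-up transition table, not a real network.
NEXT_TOKEN = {
    "intro": "verse",
    "verse": "chorus",
    "chorus": "verse",
}


def generate(prompt: str, steps: int) -> list[str]:
    """Greedily extend the sequence one token at a time."""
    sequence = [prompt]
    for _ in range(steps):
        # The real model scores all possible next tokens; here we just look up
        # a fixed successor for the most recent token.
        sequence.append(NEXT_TOKEN[sequence[-1]])
    return sequence


print(generate("intro", 3))  # → ['intro', 'verse', 'chorus', 'verse']
```

The real model works on compressed audio tokens rather than words, but the loop — condition on the sequence so far, predict one step, append, repeat — is the same.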
A unique feature of MusicGen is its ability to process both text and melody cues. The text you enter sets the basic style that the music in the audio file will follow. However, it is important to know that the melody is not reproduced exactly in the output; it only serves as a general guideline for generation.
Just try MusicGen
MusicGen can be tested using the Hugging Face API, although generating music may take some time depending on the number of concurrent users. For faster results, you can use the Hugging Face website to set up your own instance of the model. And if you have the necessary skills and equipment, you can even download the code and run it manually.
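If you do want to run it yourself, a minimal sketch with Meta's open-source `audiocraft` library might look like the following. The prompt and output filename are illustrative, and the small checkpoint name is assumed to be `facebook/musicgen-small`; downloading it requires a working PyTorch install and about 1.5 GB of disk space.

```python
# Sketch: generate a 12-second clip locally with audiocraft (assumptions:
# audiocraft is installed via `pip install audiocraft`, checkpoint name
# "facebook/musicgen-small", illustrative prompt).

PROMPT = "lo-fi hip hop beat with mellow piano, 90 bpm"  # illustrative
DURATION_SECONDS = 12  # matches the 12-second clips mentioned above


def build_generation_params(duration: int) -> dict:
    """Collect the generation settings passed to MusicGen."""
    if not 1 <= duration <= 30:
        raise ValueError("MusicGen works best with short clips (1-30 s)")
    return {"duration": duration}


def main() -> None:
    # Heavy imports live inside main() so the helper above can be used
    # without downloading the model checkpoint.
    from audiocraft.models import MusicGen
    from audiocraft.data.audio import audio_write

    model = MusicGen.get_pretrained("facebook/musicgen-small")
    model.set_generation_params(**build_generation_params(DURATION_SECONDS))
    wav = model.generate([PROMPT])  # batch of 1 prompt -> tensor [1, C, T]
    # Write the result to musicgen_demo.wav with loudness normalization.
    audio_write("musicgen_demo", wav[0].cpu(), model.sample_rate,
                strategy="loudness")


if __name__ == "__main__":
    main()
```

Generation is much faster on a GPU; on CPU the same 12-second clip can take several minutes.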
Your description is used by the MusicGen model to generate 12 seconds of audio. You can also provide a reference audio file from which the melody is extracted. With a reference audio, the model attempts to create music that better matches your preferences by following both the description and the provided melody.
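Melody conditioning can also be sketched in code. This is an assumption-laden example: it presumes the melody-conditioned checkpoint is named `facebook/musicgen-melody`, that `torchaudio` is installed, and that a reference file `reference.wav` exists on disk.

```python
# Sketch: text + melody conditioning with audiocraft (assumed checkpoint
# "facebook/musicgen-melody"; "reference.wav" is a placeholder filename).

def check_sample_rate(sr: int) -> int:
    """Basic sanity check on the reference audio's sample rate."""
    if sr <= 0:
        raise ValueError("sample rate must be positive")
    return sr


def main() -> None:
    import torchaudio
    from audiocraft.models import MusicGen

    melody, sr = torchaudio.load("reference.wav")  # placeholder file
    model = MusicGen.get_pretrained("facebook/musicgen-melody")
    model.set_generation_params(duration=12)
    # The model follows the text for style and the reference for melody.
    wav = model.generate_with_chroma(
        descriptions=["upbeat acoustic folk"],  # illustrative prompt
        melody_wavs=melody[None],               # add a batch dimension
        melody_sample_rate=check_sample_rate(sr),
    )


if __name__ == "__main__":
    main()
```

As noted above, the reference melody only guides the output; do not expect it to be reproduced note for note.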