ElevenLabs Music, branded as Eleven Music, is a text-to-music model from ElevenLabs that generates studio-grade vocal and instrumental tracks from natural-language prompts.
Eleven Music takes a natural-language description and turns it into a fully produced track, either with vocals or as an instrumental. You can specify genre, style, mood, and overall song structure in plain English, and the model returns a finished take ready for review, editing, or download.

Instead of generating an entire song from a single prompt, Eleven Music lets you define each section — intro, verse, chorus, bridge, outro — with its own style, duration, and lyrics. This makes it possible to build arrangements that shift mood mid-song or align tightly with the structure of a video, game scene, or podcast episode.
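
As a rough illustration of section-level control, the sketch below models a song as a list of sections and rescales their durations to match a target length, such as the runtime of a video. The `Section` class, its field names, and `fit_to_length` are hypothetical and are not taken from the Eleven Music API; the actual request format may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Section:
    """One song section. Hypothetical structure, not the Eleven Music schema."""
    name: str
    style: str
    duration_ms: int
    lyrics: List[str] = field(default_factory=list)


def total_duration_ms(sections):
    """Sum the durations of all sections in milliseconds."""
    return sum(s.duration_ms for s in sections)


def fit_to_length(sections, target_ms):
    """Scale section durations proportionally so they sum to target_ms."""
    current = total_duration_ms(sections)
    if current <= 0:
        raise ValueError("sections must have a positive total duration")
    scaled = [
        Section(s.name, s.style, s.duration_ms * target_ms // current, list(s.lyrics))
        for s in sections
    ]
    # Integer division can leave a few milliseconds unassigned;
    # give them to the last section so the total is exact.
    scaled[-1].duration_ms += target_ms - total_duration_ms(scaled)
    return scaled


plan = [
    Section("intro", "sparse piano, calm", 10_000),
    Section("verse", "warm vocals, soft drums", 30_000, ["First line of the verse"]),
    Section("chorus", "full band, uplifting", 20_000, ["Hook line"]),
]

# Fit a 60-second draft plan to a 45-second video cut.
fitted = fit_to_length(plan, 45_000)
```

Keeping the plan as plain data makes it straightforward to adjust one section (for example, a new style for the chorus) and resubmit without touching the rest.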

Once a track is generated, you can edit either the sound or the lyrics of any section, or apply changes across the whole song. This in-place editing flow is designed for iteration: tweak a chorus melody, rewrite a verse, or change the production direction of a section without losing what already worked.

Eleven Music ships with multilingual support for both lyrics and vocal performance, including English, Spanish, German, and Japanese, among others. The model handles pronunciation, phrasing, and stress patterns for each language so vocals sound native rather than transliterated.

Music Finetunes let you upload five to ten minutes of your own reference audio and produce a finetuned version of the model that captures that sonic signature. The training process completes quickly, and the resulting finetune can then be used to generate new tracks that share the timbre and production style of your reference set.
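
To make the five-to-ten-minute guideline concrete, here is a small sketch that totals the lengths of candidate reference clips and reports whether the set falls inside that window. The function name and return labels are hypothetical, and whether the product enforces these bounds strictly is an open question here; treat it as a pre-upload sanity check only.

```python
MIN_TOTAL_SECONDS = 5 * 60   # lower bound from the text above
MAX_TOTAL_SECONDS = 10 * 60  # upper bound from the text above


def check_reference_set(clip_seconds):
    """Classify a set of clip lengths (in seconds) against the 5-10 minute window."""
    total = sum(clip_seconds)
    if total < MIN_TOTAL_SECONDS:
        return "too_short", total
    if total > MAX_TOTAL_SECONDS:
        return "too_long", total
    return "ok", total


# Three clips totalling just under six minutes.
status, total = check_reference_set([95.0, 120.5, 140.0])
```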

Beyond your own finetunes, ElevenLabs maintains a library of curated finetunes that capture specific global genres, including Afro House, Reggaeton, Arabic Groove, and 70s Cambodian Rock. These give you a faster path to authentic stylistic output without training a finetune yourself.

Eleven Music exposes a real-time streaming API designed for interactive use cases — adaptive game soundtracks, live experiences, and responsive apps where waiting for a complete file is too slow. ElevenLabs offers both self-serve pay-as-you-go access and Enterprise plans, with output cleared for film, television, podcasts, social media, advertising, and gaming.
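
The benefit of streaming for interactive use is that playback can begin before the whole track has arrived. The sketch below shows that idea in isolation: it holds back audio until a small initial buffer has filled, then passes later chunks through as they come in. `play_when_buffered` and `fake_stream` are hypothetical names, and `fake_stream` stands in for a network response; none of this reflects the actual Eleven Music client or wire format.

```python
from typing import Iterable, Iterator


def play_when_buffered(chunks: Iterable[bytes], min_buffer: int) -> Iterator[bytes]:
    """Yield audio for playback once min_buffer bytes have accumulated."""
    buffer = bytearray()
    started = False
    for chunk in chunks:
        if started:
            # Playback is under way: hand each chunk over immediately.
            yield chunk
            continue
        buffer.extend(chunk)
        if len(buffer) >= min_buffer:
            started = True
            yield bytes(buffer)
    if not started and buffer:
        # Stream ended before the threshold was reached; flush what we have.
        yield bytes(buffer)


def fake_stream() -> Iterator[bytes]:
    """Stand-in for a streamed response: five 4-byte chunks."""
    for i in range(5):
        yield bytes([i]) * 4


played = list(play_when_buffered(fake_stream(), min_buffer=8))
```

A larger `min_buffer` trades startup latency for resilience against network jitter, which is the main tuning decision in this kind of consumer.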

Everything you need to know about Eleven Music
Eleven Music is a text-to-music AI model from ElevenLabs that generates studio-grade music from natural-language prompts. It launched in August 2025 in collaboration with music industry partners including Merlin and Kobalt.
You describe the song in natural language — genre, style, structure, vocals or instrumental — and the model generates a fully produced track. You can also define each section with its own style, duration, and lyrics, then edit either the audio or the lyrics after generation.
Generated tracks are cleared for nearly all commercial uses, including film, television, podcasts, social media, advertising, and gaming, thanks to ElevenLabs' partnerships with industry rights holders.
The model can generate lyrics and vocals in English, Spanish, German, Japanese, and other languages, with handling for pronunciation and phrasing tailored to each language.
Music Finetunes are custom versions of the Eleven Music model trained on your own audio. With five to ten minutes of reference material, you can produce a finetune that captures your sonic signature and use it to generate new tracks in that style.
The Eleven Music API supports real-time streaming generation for interactive applications, with self-serve pay-as-you-go access and Enterprise plans available. Output is cleared for the same broad range of commercial uses as in the web product.