Cuty.ai

ElevenLabs Music

ElevenLabs Music, branded as Eleven Music, is a text-to-music model from ElevenLabs that generates studio-grade vocal and instrumental tracks from natural-language prompts.

Key Features of ElevenLabs Music

  • Text-to-Music Generation: Generate studio-grade vocal or instrumental tracks from a natural-language prompt describing genre, style, and structure.
  • Section-by-Section Composition: Define independent styles, durations, and lyrics for each section of a song instead of treating it as one block.
  • Sound and Lyric Editing: Edit either the audio or the lyrics of an individual section, or the whole song, without regenerating from scratch.
  • Multilingual Lyrics and Vocals: Generate lyrics and vocals in English, Spanish, German, Japanese, and other supported languages.
  • Music Finetunes on Your Own Audio: Train a custom finetune of the model on five to ten minutes of your own audio for a personal sonic signature.
  • Curated Genre Finetunes: Use ElevenLabs' curated finetunes across niche genres like Afro House, Reggaeton, Arabic Groove, and 70s Cambodian Rock.
  • Real-Time Streaming API: Stream music generation in real time through the Eleven Music API for interactive games, apps, and live experiences.

Text-to-Music Generation

Eleven Music takes a natural-language description and turns it into a fully produced track with vocals or as an instrumental. You can specify genre, style, mood, and overall song structure in plain English, and the model returns a finished take ready for review, editing, or download.

Section-by-Section Composition

Instead of generating an entire song from a single prompt, Eleven Music lets you define each section — intro, verse, chorus, bridge, outro — with its own style, duration, and lyrics. This makes it possible to build arrangements that shift mood mid-song or align tightly with the structure of a video, game scene, or podcast episode.
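
The per-section idea above can be sketched as a small data structure. This is only an illustration: the class names `Section` and `SongPlan` and the fields `name`, `style`, `duration_s`, and `lyrics` are hypothetical and are not taken from the Eleven Music API schema.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class Section:
    """One song section. Field names are illustrative, not an official schema."""
    name: str          # e.g. "intro", "verse", "chorus"
    style: str         # free-text style description for this section only
    duration_s: float  # target length of the section in seconds
    lyrics: list[str] = field(default_factory=list)  # empty means instrumental


@dataclass
class SongPlan:
    """An ordered list of sections that together describe one song."""
    sections: list[Section]

    def total_duration_s(self) -> float:
        # Sum of the per-section targets.
        return sum(s.duration_s for s in self.sections)

    def to_payload(self) -> dict:
        # Plain dict that could be serialized to JSON for a request body.
        return asdict(self)


plan = SongPlan(sections=[
    Section("intro", "sparse ambient pads", 10),
    Section("verse", "warm lo-fi groove, soft vocal", 30,
            ["City lights are fading slow"]),
    Section("chorus", "bright synth-pop, layered harmonies", 20,
            ["We run until the morning"]),
])
print(plan.total_duration_s())
```

Keeping style, duration, and lyrics on each section, rather than on the song as a whole, is what lets a plan like this shift mood mid-song or be sized to fit a specific scene length.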

Sound and Lyric Editing

Once a track is generated, you can edit either the sound or the lyrics of any section, or apply changes across the whole song. This in-place editing flow is designed for iteration: tweak a chorus melody, rewrite a verse, or change the production direction of a section without losing what already worked.

Multilingual Lyrics and Vocals

Eleven Music ships with multilingual support for both lyrics and vocal performance, including English, Spanish, German, and Japanese among other languages. The model handles pronunciation, phrasing, and stress patterns for each language so vocals sound native rather than transliterated.

Music Finetunes on Your Own Audio

Music Finetunes let you upload five to ten minutes of your own reference audio and produce a finetuned version of the model that captures that sonic signature. The training process completes quickly, and the resulting finetune can then be used to generate new tracks that share the timbre and production style of your reference set.

Curated Genre Finetunes

Beyond your own finetunes, ElevenLabs maintains a library of curated finetunes that capture specific global genres, including Afro House, Reggaeton, Arabic Groove, and 70s Cambodian Rock. These give you a faster path to authentic stylistic output without training a finetune yourself.

Real-Time Streaming API

Eleven Music exposes a real-time streaming API designed for interactive use cases — adaptive game soundtracks, live experiences, and responsive apps where waiting for a complete file is too slow. ElevenLabs offers both self-serve pay-as-you-go access and Enterprise plans, with output cleared for film, television, podcasts, social media, advertising, and gaming.
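
To illustrate why streaming suits interactive use, here is a minimal sketch of a client that handles audio chunk by chunk instead of waiting for a complete file. It does not call the Eleven Music API: `fake_music_stream` is a hypothetical stand-in that yields placeholder bytes, and `consume` is an illustrative helper name.

```python
from typing import Callable, Iterable, Iterator


def fake_music_stream(total_bytes: int, chunk_size: int) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming response: yields bytes chunk by chunk."""
    sent = 0
    while sent < total_bytes:
        n = min(chunk_size, total_bytes - sent)
        yield bytes(n)  # placeholder zero bytes; a real stream would carry encoded audio
        sent += n


def consume(stream: Iterable[bytes], on_chunk: Callable[[bytes], None]) -> int:
    """Pass each chunk to a callback as soon as it arrives; return total bytes received."""
    received = 0
    for chunk in stream:
        on_chunk(chunk)  # e.g. feed an audio player buffer
        received += len(chunk)
    return received


chunks_seen: list[bytes] = []
total = consume(fake_music_stream(10_000, 4096), chunks_seen.append)
print(total, len(chunks_seen))
```

In an actual integration, the callback would feed a playback buffer so that audio can start before generation has finished.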

Frequently Asked Questions

Everything you need to know about Eleven Music

Eleven Music is a text-to-music AI model from ElevenLabs that generates studio-grade music from natural-language prompts. It launched in August 2025 in collaboration with music industry partners including Merlin and Kobalt.

You describe the song in natural language — genre, style, structure, vocals or instrumental — and the model generates a fully produced track. You can also define each section with its own style, duration, and lyrics, then edit either the audio or the lyrics after generation.

Generated tracks are cleared for nearly all commercial uses, including film, television, podcasts, social media, advertising, and gaming, thanks to ElevenLabs' partnerships with industry rights holders.

Eleven Music supports multiple languages. The model can generate lyrics and vocals in English, Spanish, German, Japanese, and other languages, with handling for pronunciation and phrasing tailored to each language.

Music Finetunes are custom versions of the Eleven Music model trained on your own audio. With five to ten minutes of reference material, you can produce a finetune that captures your sonic signature and use it to generate new tracks in that style.

Eleven Music is also available through an API. The Eleven Music API supports real-time streaming generation for interactive applications, with self-serve pay-as-you-go access and Enterprise plans available. Output is cleared for the same broad range of commercial uses as in the web product.
