Image to Video
Discover what makes Wan 2.5 Preview exceptional
Provide an audio clip, and Wan 2.5 Preview animates a static character image so that it speaks with realistic, natural expressions and mouth movements. This streamlines workflows for narration, dialogue, virtual presenters, and digital humans.

Wan 2.5 Preview supports text, image, and audio inputs for true multimodal creation. Generate video from descriptions or images, or use audio as the starting point. You can start creating with whatever asset you have on hand.

Wan 2.5 Preview pursues cinematic realism with enhanced motion dynamics and stability. Subjects remain highly consistent, avoiding distortion or jitter. It also better interprets complex prompts, including cinematic camera moves like panning, zooming, and focus shifts.

Wan 2.5 Preview generates videos up to 10 seconds long for more complete narratives. It offers three output resolutions (480p, 720p, and 1080p) to match platform needs, so you can choose the right level of clarity for your project on Cuty.ai. For pure motion transfer, see Wan 2.2.

Everything you need to know about Wan 2.5 Preview
Wan 2.5 Preview is Alibaba's next-gen multimodal AI video model. Its key breakthrough is audio-driven video generation: it creates realistic 1080p videos of characters speaking, with closely synchronized lip movements and natural facial expressions.
Wan 2.5 Preview also features enhanced motion dynamics for more fluid movement, improved contextual understanding for complex prompts, richer visual details in scene composition, and often faster processing times compared to earlier general video models.
Wan 2.2 focuses on motion transfer (animation/replacement) from a reference video. Wan 2.5 Preview focuses on lip-sync and animation driven by a reference audio file. Use 2.2 to make characters dance; use 2.5 to make them talk.
Making a static character image speak is the core use case for Wan 2.5 Preview. Provide the character image and an audio clip of their speech, and the model generates a 1080p video with realistic expressions and accurate lip-syncing.
Upload standard audio clips (e.g., MP3, WAV) containing narration, dialogue, or any human voice. Wan 2.5 Preview uses this audio as the driver to animate the character's facial expressions and mouth movements from your image.
Wan 2.5 Preview supports generating videos up to 10 seconds long, ideal for short-form content, product narrations, and social media. It supports 480p, 720p, and 1080p HD resolutions, all easily accessible on Cuty.ai.
You can try Wan 2.5 Preview's groundbreaking lip-sync feature on Cuty.ai with our free trial credits. For generating longer videos, using 1080p resolution, and other premium features, you can upgrade to one of our subscription plans.