r/LocalLLaMA

zai-org/SCAIL-2 · Hugging Face

SCAIL-2: Unifying Controlled Character Animation with End-to-end In-Context Conditioning

SCAIL-2 is an open-source model for end-to-end controlled character animation. It animates a reference character with a driving video, and also supports character replacement and multi-character scenarios without relying on intermediate pose representations.

Overview

Prior approaches to character animation depend heavily on intermediate representations such as skeleton maps or inpainting masks. These intermediates are ambiguous under complex motion, restrict driving sources to human movements, and limit the reach of replacement and multi-character animation.

SCAIL-2 removes this dependence and achieves end-to-end driving. Using several off-the-shelf models (SCAIL-Preview, Wan-Animate, MoCha), the authors synthesized 60K motion pairs and trained on them through a Unified Motion Transfer Interface with dedicated masking channels and a RoPE design. Combined with this unified interface, the reverse-driving training recipe lets the model learn capabilities beyond those of its teacher models, yielding emergent abilities such as:

  • Cross-identity character replacement
  • Animal-driving scenarios
  • Zero-shot support for advanced control intermediates like SAM3D-Body mesh rendering
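
The post does not spell out how the "dedicated masking channels" are laid out, so the following is only a rough sketch of the general idea behind in-context conditioning with role masks, not SCAIL-2's actual implementation. It assumes the reference, driving, and target frames are concatenated into one sequence and that each frame gets a one-hot role vector appended to its features. The role names, function names, and flat per-frame feature vectors are all hypothetical.

```python
from typing import List, Sequence

# Hypothetical role names; the real model's channel layout is not given in the post.
ROLES = ("reference", "driving", "target")


def role_mask_channels(n_reference: int, n_driving: int, n_target: int) -> List[List[int]]:
    """Return one one-hot role vector per frame of the concatenated sequence."""
    counts = (n_reference, n_driving, n_target)
    mask: List[List[int]] = []
    for role_index, count in enumerate(counts):
        one_hot = [1 if i == role_index else 0 for i in range(len(ROLES))]
        # Copy the vector for every frame that belongs to this role.
        mask.extend(list(one_hot) for _ in range(count))
    return mask


def attach_mask(frames: Sequence[Sequence[float]],
                mask: Sequence[Sequence[int]]) -> List[List[float]]:
    """Append each frame's role vector to its feature vector (channel concatenation)."""
    if len(frames) != len(mask):
        raise ValueError("frames and mask must have the same length")
    return [list(f) + [float(v) for v in r] for f, r in zip(frames, mask)]


if __name__ == "__main__":
    # 1 reference frame, 2 driving frames, 2 target frames, 2 features each.
    frames = [[0.5, 0.5] for _ in range(5)]
    tagged = attach_mask(frames, role_mask_channels(1, 2, 2))
    for row in tagged:
        print(row)
```

With a layout like this, the same network input can describe animation, replacement, or multi-character cases just by changing which frames carry which role, which is one plausible reading of what "unified" means here.
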
submitted by /u/pmttyji