Kling’s 3DiMo is a system for “3D-aware implicit motion control” that transfers motion from a reference video onto a new character. It achieves this while giving users camera control over the generated scene.
In demos, a source dance clip is mapped onto different humans, anime characters, and 3D avatars, while a virtual camera orbits, zooms, or tracks the action from above.
Unlike existing tools such as One-Animate or SCALE, which focus only on motion transfer, 3DiMo handles motion and camera view simultaneously. The generated videos maintain consistent character shape and environment even as the viewpoint changes.
This suggests the model builds an implicit 3D understanding of the scene. Examples show detailed hand and finger movements preserved, along with facial expressions, and even creative touches such as adding sakura branches to an upward pan in an anime scene. For now, only a technical paper and demos are available, with no indication that code or weights will be released.
If it eventually ships as a tool or API, 3DiMo could become a powerful option for music videos, VTubers, and virtual production pipelines.
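To make the idea concrete, here is a purely hypothetical sketch of what a request to such a service might look like. Kling has published no API, so every name and parameter below (`MotionTransferRequest`, `CameraKeyframe`, and so on) is invented for illustration; the only thing taken from the demos is the shape of the inputs: a motion source, a target character, and a camera trajectory.

```python
from dataclasses import dataclass, field

# Hypothetical request shape for a 3DiMo-style service. None of these
# names come from Kling; they only illustrate the three inputs the
# demos imply: a reference motion clip, a new character, and a
# user-specified camera path.

@dataclass
class CameraKeyframe:
    time_s: float            # timestamp in the output video
    orbit_deg: float = 0.0   # azimuth around the character
    elevation_deg: float = 0.0
    zoom: float = 1.0        # relative focal-length multiplier

@dataclass
class MotionTransferRequest:
    motion_video: str        # path/URL of the reference dance clip
    character_image: str     # the new character to animate
    camera_path: list[CameraKeyframe] = field(default_factory=list)

# Example: transfer a dance onto an anime character while the camera
# orbits 90 degrees, then tilts upward for an overhead pan.
request = MotionTransferRequest(
    motion_video="dance_reference.mp4",
    character_image="anime_character.png",
    camera_path=[
        CameraKeyframe(time_s=0.0),
        CameraKeyframe(time_s=4.0, orbit_deg=90.0),
        CameraKeyframe(time_s=6.0, orbit_deg=90.0, elevation_deg=30.0),
    ],
)
print(request)
```

The point of the sketch is that camera motion is a first-class input alongside the motion reference, rather than something baked into the source clip, which is what distinguishes 3DiMo from motion-transfer-only tools.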