Tencent has introduced InteractAvatar, an AI avatar system that goes beyond lip-syncing to let digital humans pick up, move, and interact with objects in a scene from a simple text prompt. In demos, characters can put on headphones, check a smartphone, lift a bag, or gently touch a plush toy while speaking, with natural hand and body motion.
The system understands complex multi-step instructions and timing: creators can chain actions with explicit timestamps, such as touching an apple from 0–4 seconds, moving it between 12–16 seconds, then picking it up at 16–20 seconds. It also handles a wide range of gestures, including OK signs, thumbs up, arm crossing, heart shapes, and clapping, and it tracks hand poses accurately.
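InteractAvatar's actual prompt format isn't documented in this piece, but the timestamped chaining described above can be pictured as a simple action schedule. The sketch below is purely illustrative: the `TimedAction` structure and the rendered prompt string are assumptions, not the system's real API.

```python
# Hypothetical sketch of a timestamped action schedule, mirroring the
# article's example (touch 0-4s, move 12-16s, pick up 16-20s).
# The data structure and prompt format are illustrative assumptions,
# not InteractAvatar's documented interface.
from dataclasses import dataclass


@dataclass
class TimedAction:
    start: float  # seconds into the clip
    end: float
    description: str


def build_prompt(actions):
    """Sort the schedule, reject overlapping actions, and render one prompt."""
    ordered = sorted(actions, key=lambda a: a.start)
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.start < prev.end:
            raise ValueError(
                f"overlapping actions: {prev.description!r} and {cur.description!r}"
            )
    return "; ".join(
        f"{a.description} from {a.start:g}s to {a.end:g}s" for a in ordered
    )


schedule = [
    TimedAction(0, 4, "touch the apple"),
    TimedAction(12, 16, "move the apple"),
    TimedAction(16, 20, "pick up the apple"),
]
print(build_prompt(schedule))
# → touch the apple from 0s to 4s; move the apple from 12s to 16s; pick up the apple from 16s to 20s
```

The validation step reflects the implicit constraint in the demo: chained actions occupy distinct, ordered time windows, with back-to-back actions (16s ending, 16s starting) allowed.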
Compared to other animation tools like OmniAvatar, InteractAvatar is currently the only one shown manipulating scene objects instead of just animating a talking head or idle body. Under the hood it builds on the 1.2.2 base video model, with a public project page and GitHub repository that includes local setup instructions.
For creators and brands, this opens the door to more interactive spokesperson videos and product demos, without having to keyframe every hand movement.