Revolutionary AI System Generates Realistic Digital Avatars from Single Photos

Friday 07 March 2025


Creating realistic digital avatars is a longstanding challenge in computer graphics and animation. For years, researchers have worked on methods that generate high-quality, lifelike characters from single images or text descriptions. Recently, a team of scientists from Alibaba Group’s Tongyi Lab made significant progress in this area by introducing Make-A-Character 2, an advanced system for generating animatable 3D characters from single portrait photographs.


The new system builds upon its predecessor, Make-A-Character, which was released last year. While the initial version showed promise, it had limitations when it came to generating realistic facial expressions and animations. The updated Make-A-Character 2 addresses these issues by incorporating several significant improvements in image-based head generation.


One of the key advancements is a neural network-based color correction method that harmonizes skin tones between input photos and game engine renders. The aim is for the generated character’s appearance to match the original photo, including facial features, hair, and clothing. The system also employs a hierarchical representation network to capture high-frequency facial structures, resulting in more detailed and realistic facial animations.
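To make the idea of harmonizing skin tones concrete, here is a minimal sketch that uses a much simpler stand-in than the neural network described above: per-channel mean and spread matching between skin pixels sampled from the render and from the photo. The function name, the pixel lists, and the 0 to 255 value range are all assumptions for illustration, not details from the paper.

```python
import statistics

def match_skin_tone(render_pixels, photo_pixels):
    """Shift and scale each RGB channel of the render's skin pixels so that
    its mean and spread match the photo's skin pixels.

    Simple statistics-matching stand-in, NOT the neural method in the paper.
    Both arguments are lists of (r, g, b) tuples with values in 0..255.
    """
    corrected_channels = []
    for channel in range(3):
        src = [p[channel] for p in render_pixels]
        ref = [p[channel] for p in photo_pixels]
        src_mean, ref_mean = statistics.fmean(src), statistics.fmean(ref)
        src_std, ref_std = statistics.pstdev(src), statistics.pstdev(ref)
        # Avoid dividing by zero when the render channel is perfectly flat.
        gain = ref_std / src_std if src_std > 0 else 1.0
        corrected_channels.append(
            [min(255.0, max(0.0, (v - src_mean) * gain + ref_mean)) for v in src]
        )
    # Re-assemble per-channel lists back into (r, g, b) tuples.
    return list(zip(*corrected_channels))

render = [(100, 80, 60), (120, 100, 80)]
photo = [(200, 150, 100), (220, 170, 120)]
corrected = match_skin_tone(render, photo)
```

A learned correction can model effects this linear shift cannot, such as lighting-dependent changes, which is presumably why the authors use a network instead.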


Another major improvement is the introduction of adaptive skeleton calibration for accurate and expressive facial animations. This allows the generated character to exhibit a range of emotions and reactions, from subtle expressions to dramatic gestures. To further enhance realism, the system can also generate co-speech facial and gesture actions, enabling real-time conversations with the animated avatar.


The team behind Make-A-Character 2 has also developed a novel method for generating hairstyles directly from portrait images. This involves using a specialized hairstyle classification model powered by convolutional neural networks to map a portrait image to a specific hairstyle within an asset library. The resulting character models are complete, production-ready assets compatible with modern CG pipelines.
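The mapping step described above, from classifier output to a library asset, can be sketched as follows. This is only an illustration: the asset names, the example scores, and the function names are hypothetical, and the convolutional network that would produce the scores is left out.

```python
import math

# Hypothetical asset library: class index -> hairstyle asset name.
HAIR_ASSETS = ["short_crop", "long_straight", "curly_bob", "ponytail"]

def softmax(logits):
    """Turn raw classifier scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pick_hairstyle(logits, assets=HAIR_ASSETS):
    """Return the asset with the highest predicted probability."""
    if len(logits) != len(assets):
        raise ValueError("expected one score per hairstyle asset")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return assets[best], probs[best]

# Scores a classifier might emit for one portrait (made-up values).
name, confidence = pick_hairstyle([0.1, 2.0, -1.0, 0.5])
```

Because the output is always an entry from a fixed library, the result is a clean, ready-made asset, at the cost of never producing a hairstyle that is not already in the library.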


When it comes to animation, Make-A-Character 2 can generate holistic full-body motions that are synchronized with speech audio. For the face, this is achieved through a non-autoregressive architecture that directly regresses facial animation weights from audio features. The system’s ability to produce realistic lip-sync and gesture animations makes it well suited to applications such as gaming, virtual reality, and animation.
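What "non-autoregressive" means here can be shown with a toy example: every output frame is computed from the audio alone, never from previously predicted frames, so all frames could be produced in parallel. The sketch below uses a plain linear map over a small window of audio features. The function name, the window size, and the weights are assumptions for illustration, and the real system uses a learned network rather than a hand-written matrix.

```python
def regress_weights(audio_frames, weight_matrix, bias, context=1):
    """Map each audio frame to facial animation weights in a single pass.

    audio_frames: list of per-frame feature lists.
    weight_matrix: one row per output weight; each row spans the flattened
        window of (2 * context + 1) audio frames.
    No output frame depends on an earlier output frame.
    """
    n = len(audio_frames)
    outputs = []
    for t in range(n):
        # Gather neighbouring audio frames, repeating the edge frames.
        window = []
        for k in range(t - context, t + context + 1):
            window.extend(audio_frames[min(max(k, 0), n - 1)])
        frame = []
        for row, b in zip(weight_matrix, bias):
            value = sum(w * x for w, x in zip(row, window)) + b
            frame.append(min(1.0, max(0.0, value)))  # keep weights in [0, 1]
        outputs.append(frame)
    return outputs

audio = [[0.0], [1.0], [0.0]]                     # one feature per frame
matrix = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5]]       # two output weights
weights = regress_weights(audio, matrix, [0.0, 0.0])
```

An autoregressive model would instead feed each predicted frame back in as input for the next one, which forces frame-by-frame generation.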


While Make-A-Character 2 has many impressive capabilities, it’s not without its limitations. For instance, the system may struggle with input photos that do not have a neutral expression, potentially introducing artifacts in facial rig results.


Cite this article: “Revolutionary AI System Generates Realistic Digital Avatars from Single Photos”, The Science Archive, 2025.


Computer Graphics, Animation, Digital Avatars, Neural Networks, Portrait Photographs, Facial Expressions, Animations, Game Engine, Character Models, Convolutional Neural Networks


Reference: Lin Liu, Yutong Wang, Jiahao Chen, Jianfang Li, Tangli Xue, Longlong Li, Jianqiang Ren, Liefeng Bo, “Make-A-Character 2: Animatable 3D Character Generation From a Single Image” (2025).

