Zhijian Shu 舒智健

Yanhong Zeng is a Research Scientist & Engineer at Ant Group, specializing in efficient generative systems. Previously at Shanghai AI Lab, she served as the Lead Core Maintainer of MMagic. Her work bridges the gap between research and production, developing high-quality, controllable, and scalable multi-modal models, with a current focus on world models and streaming video generation.

💗 Hiring: looking for self-motivated interns to work on Generative AI!

Email  /  Google Scholar  /  Twitter  /  GitHub  /  LinkedIn


News

  • [2026.02] 🎉 LiteVGGT is accepted by CVPR 2026!



Selected Publications

LiteVGGT: Boosting Vanilla VGGT via Geometry-aware Cached Token Merging
Zhijian Shu, Zhijian Shu†, Haobo Li, Hao Ouyang, Qiuyu Wang, Ka Leong Cheng, Jiapeng Zhu, Hengyuan Cao, Zhipeng Zhang, Xing Zhu, Yujun Shen, Min Zhang
CVPR, 2026
project page / arXiv / code

Reward Forcing is a new real-time streaming video generation framework with novel memory design and a rewarded distribution matching distillation method for better dynamic generation.




Working Experience

Ant Group

Researcher, 2025.04 ~ present

Shanghai AI Laboratory

Researcher, 2022.07 ~ 2025.03

Microsoft Research Asia (MSRA)

Research Intern, 2018.06 ~ 2021.12

Research Intern, 2016.06 ~ 2017.06

Projects

CCTV Animation Production: "Poems of Timeless Acclaim"

Tech Lead

Designed and delivered an end-to-end AI animation pipeline for a national-scale production. The series was broadcast in 10+ languages across 70+ platforms and amassed 100M+ views.

MagicMaker

Product Owner & Tech Lead

MagicMaker is a user-friendly AI platform that enables seamless image generation, editing, and animation. It empowers users to transform their imagination into captivating cinema and animations with ease.

OpenMMLab/MMagic

Lead Core Maintainer

OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image generation, image/video restoration/enhancement, etc.

Miscellanea


The website template was adapted from Jon Barron.