I am a Senior Research Engineer at DiDi Autonomous Driving, working on Embodied AI and Vision-Language-Action (VLA) systems for robotic platforms that support autonomous vehicle operations. Previously, I was a Postdoctoral Fellow at the ETH AI Center, collaborating with Prof. Siyu Tang and Prof. Christian Holz, where I worked on human motion capture from egocentric videos and wearable IMUs. I received my Ph.D. in 2024 from Xiamen University, supervised by Prof. Cheng Wang and Prof. Chenglu Wen, focusing on 3D computer vision, 4D human motion capture, and human–scene interaction reconstruction.
My goal is to enable embodied agents to understand, reason about, and interact with the physical world, so that they can be deployed in real-world robotic and autonomous systems.

News!