Robotics

DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation

Haozhe Xie*, Beichen Wen*, Jiarui Zheng, Zhaoxi Chen, Fangzhou Hong, Haiwen Diao, Ziwei Liu
S-Lab, Nanyang Technological University

TL;DR: DynamicVLA enables open-ended dynamic object manipulation by pairing a compact 0.4B VLM with low-latency Continuous Inference and Latent-aware Action Prediction, and is evaluated at scale on the new DOM benchmark in both simulation and the real world.

Highlights