This is the official repository for the paper:
VLA-R1: Enhancing Reasoning in Vision-Language-Action Models
Angen Ye*, Zeyu Zhang*, Boyuan Wang, Xiaofeng Wang, Dapeng Zhang, and Zheng Zhu†
*Equal contribution. †Corresponding author.
ECCV 2026
Create and activate the conda environment:
conda env create -f environment.yml
conda activate vla_r1
Download the model and the dataset from Hugging Face:
huggingface-cli download --repo-type model --resume-download GigaAI-Research/vla-r1
huggingface-cli download --repo-type dataset --resume-download GigaAI-Research/vla_r1
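By default, `huggingface-cli download` stores files in the Hugging Face cache. If you would rather keep the model and dataset in a known directory, the sketch below wraps the same two commands with the `--local-dir` option. The `ASSET_DIR` and `DRY_RUN` variables and the `run` and `fetch_assets` helpers are hypothetical names used for illustration; they are not part of this repository.

```shell
# Hypothetical target directory; the repository does not define one.
ASSET_DIR="${ASSET_DIR:-./assets}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Download the model and the dataset into separate subdirectories.
fetch_assets() {
  run huggingface-cli download --repo-type model --resume-download \
    GigaAI-Research/vla-r1 --local-dir "$ASSET_DIR/model"
  run huggingface-cli download --repo-type dataset --resume-download \
    GigaAI-Research/vla_r1 --local-dir "$ASSET_DIR/dataset"
}
```

Calling `DRY_RUN=1 fetch_assets` prints the two commands without touching the network; calling `fetch_assets` on its own performs the downloads.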
Launch training with the provided script:
bash RFT_training/train_utils/run_vla_r1_3b.sh
Run the server and inference scripts:
python scripts/server.py
python scripts/inference.py
If you find our code or paper helpful, please consider starring ⭐ this repository and citing the paper:
@article{ye2025vlar1,
  title={VLA-R1: Enhancing Reasoning in Vision-Language-Action Models},
  author={Ye, Angen and Zhang, Zeyu and Wang, Boyuan and Wang, Xiaofeng and Zhang, Dapeng and Zhu, Zheng},
  journal={arXiv preprint arXiv:2510.01623},
  year={2025}
}