Playful-RATs/RATs

RATs: Playful Agentic Robot Learning

Project Page  |  Paper  |  arXiv


RATs is a multi-agent Code-as-Policy system for lifelong robot skill learning. During free-form play, a team of LLM agents invents its own tasks, writes policies as code, and distills successful executions into a reusable skill library. At evaluation, those skills are reused as planner context. There are no gradient updates and no RL: all learning happens through structured natural-language feedback and code reuse. RATs runs in two stages:

| Stage | What it does |
| --- | --- |
| Play | Curiosity-driven skill acquisition: a proposer → planner → policy-writer → verifier → failure-diagnoser loop invents tasks, executes code-as-policy, and grows a skill library from what works. |
| Evaluation | Frozen learned skills are injected as planner context and compared head-to-head with CaP-X baselines, in two complementary settings: in-domain (LIBERO-PRO, MolmoSpaces) and cross-environment transfer (reusing the same skills in Robosuite and on a real Franka Panda). |
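
The Play stage can be pictured as a simple control loop. The following is a minimal sketch, not the repository's implementation: every name in it (`play`, `Skill`, and the five callables) is hypothetical, and the LLM agents are replaced by plain Python functions so that the control flow runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Skill:
    name: str
    code: str  # source of a policy that succeeded during play


def play(
    propose: Callable[[List[Skill]], str],
    plan: Callable[[str, List[Skill]], str],
    write_policy: Callable[[str, str], str],
    verify: Callable[[str, str], bool],
    diagnose: Callable[[str, str], str],
    iterations: int,
) -> List[Skill]:
    """Hypothetical play loop: propose, plan, write, verify, diagnose."""
    library: List[Skill] = []
    feedback = ""  # natural-language feedback carried into the next attempt
    for _ in range(iterations):
        task = propose(library)
        steps = plan(task, library)  # the library serves as planner context
        code = write_policy(steps, feedback)
        if verify(task, code):  # stands in for executing and checking the policy
            library.append(Skill(name=task, code=code))  # keep what works
            feedback = ""
        else:
            feedback = diagnose(task, code)  # text feedback, no gradients
    return library


# Stand-in agents (assumptions for illustration only).
tasks = iter(["pick cube", "open drawer", "pick cube and place in drawer"])
library = play(
    propose=lambda lib: next(tasks),
    plan=lambda task, lib: f"steps for {task} using {[s.name for s in lib]}",
    write_policy=lambda steps, fb: f"# {steps}\n# fix: {fb}",
    verify=lambda task, code: "drawer" in task,
    diagnose=lambda task, code: f"{task} failed: gripper missed the object",
    iterations=3,
)
print([s.name for s in library])
```

With these stand-ins the first task fails verification, so only the two drawer tasks end up in the library. The point of the sketch is that the only state carried between iterations is the library and a feedback string.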

RATs pipeline overview (video)


Installation

RATs uses uv for dependency management and runs on Python 3.10 with a CUDA-capable GPU. The full guide — cache/quota tuning, every submodule, the MolmoSpaces bridge, and the capx-baseline/ environments — is in docs/setup.md.

# Clone
git clone --branch main --depth 1 https://github.com/Playful-RATs/RATs rats
cd rats

# Root runtime submodules (LIBERO-PRO + MolmoSpaces)
git submodule update --init --depth 1 \
  rats/third_party/LIBERO-PRO \
  rats/third_party/libero_dependencies/robosuite \
  rats/third_party/robosuite \
  rats/third_party/contact_graspnet_pytorch \
  rats/third_party/curobo \
  rats/third_party/sam3

# Root env
uv venv .venv --python 3.10
source .venv/bin/activate
uv sync --frozen --active --extra libero --extra contactgraspnet

# LIBERO prompts for ~/.libero/config.yaml on first import in a fresh shell.
# Pre-create it to keep setup and batch runs non-interactive.
mkdir -p ~/.libero
cat > ~/.libero/config.yaml <<EOF
benchmark_root: $PWD/rats/third_party/LIBERO-PRO/libero/libero
bddl_files: $PWD/rats/third_party/LIBERO-PRO/libero/libero/./bddl_files
init_states: $PWD/rats/third_party/LIBERO-PRO/libero/libero/./init_files
datasets: $PWD/rats/third_party/LIBERO-PRO/libero/datasets
assets: $PWD/rats/third_party/LIBERO-PRO/libero/libero/./assets
EOF
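
Because the heredoc expands `$PWD` at write time, running it from the wrong directory silently records wrong paths. A small check such as the one below can catch that before a batch run. It is a sketch under stated assumptions: `parse_flat_yaml` and `missing_paths` are hypothetical helpers (not part of RATs or LIBERO), and the parser only handles the flat `key: value` layout written above. The `datasets` directory may legitimately be absent until datasets are downloaded.

```python
from pathlib import Path
from typing import Dict, List


def parse_flat_yaml(text: str) -> Dict[str, str]:
    """Parse flat `key: value` lines (not a general YAML parser)."""
    entries: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            entries[key.strip()] = value.strip()
    return entries


def missing_paths(entries: Dict[str, str]) -> List[str]:
    """Return the keys whose path does not exist on disk."""
    return [key for key, value in entries.items() if not Path(value).exists()]


if __name__ == "__main__":
    config = Path.home() / ".libero" / "config.yaml"
    if config.exists():
        for key in missing_paths(parse_flat_yaml(config.read_text())):
            print(f"missing: {key}")
    else:
        print(f"{config} not found; LIBERO will prompt on first import")
```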

The root .venv runs both LIBERO-PRO and MolmoSpaces. MolmoSpaces additionally needs a conda bridge env and tens of GB of assets. The CaP-X baselines, Robosuite transfer, and real-world transfer run from capx-baseline/ with their own environments. See docs/setup.md for details.

Quick Start

This section assumes the root .venv from Installation is active. Set a model provider and the runtime exports first:

export OPENAI_API_KEY="sk-..."       # and/or GEMINI_API_KEY / OPENROUTER_API_KEY
export PYTHONPATH="$PWD:${PYTHONPATH:-}"
export CAPX_ENV_STACK=libero
export MUJOCO_GL=egl
export PYOPENGL_PLATFORM=egl
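
A missing export usually surfaces late, for example as a rendering error halfway through a run. The sketch below checks the variables listed above before launching anything. `check_env` is a hypothetical helper, and the expected values are taken from the LIBERO quick start only; other environment stacks presumably use different settings.

```python
import os
from typing import List, Mapping

# At least one provider key is needed ("and/or" in the exports above).
PROVIDER_KEYS = ("OPENAI_API_KEY", "GEMINI_API_KEY", "OPENROUTER_API_KEY")
# Values used by the LIBERO quick start; an assumption for any other stack.
EXPECTED = {"CAPX_ENV_STACK": "libero", "MUJOCO_GL": "egl", "PYOPENGL_PLATFORM": "egl"}


def check_env(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems; an empty list means the env looks ready."""
    problems = []
    if not any(env.get(key) for key in PROVIDER_KEYS):
        problems.append("no provider key set (" + " / ".join(PROVIDER_KEYS) + ")")
    for name, want in EXPECTED.items():
        got = env.get(name)
        if got != want:
            problems.append(f"{name}={got!r}, expected {want!r}")
    return problems


for problem in check_env(os.environ):
    print(problem)
```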

1. Play — acquire skills

python scripts/run_rats.py \
  --config env_configs/libero/rats_libero_play.yaml \
  --explore --iterations 50 \
  --output-dir outputs/play_libero

MolmoSpaces play with the launcher-managed bridge is in docs/play.md.

2. Evaluate — in-domain (RATs vs. CaP-X)

The batch driver runs all six LIBERO-PRO suites (× 10 tasks × 5 trials) with the learned skill library frozen:

RATS_VERIFIER_STRICT_BENCHMARK=1 python scripts/run_rats_libero_pro_batch.py \
  --seed-skill-library outputs/play_libero/snapshots/iter050/skills.json \
  --extra-rats-flags "--model google/gemini-3.5-flash" \
  --output-dir outputs/rats_libero_pro_iter050seed \
  --gpus 0,1,2,3,4,6,7 --workers 18 --skip-completed
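
To get a feel for the size of this run, the sketch below works out the episode count and one possible spread of workers over GPUs. The round-robin assignment is an assumption made for illustration; this README does not say how `run_rats_libero_pro_batch.py` actually schedules its workers.

```python
from collections import Counter
from itertools import product

SUITES, TASKS, TRIALS = 6, 10, 5   # six suites x 10 tasks x 5 trials
gpus = [0, 1, 2, 3, 4, 6, 7]       # from --gpus in the command above
workers = 18                       # from --workers

# Every (suite, task, trial) triple is one evaluation episode.
episodes = list(product(range(SUITES), range(TASKS), range(TRIALS)))
print(len(episodes))  # 300

# Hypothetical round-robin placement of workers on the listed GPUs.
worker_gpu = {w: gpus[w % len(gpus)] for w in range(workers)}
per_gpu = Counter(worker_gpu.values())
print(dict(per_gpu))  # {0: 3, 1: 3, 2: 3, 3: 3, 4: 2, 6: 2, 7: 2}
```

Under that assumption each GPU hosts two or three workers, which is a useful number to compare against available GPU memory before starting.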

CaP-X baselines, the MolmoSpaces eval, and result summarization are in docs/evaluation.md.

3. Evaluate — cross-environment transfer

Reuse LIBERO-learned skills as planner context in Robosuite and on a real Franka — see docs/cross-environment.md.
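
Reusing skills "as planner context" means the frozen library is placed in the planner's prompt rather than retrained. The sketch below shows one way that could look. It is illustrative only: `build_planner_context` is a hypothetical function, and the `name` / `description` / `code` fields are an assumed layout, since the real `skills.json` schema is not described in this README.

```python
import json
from typing import Dict, List


def build_planner_context(skills: List[Dict[str, str]], task: str, max_skills: int = 3) -> str:
    """Render frozen skills as read-only context ahead of the task description."""
    lines = ["You may call these previously learned skills:"]
    for skill in skills[:max_skills]:
        lines.append(f"- {skill['name']}: {skill['description']}")
        lines.append(skill["code"].rstrip())
    lines.append(f"Task: {task}")
    return "\n".join(lines)


# Hypothetical library contents; the real skills.json layout may differ.
skills_json = json.dumps([
    {
        "name": "grasp_object",
        "description": "top-down grasp of a named object",
        "code": "def grasp_object(env, name):\n    ...",
    },
])
prompt = build_planner_context(json.loads(skills_json), "put the mug on the plate")
print(prompt)
```

Because the skills enter only as text, the same library can be handed to a planner in a different simulator or on real hardware without any change to the skills themselves.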


Documentation

| Guide | Contents |
| --- | --- |
| Setup | Full environment setup: prerequisites, cache/quota tuning, submodules, LIBERO-PRO / MolmoSpaces / CaP-X-baseline runtimes |
| Play | Skill acquisition in LIBERO-PRO and MolmoSpaces with Contact-GraspNet |
| Evaluation | RATs vs. CaP-X in LIBERO-PRO and MolmoSpaces; output artifacts |
| Cross-Environment Evaluation | Transferring learned skills to Robosuite and a real Franka Panda |

Citation

@article{rats2026playful,
  title   = {Playful Agentic Robot Learning},
  author  = {Zhang, Junyi and Ge, Jiaxin and Yoo, Hanjun and Fu, Letian and Yang, Zihan and Liu, Yaowei and Saravanan, Raj and Yin, Shaofeng and Yu, Justin and Niu, Dantong and Wang, Zirui and Herzig, Roei and Goldberg, Ken and Bai, Yutong and Chan, David M. and Stoica, Ion and Kanazawa, Angjoo and Lei, Jiahui and Feng, Haiwen and Darrell, Trevor},
  journal = {arXiv preprint arXiv:2606.19419},
  year    = {2026}
}

About

Implementation of the paper "Playful Agentic Robot Learning"
