GR00T whole-body VLA: train SONIC controller

Disclosure: This article may contain affiliate/referral links. If you buy or sign up through those links, VnRobo may receive a commission or credit.

The previous three parts focused on GR00T-LeRobot data for the VLA. The NVlabs/GR00T-WholeBodyControl repo contains another very important layer, though: the SONIC whole-body controller. In the G1/GEAR-SONIC workflow, the VLA does not drive each motor directly. The VLA typically outputs actions/latents at a higher level, and the SONIC/C++ deployment stack turns them into stable whole-body motion.

This part covers SONIC's data and training pipeline:

Bones-SEED / robot motion / SMPL / SOMA
  -> convert motion data
  -> filter impossible motions
  -> PPO training in Isaac Lab
  -> evaluate tracking metrics
  -> export ONNX
  -> C++ deploy stack

4.1 Goals

After this part, you should be able to:

Prepare motion-reference data for SONIC.
Run a local debug training job.
Understand when you need 8 GPUs, 64+ GPUs, or just an existing checkpoint.
Evaluate a checkpoint with tracking metrics.
Export ONNX for C++ deployment.
Recognize common errors: Git LFS, Isaac Lab env, body name mismatch, motion file path, TensorRT.

Key distinctions:

Layer	Data	Goal
GR00T VLA	GR00T-LeRobot: image/state/action/language	Learn a policy from observation + instruction.
SONIC WBC	motion_lib PKL, SMPL, SOMA, G1 trajectories	Learn a controller/motion foundation for whole-body movement.
C++ deploy	ONNX/TensorRT + ZMQ + robot state	Run in real time in sim or on the real robot.

When do you need to train SONIC yourself?

Not every project needs to train SONIC itself. For manipulation tasks on the Unitree G1 that follow the official workflow, the faster path is usually:

real/sim/public LeRobot data
  -> fine-tune GR00T VLA with UNITREE_G1_SONIC
  -> use the released SONIC checkpoint/deploy stack
  -> run inference with PolicyServer + launch_inference.py

Train or fine-tune SONIC yourself only when:

You need a controller for an embodiment other than the G1, or for hardware whose joint/body layout has changed.
You need a new motion skill that the released checkpoint does not cover well enough.
You want to research controller foundations, motion tracking, or the SOMA/SMPL encoders.
You need your own ONNX/TensorRT export to integrate with an internal C++ deployment.

If your main goal is to train a VLA from sim/real data, finish Parts 1-3 first. Part 4 covers the lower-level controller layer, which costs far more compute.

4.2 Environment and hardware

According to the GR00T-WholeBodyControl training docs, SONIC training uses Isaac Lab and PPO with a large number of parallel environments. The repo recommends 64+ GPUs for a reasonable convergence time; a single 8-GPU node works but is significantly slower. That does not mean you need 64 GPUs to learn the pipeline. For debugging:

Goal	Practical GPU/VRAM
Load env + replay motion	1 GPU, 16-24 GB
Debug training, 100 iterations	1 GPU, 24-48 GB
Single-node training	8 GPUs, the more VRAM the better
Serious convergence	64+ GPUs per the docs
Export ONNX	1 GPU or CPU, depending on checkpoint/env

Set up the repo:

sudo apt update
sudo apt install -y git git-lfs
git lfs install

git clone https://github.com/NVlabs/GR00T-WholeBodyControl.git
cd GR00T-WholeBodyControl
git lfs pull

python check_environment.py
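A quick way to confirm that `git lfs pull` fetched real assets is to look for leftover pointer files. The sketch below relies only on the standard Git LFS pointer format, whose first line starts with `version https://git-lfs.github.com/spec/v1`. The helper names are illustrative and are not part of the repo.

```python
from pathlib import Path

# First line of every Git LFS pointer file (Git LFS pointer format, spec v1).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """Return True if the file still looks like an un-fetched Git LFS pointer."""
    # Pointer files are tiny text files, so reading the first bytes is enough.
    with path.open("rb") as handle:
        return handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_lfs_pointers(root: Path) -> list[Path]:
    """List files under root that are still LFS pointers, skipping .git."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and ".git" not in p.parts and is_lfs_pointer(p)
    )
```

Calling `find_lfs_pointers(Path("."))` from the repo root should return an empty list once the LFS assets are in place; any path it returns is a mesh or checkpoint that is still a small text stub.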

Install the training dependencies:

# Isaac Lab must be installed separately, following the official docs.
# Then install the SONIC training package.
pip install -e "gear_sonic/[training]"

Download checkpoints/sample data:

pip install huggingface_hub
python download_from_hf.py --training

Expected directory layout:

GR00T-WholeBodyControl/
├── gear_sonic/
├── gear_sonic_deploy/
├── data/
│   ├── motion_lib_bones_seed/
│   │   ├── robot/
│   │   └── robot_filtered/
│   └── smpl_filtered/
├── sample_data/
│   ├── robot_filtered/
│   └── smpl_filtered/
└── logs_rl/
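The layout above can be checked mechanically before a run. This is a minimal sketch: the directory list is copied from the tree above, `missing_dirs` is a hypothetical helper, and `logs_rl/` is left out on the assumption that training creates it.

```python
from pathlib import Path

# Directories from the expected layout, relative to the repo root.
EXPECTED_DIRS = [
    "gear_sonic",
    "gear_sonic_deploy",
    "data/motion_lib_bones_seed/robot",
    "data/motion_lib_bones_seed/robot_filtered",
    "data/smpl_filtered",
    "sample_data/robot_filtered",
    "sample_data/smpl_filtered",
]


def missing_dirs(repo_root: Path, expected=EXPECTED_DIRS) -> list[str]:
    """Return the expected sub-directories that do not exist under repo_root."""
    return [rel for rel in expected if not (repo_root / rel).is_dir()]
```

`missing_dirs(Path("."))` run from the repo root lists whatever is still absent. The `data/motion_lib_bones_seed/` entries are expected to be missing until the conversion and filter steps in 4.3 have run.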

4.3 Preparing motion data

Convert Bones-SEED CSV to motion_lib

BONES-SEED is a large motion dataset that, according to the README/docs, already includes G1 MuJoCo trajectories. SONIC needs motion data in the motion_lib PKL format.

python gear_sonic/data_process/convert_soma_csv_to_motion_lib.py \
  --input /path/to/bones_seed/g1/csv/ \
  --output data/motion_lib_bones_seed/robot \
  --fps 30 \
  --fps_source 120 \
  --individual \
  --num_workers 16
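With `--fps_source 120` and `--fps 30`, the output rate is one quarter of the source rate. The sketch below illustrates that arithmetic under the assumption of plain stride-based decimation; the actual converter may resample differently (for example with interpolation), and both function names are hypothetical.

```python
def downsample_stride(fps_source: int, fps_target: int) -> int:
    """Integer frame stride needed to go from fps_source to fps_target."""
    if fps_target <= 0 or fps_source % fps_target != 0:
        raise ValueError("fps_source must be a positive multiple of fps_target")
    return fps_source // fps_target


def decimate(frames: list, fps_source: int, fps_target: int) -> list:
    """Keep every stride-th frame, starting with the first one."""
    return frames[:: downsample_stride(fps_source, fps_target)]


frames_120 = list(range(240))            # 2 seconds of placeholder frames at 120 fps
frames_30 = decimate(frames_120, 120, 30)
print(downsample_stride(120, 30))        # 4
print(len(frames_30))                    # 60 frames, i.e. 2 seconds at 30 fps
```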

Filter motions unsuitable for the G1

The docs say the filter drops motions the G1 cannot perform, such as interactions with furniture, vehicles, acrobatics, and elevated surfaces. According to the docs, the filter removes about 8.7% of the motions, leaving roughly 130K of 142K.

python gear_sonic/data_process/filter_and_copy_bones_data.py \
  --source data/motion_lib_bones_seed/robot \
  --dest data/motion_lib_bones_seed/robot_filtered \
  --workers 16
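As a sanity check on the numbers quoted from the docs, removing about 8.7% of roughly 142K motions does leave close to 130K. Both inputs are the rounded figures from the text, so the result is only approximate.

```python
total_motions = 142_000      # approximate figure quoted from the docs
removed_fraction = 0.087     # approximate figure quoted from the docs

removed = round(total_motions * removed_fraction)
kept = total_motions - removed
print(removed, kept)         # 12354 129646
```

If the filtered directory ends up with a very different count, the source path or the keyword list is worth a second look.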

Dry run before deleting/copying:

python gear_sonic/data_process/filter_and_copy_bones_data.py \
  --source data/motion_lib_bones_seed/robot \
  --dest /tmp/robot_filtered_preview \
  --workers 16 \
  --dry-run

Add domain-specific keyword filters:

python gear_sonic/data_process/filter_and_copy_bones_data.py \
  --source data/motion_lib_bones_seed/robot \
  --dest data/motion_lib_bones_seed/robot_filtered \
  --workers 16 \
  --add-keywords "jump_from_table" "cartwheel"
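To make the idea of a keyword filter concrete, here is a minimal sketch of how such a filter might partition motion files. The matching rule (case-insensitive substring match on the file stem) and all names are assumptions for illustration; this is not the implementation in `filter_and_copy_bones_data.py`.

```python
from pathlib import Path


def is_excluded(name: str, keywords: list[str]) -> bool:
    """True if any keyword occurs in the motion name (case-insensitive)."""
    lowered = name.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


def split_by_keywords(paths: list[Path], keywords: list[str]):
    """Partition motion files into (kept, dropped) by keyword match on the stem."""
    kept, dropped = [], []
    for path in paths:
        (dropped if is_excluded(path.stem, keywords) else kept).append(path)
    return kept, dropped


# Hypothetical file names, only to exercise the helper.
motions = [Path("walk_forward_01.pkl"), Path("Cartwheel_left.pkl"),
           Path("jump_from_table_02.pkl")]
kept, dropped = split_by_keywords(motions, ["jump_from_table", "cartwheel"])
print([p.name for p in kept])       # ['walk_forward_01.pkl']
```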

SOMA encoder when using `sonic_bones_seed`

The default sonic_release config uses G1, teleop, and SMPL. The sonic_bones_seed config adds a SOMA encoder. If you need SOMA:

python gear_sonic/data_process/extract_soma_joints_from_bvh.py \
  --input /path/to/bones_seed/bvh/ \
  --output data/motion_lib_bones_seed/soma \
  --fps 30 \
  --num_workers 16 \
  --skip_existing

python gear_sonic/data_process/filter_and_copy_bones_data.py \
  --source data/motion_lib_bones_seed/soma \
  --dest data/motion_lib_bones_seed/soma_filtered \
  --workers 16

Data layout for the 4-encoder setup:

data/
├── motion_lib_bones_seed/
│   ├── robot_filtered/
│   └── soma_filtered/
└── smpl_filtered/

4.4 Train / fine-tune SONIC

Debug run with sample data

Run a small job to confirm the env and config:

python gear_sonic/train_agent_trl.py \
  +exp=manager/universal_token/all_modes/sonic_release \
  num_envs=16 \
  headless=True \
  ++manager_env.commands.motion.motion_lib_cfg.motion_file=sample_data/robot_filtered \
  ++manager_env.commands.motion.motion_lib_cfg.smpl_motion_file=sample_data/smpl_filtered \
  ++algo.config.num_learning_iterations=100
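The command above relies on Hydra's override syntax: `+key=value` adds a key that is not yet in the config, `++key=value` adds the key or overrides it if it already exists, and `~key` removes it. The sketch below assembles the same argument list in Python and checks the motion directories first, since an empty or mistyped path is a common failure. The helper names are hypothetical and nothing is launched.

```python
import sys
from pathlib import Path

MOTION_CFG = "manager_env.commands.motion.motion_lib_cfg"


def build_debug_command(motion_dir: str, smpl_dir: str, iterations: int = 100) -> list[str]:
    """Assemble the argv for the debug run shown above."""
    return [
        sys.executable, "gear_sonic/train_agent_trl.py",
        "+exp=manager/universal_token/all_modes/sonic_release",
        "num_envs=16",
        "headless=True",
        f"++{MOTION_CFG}.motion_file={motion_dir}",
        f"++{MOTION_CFG}.smpl_motion_file={smpl_dir}",
        f"++algo.config.num_learning_iterations={iterations}",
    ]


def empty_or_missing(paths: list[str]) -> list[str]:
    """Return the paths that are not directories or contain no entries."""
    return [p for p in paths if not Path(p).is_dir() or not any(Path(p).iterdir())]


command = build_debug_command("sample_data/robot_filtered", "sample_data/smpl_filtered")
print(" ".join(command))
```

A caller could run `empty_or_missing([...])` on both directories and only hand `command` to `subprocess.run` when the returned list is empty.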

To watch the render:

python gear_sonic/train_agent_trl.py \
  +exp=manager/universal_token/all_modes/sonic_release \
  num_envs=16 \
  headless=False \
  ++algo.config.num_learning_iterations=100

Full training or fine-tuning from a checkpoint

Train from the motion library:

python gear_sonic/train_agent_trl.py \
  +exp=manager/universal_token/all_modes/sonic_release \
  num_envs=4096 \
  headless=True \
  ++manager_env.commands.motion.motion_lib_cfg.motion_file=data/motion_lib_bones_seed/robot_filtered \
  ++manager_env.commands.motion.motion_lib_cfg.smpl_motion_file=data/smpl_filtered

Fine-tune from the released checkpoint:

python gear_sonic/train_agent_trl.py \
  +exp=manager/universal_token/all_modes/sonic_release \
  +checkpoint=sonic_release/last.pt \
  num_envs=4096 \
  headless=True \
  ++manager_env.commands.motion.motion_lib_cfg.motion_file=data/motion_lib_bones_seed/robot_filtered \
  ++manager_env.commands.motion.motion_lib_cfg.smpl_motion_file=data/smpl_filtered

Single-node 8 GPU:

accelerate launch --num_processes=8 gear_sonic/train_agent_trl.py \
  +exp=manager/universal_token/all_modes/sonic_release \
  num_envs=4096 \
  headless=True

Multi-node, 64 processes:

accelerate launch \
  --multi_gpu \
  --num_machines=8 \
  --num_processes=64 \
  --machine_rank=$MACHINE_RANK \
  --main_process_ip=$MASTER_ADDR \
  --main_process_port=$MASTER_PORT \
  gear_sonic/train_agent_trl.py \
  +exp=manager/universal_token/all_modes/sonic_release \
  num_envs=4096 \
  headless=True
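In `accelerate launch`, `--num_processes` is the total across all machines, so 64 processes over 8 machines means 8 per machine. The sketch below spells out that arithmetic. Whether `num_envs=4096` is counted per process or globally is not stated here, so the total-environment figure is shown only under the explicit assumption that it is per process.

```python
def processes_per_machine(num_processes: int, num_machines: int) -> int:
    """Processes (GPUs) each machine runs when the total is split evenly."""
    if num_processes % num_machines != 0:
        raise ValueError("num_processes should divide evenly across machines")
    return num_processes // num_machines


def total_envs(num_envs: int, num_processes: int) -> int:
    """Total parallel envs IF num_envs is per process (an assumption)."""
    return num_envs * num_processes


print(processes_per_machine(64, 8))   # 8
print(total_envs(4096, 64))           # 262144, under the per-process assumption
```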

Offline W&B:

WANDB_MODE=offline python gear_sonic/train_agent_trl.py \
  +exp=manager/universal_token/all_modes/sonic_release \
  num_envs=16 headless=True

Disable W&B:

python gear_sonic/train_agent_trl.py \
  +exp=manager/universal_token/all_modes/sonic_release \
  use_wandb=false \
  num_envs=16 headless=True

Sample output:

logs_rl/TRL_G1_Track/<experiment_name>-<timestamp>/
├── model_step_002000.pt
├── model_step_004000.pt
├── config.yaml
└── wandb/
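Evaluation and export both take a checkpoint path, so it helps to pick the newest one from a run directory. This sketch assumes only the `model_step_<number>.pt` naming shown in the sample output; the function names are illustrative.

```python
import re
from pathlib import Path
from typing import Optional

STEP_PATTERN = re.compile(r"model_step_(\d+)\.pt$")


def checkpoint_step(path: Path) -> Optional[int]:
    """Extract the step number from a checkpoint file name, or None."""
    match = STEP_PATTERN.search(path.name)
    return int(match.group(1)) if match else None


def latest_checkpoint(run_dir: Path) -> Optional[Path]:
    """Return the checkpoint with the highest step number in run_dir."""
    candidates = [(checkpoint_step(p), p) for p in run_dir.glob("model_step_*.pt")]
    candidates = [(step, p) for step, p in candidates if step is not None]
    return max(candidates, key=lambda item: item[0])[1] if candidates else None


print(checkpoint_step(Path("model_step_004000.pt")))   # 4000
```

Comparing the parsed integers rather than the file names keeps the ordering correct even if the zero padding ever changes.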

4.5 Monitoring and evaluation

Metrics to watch:

Metric	Good level per docs	Meaning
`rewards/total`	3.0+	Total reward.
`rewards/anchor_pos_err`	< 0.15 m	Root position tracking error.
`rewards/body_pos_err`	< 0.10 m	Body position error.
`throughput/fps`	about 4000+	Training speed.
`success_rate`	> 0.97	Motions that are not terminated early.
`mpjpe_l`	< 30 mm	Local per-joint position error.
`mpjpe_g`	< 200 mm	Global per-joint position error.
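Two things in this table are easy to sketch. First, MPJPE is the mean Euclidean distance between predicted and reference joint positions; the local variant is computed here after subtracting the root joint, which is a common convention but an assumption about how the repo defines `mpjpe_l`. Second, the thresholds can be applied to a dictionary of logged values. All names and sample numbers below are made up for illustration.

```python
import math
import operator

Point = tuple[float, float, float]


def mpjpe(pred: list[Point], ref: list[Point]) -> float:
    """Mean Euclidean distance between matching joints (same units as the input)."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("pred and ref must be non-empty and the same length")
    return sum(math.dist(p, r) for p, r in zip(pred, ref)) / len(pred)


def root_relative(joints: list[Point], root_index: int = 0) -> list[Point]:
    """Express every joint relative to the root joint."""
    rx, ry, rz = joints[root_index]
    return [(x - rx, y - ry, z - rz) for x, y, z in joints]


# (comparison, threshold) per metric, copied from the table.
# Units: metres for rewards/*_err, millimetres for mpjpe_*.
THRESHOLDS = {
    "rewards/total": (operator.ge, 3.0),
    "rewards/anchor_pos_err": (operator.lt, 0.15),
    "rewards/body_pos_err": (operator.lt, 0.10),
    "throughput/fps": (operator.ge, 4000),
    "success_rate": (operator.gt, 0.97),
    "mpjpe_l": (operator.lt, 30.0),
    "mpjpe_g": (operator.lt, 200.0),
}


def failing_metrics(metrics: dict[str, float]) -> list[str]:
    """Names of reported metrics that miss their threshold; absent ones are skipped."""
    return [
        name for name, (compare, limit) in THRESHOLDS.items()
        if name in metrics and not compare(metrics[name], limit)
    ]


# Two joints in metres; the prediction is the reference shifted 0.1 m along x.
ref = [(0.0, 0.0, 1.0), (0.0, 0.2, 1.0)]
pred = [(0.1, 0.0, 1.0), (0.1, 0.2, 1.0)]
global_mm = mpjpe(pred, ref) * 1000                                # about 100 mm
local_mm = mpjpe(root_relative(pred), root_relative(ref)) * 1000   # 0.0 mm
print(failing_metrics({"success_rate": 0.95, "mpjpe_l": local_mm, "mpjpe_g": global_mm}))
```

The toy example shows why the two MPJPE variants differ: a pure root offset inflates the global error while the local error stays at zero, so a large `mpjpe_g` with a small `mpjpe_l` points at root drift rather than bad poses.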

Replay the reference motions to check the data:

python gear_sonic/train_agent_trl.py \
  +exp=manager/universal_token/all_modes/sonic_release \
  ++replay=True \
  num_envs=4 \
  headless=False

Evaluate a checkpoint with metrics:

python gear_sonic/eval_agent_trl.py \
  +checkpoint=/path/to/model_step_100000.pt \
  +headless=True \
  ++eval_callbacks=im_eval \
  ++run_eval_loop=False \
  ++num_envs=128 \
  "+manager_env/terminations=tracking/eval" \
  "++manager_env.commands.motion.motion_lib_cfg.max_unique_motions=512"

Render video:

python gear_sonic/eval_agent_trl.py \
  +checkpoint=/path/to/model_step_100000.pt \
  +headless=True \
  ++eval_callbacks=im_eval \
  ++run_eval_loop=False \
  ++num_envs=8 \
  ++manager_env.config.render_results=True \
  "++manager_env.config.save_rendering_dir=/tmp/renders" \
  ++manager_env.config.env_spacing=10.0 \
  "~manager_env/recorders=empty" "+manager_env/recorders=render"

If you evaluate a released checkpoint whose config has embedded internal paths, the docs require overriding the motion path:

"++manager_env.commands.motion.motion_lib_cfg.motion_file=data/motion_lib_bones_seed/robot_filtered"

4.6 Export ONNX for C++ deployment

Export:

python gear_sonic/eval_agent_trl.py \
  +checkpoint=/path/to/model_step_100000.pt \
  +headless=True \
  ++num_envs=1 \
  +export_onnx_only=true

Output goes to exported/ next to the checkpoint:

exported/
├── *_smpl.onnx
├── *_g1.onnx
├── *_teleop.onnx
├── *_encoder.onnx
└── *_decoder.onnx

Choose the ONNX file by input modality:

ONNX	Use when
`*_smpl.onnx`	Input is SMPL pose estimation.
`*_g1.onnx`	Input is robot joint/reference.
`*_teleop.onnx`	Input is VR teleop tracking.
`*_encoder.onnx` + `*_decoder.onnx`	Integrating into your own pipeline.
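A small check after export can confirm that all five files exist and pick the one matching the input modality. The suffixes come from the listing above; the file-name prefix depends on the checkpoint, so the sketch matches on suffix only. The helper names are hypothetical.

```python
from pathlib import Path

# Suffixes from the export listing; the "*" prefix depends on the checkpoint.
MODALITY_SUFFIX = {"smpl": "_smpl.onnx", "g1": "_g1.onnx", "teleop": "_teleop.onnx"}
ALL_SUFFIXES = [*MODALITY_SUFFIX.values(), "_encoder.onnx", "_decoder.onnx"]


def missing_exports(export_dir: Path) -> list[str]:
    """Return the expected suffixes that no .onnx file in export_dir ends with."""
    names = [p.name for p in export_dir.glob("*.onnx")]
    return [s for s in ALL_SUFFIXES if not any(n.endswith(s) for n in names)]


def onnx_for_modality(export_dir: Path, modality: str) -> Path:
    """Return the single ONNX file for 'smpl', 'g1', or 'teleop'."""
    suffix = MODALITY_SUFFIX[modality]
    matches = sorted(export_dir.glob(f"*{suffix}"))
    if len(matches) != 1:
        raise FileNotFoundError(
            f"expected one *{suffix} in {export_dir}, found {len(matches)}"
        )
    return matches[0]
```

For example, `onnx_for_modality(Path("exported"), "teleop")` would return the teleop model, and `missing_exports(Path("exported"))` should be empty after a complete export.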

After export, the C++ deployment may need a TensorRT engine build. If TensorRT fails, check TensorRT_ROOT, the CUDA version, and the deployment docs.
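Before digging into a failed engine build, it is cheap to verify that `TensorRT_ROOT` is set and points at a real directory. The sketch checks only that; it does not validate the TensorRT or CUDA versions, and the function name is hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def tensorrt_root_problem(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a description of what is wrong with TensorRT_ROOT, or None if it looks usable."""
    env = os.environ if env is None else env
    root = env.get("TensorRT_ROOT")
    if not root:
        return "TensorRT_ROOT is not set"
    if not Path(root).is_dir():
        return f"TensorRT_ROOT points to a missing directory: {root}"
    return None


print(tensorrt_root_problem() or "TensorRT_ROOT looks usable")
```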

Common errors and fixes

Error	Cause	Fix
Mesh/checkpoint is a very small text file	Git LFS was not used	`git lfs install && git lfs pull`.
`ModuleNotFoundError: isaaclab`	Not running inside the Isaac Lab env	Install the correct Isaac Lab version and activate the env before training.
Motion file path empty	Wrong `motion_file`/`smpl_motion_file` path	Check with `ls`; use absolute paths while debugging.
Body name mismatch	Robot USD/body names differ from the config	Check the docs on coordinate/body compatibility and fix the robot config.
Robot explodes/falls on the first frame	Wrong motion scale/order/quaternion	Replay the motion before training; check joint order and frame convention.
Training too slow	Too few envs/GPUs, or rendering enabled	Run headless, increase `num_envs`, use multi-GPU.
Eval of a released checkpoint fails on paths	Checkpoint config contains internal paths	Override motion_file as the docs describe.
TensorRT build fails	Wrong `TensorRT_ROOT`/version	Set the env variable; use a version compatible with the deployment docs.

Criteria for "done right"

You have done it right if:

check_environment.py passes.
git lfs pull downloads the real assets.
Motion data is converted and filtered.
The 100-iteration debug training runs without NaNs.
Checkpoints are written to logs_rl/TRL_G1_Track/....
Eval metrics run.
The rendered video does not show the robot exploding/falling right away.
ONNX export produces all files in exported/.

Nguyễn Anh Tuấn
