VLA + WBC repos from China: Unitree, THU RDT-1B, and the open community

This is part 3 of the series Map of VLA + WBC Repos. It takes a deep dive into repos from China (Unitree Robotics, Tsinghua University, RobotEra) and into benchmark tools from the open community.

The biggest difference between this group and the US group is the focus on immediate hardware deployment, because these companies build both the hardware and the software. Unitree G1/H1 are real platforms that many labs around the world already use, so Unitree's repos have a large base of real users.

Unitree Robotics: full stack from sim to deploy

Unitree has three repos that matter for VLA/WBC, each solving a different problem in the pipeline.

unitree_rl_gym (~3.3k stars)

Repo: unitreerobotics/unitree_rl_gym

What it is: an RL training environment built on Isaac Gym for Unitree robots (the repo README lists Go2, H1, H1_2, and G1), with a baseline locomotion policy for each platform.

This is the starting point when you want to:

Train a locomotion policy from scratch for a Unitree robot
Customize the gait (speed, terrain, walking style)
Transfer from sim to real (sim2real)

Pipeline:

Isaac Gym (sim) → PPO training → policy checkpoint
    ↓
Export to ONNX or TorchScript
    ↓
Deploy on the robot (onboard computer)

Terrain curriculum in the repo:

Flat ground (basic)
Slope, stair, rough terrain
Discrete obstacles
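
The curriculum idea can be sketched in a few lines. The rule below is loosely modeled on the promotion/demotion logic common in legged_gym-style environments and is an assumption for illustration, not code taken from unitree_rl_gym. The function name, the thresholds, and the number of levels are all hypothetical.

```python
import random


def update_terrain_level(level, distance_walked, commanded_speed,
                         episode_seconds, terrain_length=8.0,
                         max_level=9, rng=random):
    """Return the next terrain difficulty level for one robot.

    Hypothetical rule: promote a robot that crossed more than half of its
    terrain tile, demote one that covered less than half of the distance
    its velocity command asked for, otherwise keep the level.
    """
    move_up = distance_walked > terrain_length / 2
    expected_distance = commanded_speed * episode_seconds
    move_down = (not move_up) and distance_walked < 0.5 * expected_distance
    level += int(move_up) - int(move_down)
    if level > max_level:
        # Robots that clear the hardest level are resampled across all
        # levels so they keep seeing varied terrain.
        return rng.randint(0, max_level)
    return max(level, 0)


# Example: one robot walked 5 m of an 8 m tile, another only 2 m.
print(update_terrain_level(3, 5.0, 1.0, 20.0))  # promoted
print(update_terrain_level(3, 2.0, 1.0, 20.0))  # demoted
```

Tying promotion to distance covered rather than to reward keeps the rule independent of how the reward is shaped, which is one reason this style of curriculum is popular.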

Getting started:

git clone https://github.com/unitreerobotics/unitree_rl_gym.git
cd unitree_rl_gym
pip install -e .  # install Isaac Gym and rsl_rl first; see the repo's setup guide

# Train H1 locomotion
python legged_gym/scripts/train.py \
  --task=h1 --run_name=baseline_h1

# Play (visualize in sim)
python legged_gym/scripts/play.py \
  --task=h1 --run_name=baseline_h1

Note: this requires Isaac Gym from NVIDIA (register an NVIDIA developer account to download it).

xr_teleoperate (~1.5k stars)

Repo: unitreerobotics/xr_teleoperate

What it is: data collection and teleoperation for Unitree G1/H1 using Apple Vision Pro, Meta Quest 3, or dexterous gloves. It is comparable to TeleVision (UC San Diego and MIT) but optimized for Unitree hardware.

Differences from TeleVision:

Native SDK integration with Unitree G1/H1 (no ROS2 bridge needed)
Dexterous glove support, which captures finger motion if the robot has a dexterous hand
Lower latency thanks to the direct SDK connection

Supported devices:

Device	Coverage	Cost
Apple Vision Pro	Head + hands (no fingers)	~$3,500
Meta Quest 3	Hands + body (limited)	~$500
Dexterous Gloves (e.g., Manus)	Full finger data	~$5,000+

Data collection workflow:

1. Connect the HMD/gloves → xr_teleoperate daemon
2. Teleoperate the robot through the task (watching the video stream while controlling it)
3. Data is recorded automatically in the format:
   {joint_positions, end_effector_poses, camera_frames, timestamps}
4. Convert to LeRobot format for training
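
To make the recorded fields concrete, here is a minimal sketch of one teleoperation frame and a helper that pivots an episode into per-field columns, the shape most training loaders expect. The class, the helper, and the pose layout are hypothetical; this is neither the actual xr_teleoperate schema nor the LeRobot format, only the four fields named above.

```python
import json
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TeleopFrame:
    timestamp: float                       # seconds since episode start
    joint_positions: List[float]           # one value per robot joint
    end_effector_poses: List[List[float]]  # per wrist: x, y, z, qx, qy, qz, qw
    camera_frame: str                      # path to the saved image


def episode_to_columns(frames: List[TeleopFrame]) -> Dict[str, list]:
    """Sort frames by time and pivot them into one list per field."""
    ordered = sorted(frames, key=lambda f: f.timestamp)
    return {
        "timestamps": [f.timestamp for f in ordered],
        "joint_positions": [f.joint_positions for f in ordered],
        "end_effector_poses": [f.end_effector_poses for f in ordered],
        "camera_frames": [f.camera_frame for f in ordered],
    }


# Two frames arriving out of order, with an identity pose for both wrists.
identity_pose = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
frames = [
    TeleopFrame(0.1, [0.2, 0.1], [identity_pose, identity_pose], "frame_0001.jpg"),
    TeleopFrame(0.0, [0.0, 0.0], [identity_pose, identity_pose], "frame_0000.jpg"),
]
episode = episode_to_columns(frames)
print(json.dumps(episode["timestamps"]))
```

Sorting by timestamp before pivoting guards against frames that arrive out of order from separate camera and joint-state streams.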

unifolm-vla (~477 stars)

Repo: unitreerobotics/unifolm-vla

What it is: a VLA model for Unitree humanoids (G1, H1) with a pretrain + fine-tune pipeline. "UnifoLM" is Unitree's name for its Unitree Robot Unified Large Model family.

Architecture:

Backbone: InternVL2 (OpenGVLab, Shanghai AI Laboratory)
Action head: diffusion policy (continuous)
Frequency: ~10 Hz (VLA output for the upper body), 200 Hz (low-level locomotion controller)
Control: upper body end-effector + locomotion velocity command

This is the key distinction: unifolm-vla outputs only high-level commands, namely desired wrist poses and a velocity command for locomotion. A low-level controller running at 200 Hz handles joint commands and balance. The architecture resembles NVIDIA's GR00T-WBC but is built for Unitree hardware.
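
The two rates described above imply a simple scheduling pattern: the fast loop runs every tick and the slow policy refreshes its command once every 20 ticks. The sketch below illustrates only that pattern. The function names, the command dictionary, and both stubs are hypothetical and do not come from the unifolm-vla codebase.

```python
from typing import Callable, Dict, List, Tuple

Command = Dict[str, Tuple[float, ...]]


def run_hierarchy(high_level_policy: Callable[[float], Command],
                  low_level_step: Callable[[Command], float],
                  duration_s: float,
                  high_hz: int = 10,
                  low_hz: int = 200) -> List[float]:
    """Run the fast loop, refreshing the command at the slow rate."""
    ticks_per_command = low_hz // high_hz  # 20 with the default rates
    total_ticks = round(duration_s * low_hz)
    command: Command = {}
    outputs = []
    for tick in range(total_ticks):
        if tick % ticks_per_command == 0:
            # Slow path: ask the VLA stand-in for a new high-level command.
            command = high_level_policy(tick / low_hz)
        # Fast path: the controller tracks the most recent command.
        outputs.append(low_level_step(command))
    return outputs


calls = {"high": 0}


def stub_policy(t: float) -> Command:
    """Stand-in for the VLA: a fixed wrist target and a slowly growing speed."""
    calls["high"] += 1
    return {"wrist_pose": (0.3, 0.0, 1.0), "base_velocity": (0.2 * t, 0.0, 0.0)}


def stub_controller(command: Command) -> float:
    """Stand-in for the whole-body controller: echoes forward velocity."""
    return command["base_velocity"][0]


outputs = run_hierarchy(stub_policy, stub_controller, duration_s=1.0)
print(calls["high"], len(outputs))
```

In a real deployment the two loops would typically run in separate threads or processes so that slow VLA inference never blocks the balance controller; the single loop here only shows the rate relationship.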

Usage:

git clone https://github.com/unitreerobotics/unifolm-vla.git
cd unifolm-vla

# Fine-tune on data from xr_teleoperate
python train.py \
  --model_name unifolm_base \
  --dataset_path /path/to/lerobot_dataset \
  --robot_type g1 \
  --output_dir ./finetuned

# Deploy (requires a Unitree G1 or H1 plus onboard compute)
python deploy.py \
  --checkpoint ./finetuned/checkpoint_best.pth \
  --robot_type g1

Note: the pretrained checkpoint was not public at the time of writing. The repo has code to train from scratch or to fine-tune from an InternVL2 checkpoint.

The complete Unitree pipeline

Unitree's three repos form a closed-loop pipeline:

[unitree_rl_gym]          → locomotion baseline (sim)
        ↓
[xr_teleoperate]           → manipulation data collection (real)
        ↓
[unifolm-vla]              → VLA fine-tune + deploy

If you have a Unitree G1 or H1, this is the shortest path from zero to a working whole-body VLA policy. It avoids the cross-embodiment fine-tuning that openpi or GR00T require, since those are designed for many robots.

Tsinghua University (THU): RDT-1B (~1.7k stars)

Repo: thu-ml/RoboticsDiffusionTransformer

What it is: RDT-1B, a foundation model for bimanual manipulation built on a diffusion transformer. It is currently one of the strongest bimanual models (benchmarks: LIBERO, RLBench).

Architecture:

Backbone: DiT (Diffusion Transformer, 1B params)
Inputs: stereo cameras + wrist cameras + language
Action: diffusion denoising → 7-DoF joint actions (x2 arms)
Frequency: ~25Hz
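
As a small illustration of the action layout listed above (7 DoF per arm, two arms), the sketch below splits a flat 14-value action into left and right arm commands and applies that to a short chunk of actions. The left-first ordering, the names, and the chunk length are assumptions for illustration; the actual RDT-1B action space may be laid out differently.

```python
from typing import List, NamedTuple, Sequence, Tuple

DOF_PER_ARM = 7  # per the architecture summary above


class BimanualAction(NamedTuple):
    left: Tuple[float, ...]
    right: Tuple[float, ...]


def split_action(flat: Sequence[float]) -> BimanualAction:
    """Split one flat action into per-arm commands (left first, assumed)."""
    if len(flat) != 2 * DOF_PER_ARM:
        raise ValueError(f"expected {2 * DOF_PER_ARM} values, got {len(flat)}")
    return BimanualAction(tuple(flat[:DOF_PER_ARM]), tuple(flat[DOF_PER_ARM:]))


def split_chunk(chunk: Sequence[Sequence[float]]) -> List[BimanualAction]:
    """Apply split_action to every step of a predicted action chunk."""
    return [split_action(step) for step in chunk]


# A dummy chunk of four steps standing in for the denoised model output.
chunk = [[0.1 * i] * 14 for i in range(4)]
actions = split_chunk(chunk)
print(len(actions), len(actions[0].left), len(actions[0].right))
```

Validating the length at the boundary catches embodiment mismatches early, for example feeding a single-arm dataset to a bimanual controller.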

Why it differs from openpi/GR00T: RDT-1B has a narrower focus, covering bimanual manipulation only, with no locomotion. Within that domain it outperforms many other foundation models, especially on tasks that need precise two-arm coordination.

Training datasets:

Bridge V2, DROID, Open X-Embodiment
LIBERO benchmark suite
Custom bimanual teleoperation data (ALOHA platform)

Usage:

git clone https://github.com/thu-ml/RoboticsDiffusionTransformer.git
cd RoboticsDiffusionTransformer
pip install -r requirements.txt

# Download pretrained checkpoint
python scripts/download_model.py --model rdt-1b

# Inference
python inference.py \
  --model_path checkpoints/rdt-1b \
  --image path/to/obs.jpg \
  --instruction "fold the cloth in half"

Fine-tuning: the repo includes a script for fine-tuning on custom data. Format: HDF5 or RLDS.

Paper: RDT-1B: A Diffusion Foundation Model for Bimanual Manipulation (2024)

RobotEra: humanoid-gym (~2k stars)

Repo: roboterax/humanoid-gym

What it is: an RL training framework for humanoids (based on legged_gym, similar to unitree_rl_gym) with a stronger focus on sim2real transfer using RobotEra's XBot-L robot.

Differences:

XBot-L specific (does not support multiple robots the way unitree_rl_gym does)
Stronger focus on the sim2real gap: domain randomization, actuator modeling
Includes a detailed tutorial on tuning for the real robot
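
Domain randomization, mentioned in the list above, amounts to resampling physics parameters at every episode reset so the policy cannot overfit to one simulator configuration. The sketch below shows the idea. The parameter names and ranges are hypothetical and are not taken from humanoid-gym's configuration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicsParams:
    friction: float              # ground friction coefficient
    added_base_mass_kg: float    # payload perturbation on the torso
    motor_strength_scale: float  # multiplier on commanded torque
    action_delay_steps: int      # control latency in simulation steps


def sample_physics(rng: random.Random) -> PhysicsParams:
    """Draw one randomized physics configuration for an episode reset."""
    return PhysicsParams(
        friction=rng.uniform(0.2, 1.5),
        added_base_mass_kg=rng.uniform(-1.0, 3.0),
        motor_strength_scale=rng.uniform(0.8, 1.2),
        action_delay_steps=rng.randint(0, 3),
    )


# A seeded generator makes the sequence of randomized episodes reproducible.
rng = random.Random(0)
for episode in range(3):
    print(episode, sample_physics(rng))
```

Randomizing motor strength and action delay is a coarse stand-in for actuator modeling: it does not capture the real actuator dynamics, but it forces the policy to tolerate them.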

Best used when: you want to learn sim2real transfer techniques or you use XBot-L.

Benchmark: humanoid-bench (~772 stars)

Repo: carlosferrazza/humanoid-bench

What it is: a benchmark suite for humanoid robot tasks, intended as a standard for evaluating performance and comparing methods.

Tasks in the benchmark:

Stand, walk, run (locomotion only)
Reach, grasp, place (manipulation only)
Walk-then-grasp (loco-manipulation)
Door open, drawer open (long-horizon)

Available humanoid models: Unitree H1, G1, Agility Digit, custom models.

What to use it for:

git clone https://github.com/carlosferrazza/humanoid-bench.git
cd humanoid-bench
pip install -e .

# Evaluate policy
python evaluate.py \
  --task "walk_and_grasp" \
  --policy path/to/your/policy \
  --num_episodes 100

Important: if you train a new policy, run humanoid-bench to get benchmark numbers that can be compared with published papers. Publishing without benchmark numbers is very hard.
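
Benchmark numbers are only comparable if they are reported with their uncertainty. The sketch below aggregates per-episode results into a success rate, its binomial standard error, and the mean return. The episode dictionary layout is hypothetical and is not the output format of humanoid-bench.

```python
import math
from statistics import mean
from typing import Dict, List


def summarize(episodes: List[Dict]) -> Dict[str, float]:
    """Aggregate per-episode results into headline benchmark numbers."""
    n = len(episodes)
    if n == 0:
        raise ValueError("need at least one episode")
    success_rate = sum(1 for e in episodes if e["success"]) / n
    return {
        "episodes": n,
        "success_rate": success_rate,
        # Binomial standard error of the success rate.
        "success_rate_se": math.sqrt(success_rate * (1.0 - success_rate) / n),
        "mean_return": mean(e["return"] for e in episodes),
    }


# Dummy results: 75 successes out of 100 episodes.
episodes = [{"success": i < 75, "return": 1.0 if i < 75 else 0.0}
            for i in range(100)]
print(summarize(episodes))
```

With 100 episodes the standard error on a 75% success rate is about 4 percentage points, so differences smaller than that between two methods should not be read as meaningful.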

Papers without a repo: LeVERB and WoCoCo

Some important works had no public repo at the time of writing:

LeVERB

What: LeVERB: Humanoid Whole-Body Control with Latent Vision-Language Instruction
When: arXiv preprint, June 2025
Approach: a high-level vision-language policy produces latent commands that a low-level whole-body control policy executes
Status: no public code yet; worth watching

WoCoCo

What: WoCoCo: Learning Whole-Body Humanoid Control with Sequential Contacts (CoRL 2024)
Where: Carnegie Mellon University
Approach: RL with contact sequence planning
Status: no public repo yet

ExBody2 (UC San Diego)

What: expressive whole-body motion imitation (full body + expression)
Status: paper out, code release in preparation

Summary comparison: China group + open benchmarks

Repo	Problem	Barrier	Notes
unitree_rl_gym	Locomotion sim	Low	Requires Isaac Gym
xr_teleoperate	Data collection	Medium	Requires an HMD or gloves
unifolm-vla	VLA for Unitree	High	Requires a Unitree G1/H1
RDT-1B	Bimanual VLA	Low	No humanoid needed
humanoid-gym	Sim training	Low	XBot-L focused
humanoid-bench	Evaluation	Low	No real robot needed

US group vs China group: comparative notes

The US group (NVIDIA, Physical Intelligence, Berkeley) is strong in:

Foundation model quality và generalization
Research novelty (EgoHumanoid, HumanPlus)
Open culture (papers, code, and datasets often released together)

The China group (Unitree, THU) is strong in:

Hardware-software co-design (Unitree builds both the robot and the software)
Cost-effective hardware (G1 is 5-10x cheaper than GR1 or Figure)
Large-scale deployment (Unitree ships more robots than anyone else)

In practice, many US labs use Unitree G1/H1 as their platform and fine-tune openpi or GR00T on it, combining Chinese hardware with US software.

Series summary

Across the three posts, you now have the full picture:

Part 1: taxonomy and how to pick a starting point based on your hardware
Part 2: the US group, with strong foundation models and a research-first approach
This part: the China group, with a deployment-first, hardware-integrated stack

Final recommendation, if you want to start right away on a realistic budget:

Hardware: Unitree G1 (cheapest full humanoid)
Data collection: xr_teleoperate with Meta Quest 3
VLA: fine-tune openpi (best generalization) or unifolm-vla (Unitree-native)
WBC: GR00T-WholeBodyControl, or unitree_rl_gym locomotion + custom arm control

Nguyễn Anh Tuấn
