
VLA + WBC repos landscape 2025-2026: all humanoid GitHub repos mapped

A complete map of active GitHub repositories from humanoid companies and research labs working on VLA (Vision-Language-Action) and WBC (Whole-Body Control) — from NVIDIA to Unitree.

Nguyễn Anh Tuấn · June 4, 2026 · 5 min read

Over the past 18 months, the humanoid race has shifted from hardware to the software stack, and specifically to two core problems: VLA (Vision-Language-Action) for manipulation and WBC (Whole-Body Control) for locomotion. The two are converging: a robot needs to walk and manipulate at the same time, and it needs to understand language to know what to do.

This post maps all active GitHub repositories from companies and research labs — organized by geography and problem type. The next two posts dive deeper into each group.

VLA and WBC: why they need each other

VLA solves: "the robot knows what to do (and how) when it sees a scene and hears a command." A foundation model maps image + language input to an action vector.

WBC solves: "the robot maintains balance and moves while its arms are working." A unified controller drives all 30+ degrees of freedom simultaneously.

Combined, they give loco-manipulation: a humanoid walks to a location, picks up an object, and places it somewhere else. This is the practical benchmark every company is racing toward.
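To make the division of labor concrete, here is a minimal Python sketch of how a VLA policy and a whole-body controller could be wired together. Every name in it (`Observation`, `WholeBodyTarget`, `VLAPolicy`, `WholeBodyController`, `ScriptedVLA`, `HoldPoseWBC`, `control_step`) is hypothetical and chosen for illustration; none of them comes from any repo listed in this post, and the two stand-in classes only mimic the shape of the interfaces, not real models.

```python
from dataclasses import dataclass
from typing import List, Protocol, Sequence, Tuple


@dataclass
class Observation:
    image: Sequence[float]            # placeholder for a camera frame
    instruction: str                  # natural-language command
    joint_positions: Sequence[float]  # current joint angles, one per DoF


@dataclass
class WholeBodyTarget:
    left_hand_xyz: Tuple[float, float, float]
    right_hand_xyz: Tuple[float, float, float]
    base_velocity_xy: Tuple[float, float]


class VLAPolicy(Protocol):
    """Image + language in, high-level action out."""
    def act(self, obs: Observation) -> WholeBodyTarget: ...


class WholeBodyController(Protocol):
    """High-level target in, one command per joint out."""
    def track(self, target: WholeBodyTarget,
              joint_positions: Sequence[float]) -> List[float]: ...


class ScriptedVLA:
    """Stand-in for a real VLA: ignores the image, keys off the instruction."""
    def act(self, obs: Observation) -> WholeBodyTarget:
        forward = 0.3 if "walk" in obs.instruction.lower() else 0.0
        return WholeBodyTarget((0.4, 0.2, 0.9), (0.4, -0.2, 0.9), (forward, 0.0))


class HoldPoseWBC:
    """Stand-in for a real WBC: ignores the target and holds the current pose."""
    def track(self, target: WholeBodyTarget,
              joint_positions: Sequence[float]) -> List[float]:
        return list(joint_positions)


def control_step(policy: VLAPolicy, controller: WholeBodyController,
                 obs: Observation) -> List[float]:
    target = policy.act(obs)                               # VLA: what to do
    return controller.track(target, obs.joint_positions)   # WBC: how to move


if __name__ == "__main__":
    obs = Observation(image=[0.0] * 4, instruction="Walk to the table",
                      joint_positions=[0.0] * 30)
    print(len(control_step(ScriptedVLA(), HoldPoseWBC(), obs)))
```

In a real system the two layers would likely run at different rates, with the policy updating targets far less often than the controller issues joint commands; the single `control_step` here ignores that for brevity.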

2023: VLA + WBC research running in parallel
2024: Starting to combine in labs (HumanPlus, OmniH2O)
2025: Companies start releasing repos (NVIDIA GR00T, Unitree unifolm)
2026: Ecosystem forming, repos interconnecting

All repos by group

US / Western group

| Repo | Stars | Company/Lab | Type |
| --- | --- | --- | --- |
| physical-intelligence/openpi | ~12k | Physical Intelligence | VLA model (π0, π0.5) |
| NVIDIA/Isaac-GR00T | ~7k | NVIDIA | VLA foundation model |
| NVlabs/GR00T-WholeBodyControl | ~2.2k | NVIDIA | WBC platform |
| NVlabs/HOVER | ~742 | NVIDIA | Humanoid locomotion |
| MarkFzp/HumanPlus | ~847 | Stanford | WBC from mocap |
| OpenTeleVision/TeleVision | ~1.3k | UC San Diego + MIT | VR teleoperation |
| OpenHelix-Team/OpenHelix | ~378 | UIUC + others | VLA bimanual |
| OpenDriveLab/EgoHumanoid | ~161 | OpenDriveLab | Loco-manip from ego demos |
| thu-ml/RoboticsDiffusionTransformer | ~1.7k | THU | RDT-1B bimanual |

Chinese group

| Repo | Stars | Company/Lab | Type |
| --- | --- | --- | --- |
| unitreerobotics/unifolm-vla | ~477 | Unitree | VLA for G1/H1 |
| unitreerobotics/unitree_rl_gym | ~3.3k | Unitree | RL sim2real |
| unitreerobotics/xr_teleoperate | ~1.5k | Unitree | XR teleoperation |
| roboterax/humanoid-gym | ~2k | RobotEra | Humanoid RL training |
| carlosferrazza/humanoid-bench | ~772 | UC Berkeley (original) | Benchmark tasks |

No public repo (closed-source / paper only)

Some companies have not released code:

  • Figure AI — Helix system: proprietary
  • Tesla Optimus — full software stack: proprietary
  • Boston Dynamics Atlas — Atlas control: proprietary
  • 1X Technologies — NEO: proprietary
  • AgiBot — mostly proprietary
  • LeVERB — April 2026 paper, no public repo yet

Reading star counts correctly

Star count is a proxy for community interest, not technical quality:

  • openpi (~12k): largest VLA community because Physical Intelligence pioneered π0
  • Isaac-GR00T (~7k): NVIDIA brand + ready-to-run tutorials → many people want to use it immediately
  • unitree_rl_gym (~3.3k): many people have Unitree hardware → directly applicable repo
  • TeleVision (~1.3k): requires Apple Vision Pro → hardware barrier keeps star count lower
  • EgoHumanoid (~161): just released (RSS 2026), stars will grow
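Star counts in this post are snapshots and will drift. One way to refresh them is the public GitHub REST API, whose `GET /repos/{owner}/{repo}` response includes a `stargazers_count` field. The sketch below uses only the Python standard library; the helper names (`repo_api_url`, `parse_star_count`, `fetch_star_count`) and the `User-Agent` string are made up for this example.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com/repos/"


def repo_api_url(full_name: str) -> str:
    """Build the REST endpoint for an 'owner/repo' name."""
    return API_ROOT + full_name


def parse_star_count(payload: str) -> int:
    """Extract the star count from the JSON body of a repo response."""
    return int(json.loads(payload)["stargazers_count"])


def fetch_star_count(full_name: str, timeout: float = 10.0) -> int:
    """Query GitHub for the current star count (needs network access)."""
    request = urllib.request.Request(
        repo_api_url(full_name),
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "star-check-sketch",  # GitHub rejects requests without a User-Agent
        },
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_star_count(response.read().decode("utf-8"))


if __name__ == "__main__":
    for name in ("physical-intelligence/openpi", "NVIDIA/Isaac-GR00T"):
        print(name, fetch_star_count(name))
```

Unauthenticated requests are rate-limited (60 per hour per IP), which is enough for the handful of repos listed here.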

Taxonomy: 4 types of repos

Not all repos are used the same way. Understanding the type prevents misuse:

1. Foundation VLA model (openpi, Isaac-GR00T): a pretrained model that you fine-tune on your own data. Fine-tuning needs a large GPU; inference can run on a workstation.

2. WBC platform (GR00T-WholeBodyControl, HOVER): a set of controllers to deploy on the real robot. Input is a desired end-effector trajectory; output is joint commands.

3. Data collection / teleoperation (xr_teleoperate, TeleVision, HumanPlus): not a policy but a data collection pipeline. Output is a dataset for the training step.

4. Simulation / benchmark (unitree_rl_gym, humanoid-gym, humanoid-bench): environments for training and evaluating RL policies. These are not deployed directly to the real robot.

A real project typically needs all 4 types, chained into a pipeline:

[4] Train RL sim baseline → [3] Collect real data → [1] Fine-tune VLA → [2] WBC deploy
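The taxonomy and pipeline order above can be written down as a small lookup, which makes it easy to check which stages a planned stack still lacks. This is only an illustrative sketch: `RepoType`, `REPO_TYPES`, `PIPELINE_ORDER`, and `missing_stages` are names invented here, and the mapping simply restates the four categories from this post.

```python
from enum import Enum
from typing import Iterable, List


class RepoType(Enum):
    FOUNDATION_VLA = 1
    WBC_PLATFORM = 2
    DATA_COLLECTION = 3
    SIM_BENCHMARK = 4


# Repo short names mapped to the type assigned in the taxonomy above.
REPO_TYPES = {
    "openpi": RepoType.FOUNDATION_VLA,
    "Isaac-GR00T": RepoType.FOUNDATION_VLA,
    "GR00T-WholeBodyControl": RepoType.WBC_PLATFORM,
    "HOVER": RepoType.WBC_PLATFORM,
    "xr_teleoperate": RepoType.DATA_COLLECTION,
    "TeleVision": RepoType.DATA_COLLECTION,
    "HumanPlus": RepoType.DATA_COLLECTION,
    "unitree_rl_gym": RepoType.SIM_BENCHMARK,
    "humanoid-gym": RepoType.SIM_BENCHMARK,
    "humanoid-bench": RepoType.SIM_BENCHMARK,
}

# Pipeline order from the post: [4] -> [3] -> [1] -> [2].
PIPELINE_ORDER = [
    RepoType.SIM_BENCHMARK,
    RepoType.DATA_COLLECTION,
    RepoType.FOUNDATION_VLA,
    RepoType.WBC_PLATFORM,
]


def missing_stages(chosen_repos: Iterable[str]) -> List[RepoType]:
    """Return the pipeline stages, in order, not covered by the chosen repos."""
    covered = {REPO_TYPES[name] for name in chosen_repos}
    return [stage for stage in PIPELINE_ORDER if stage not in covered]


if __name__ == "__main__":
    plan = ["openpi", "xr_teleoperate"]
    print([stage.name for stage in missing_stages(plan)])
```

For the example plan, the check reports that a simulation baseline and a WBC layer are still missing.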

Choosing a starting point by hardware

The most practical question is: what hardware do you have?

| You have | Start here |
| --- | --- |
| Unitree G1 / H1 | unifolm-vla + xr_teleoperate |
| Unitree Go2 / B2 (quadruped, no arms) | unitree_rl_gym for locomotion |
| Robot arm (Franka, UR5) | openpi or UMI series |
| Any ROS2 robot | OpenHelix (multi-robot) or EgoHumanoid |
| Simulation only | humanoid-gym + humanoid-bench |
| No hardware | Isaac-GR00T tutorial (has sim workflow) |

What this series covers

Three posts:

  1. This post — Overall landscape, taxonomy, how to choose a starting point
  2. Post 2: US repos — NVIDIA, Berkeley, Stanford, Physical Intelligence — deep dive per repo
  3. Post 3: Chinese repos — Unitree, THU, and the open community — complete Unitree stack

Overall observation

What's interesting when looking at the full ecosystem: the repos don't compete — they complement each other.

openpi provides a strong VLA backbone. GR00T-WholeBodyControl provides the WBC layer. xr_teleoperate provides data collection. humanoid-bench provides evaluation. The best practitioners will be those who can combine multiple repos into a unified pipeline — not those who pick one repo and wait for it to solve everything.

Next: Deep dive into US repos — from NVIDIA to Physical Intelligence.


Nguyễn Anh Tuấn

Robotics & AI Engineer. Building VnRobo — sharing knowledge about robot learning, VLA models, and automation.
