LeRobot Ecosystem: A Comprehensive Guide for 2026

What is LeRobot?

LeRobot is Hugging Face's open-source framework for robot learning, covering the whole path from data collection and model training to deployment on a real robot. If you already know Hugging Face Transformers from NLP, think of LeRobot as the Transformers of robotics.

First released in 2024, LeRobot quickly became a standard framework for imitation learning in the robotics community. The latest version, v0.5.0 (2026), brings several major changes: humanoid support, 10x faster training, and many new models and hardware platforms.

This article is a comprehensive guide to the LeRobot ecosystem: its architecture, supported models and hardware, and how to get started for as little as $100.

Architecture overview

LeRobot follows a modular architecture:

LeRobot Ecosystem
├── Data Collection    → Teleoperation, dataset recording
├── Dataset Format     → LeRobotDataset (HF Hub compatible)
├── Training Pipeline  → Policies (ACT, Diffusion, VLA...)
├── Evaluation         → Sim environments + real robot
└── Deployment         → Real-time inference on hardware

LeRobotDataset

LeRobot uses a standardized dataset format that is compatible with the Hugging Face Hub. Each dataset contains:

Observations: camera images, joint positions, gripper state
Actions: target joint positions or velocities
Metadata: robot type, camera config, task description

You can download thousands of datasets from the Hugging Face Hub, ranging from the SO-100 arm to the bi-manual ALOHA.

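# Import path as used in this article; newer releases drop the `common` package
# (lerobot.datasets.lerobot_dataset), so adjust to your installed version.
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

# Download a dataset from the HF Hub
dataset = LeRobotDataset("lerobot/aloha_sim_insertion_human")
print(f"Frames: {len(dataset)}")
print(f"Keys: {list(dataset[0].keys())}")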
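To make the three groups above concrete, here is a minimal sketch of what one frame might look like as a flat dictionary, and how its keys split into observations, actions, and bookkeeping. The dotted key names mirror the style used by datasets on the Hub, but exact keys and shapes vary per dataset, and `group_keys` is a hypothetical helper, not part of LeRobot.

```python
from collections import defaultdict

# Illustrative frame only: real frames hold tensors, and key names vary per dataset.
example_frame = {
    "observation.images.top": "<image tensor, C x H x W>",
    "observation.state": [0.0] * 14,  # joint positions (14 values for a bi-manual ALOHA)
    "action": [0.0] * 14,             # target joint positions
    "episode_index": 0,
    "frame_index": 0,
    "timestamp": 0.0,
}


def group_keys(frame):
    """Split frame keys into observation / action / metadata buckets (hypothetical helper)."""
    groups = defaultdict(list)
    for key in frame:
        if key.startswith("observation."):
            groups["observation"].append(key)
        elif key == "action" or key.startswith("action."):
            groups["action"].append(key)
        else:
            groups["metadata"].append(key)
    return dict(groups)


if __name__ == "__main__":
    for group, keys in group_keys(example_frame).items():
        print(group, keys)
```

Running the same grouping over `dataset[0].keys()` is a quick way to see which cameras and state vectors a dataset actually provides.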

Supported Models -- from ACT to VLA

LeRobot v0.5.0 supports three main groups of models:

1. Imitation Learning (IL) Policies

Three core IL policies that learn from demonstrated behavior:

Model	Type	Key characteristics	Use case
ACT	Action Chunking with Transformers	Predicts several actions at once (chunking), CVAE encoder	Manipulation tasks, bi-manual
Diffusion Policy	Denoising diffusion	Generates actions through a diffusion process	Multi-modal tasks, robust
VQ-BeT	Vector-Quantized Behavior Transformer	Discretizes the action space, transformer decoder	Tasks that need multi-modality

ACT (Action Chunking with Transformers) is the most popular model in LeRobot. The key idea: instead of predicting one action per timestep, ACT predicts a chunk of several consecutive actions, which reduces temporal inconsistency and yields smoother motion.

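from lerobot.common.policies.act.configuration_act import ACTConfig
from lerobot.common.policies.act.modeling_act import ACTPolicy

# Create an ACT policy with the default config.
# Depending on the installed version, the config may also need input/output
# features (and dataset statistics) before the policy can be constructed.
config = ACTConfig()
policy = ACTPolicy(config)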
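The chunking idea described above can be sketched without any ML code. The example below is only an illustration of the execution pattern, assuming the policy returns a list of future actions. `ChunkedExecutor` and `dummy_policy` are hypothetical names, and ACT's optional temporal ensembling is not shown.

```python
from collections import deque


class ChunkedExecutor:
    """Runs a chunk-predicting policy one action at a time (illustrative only)."""

    def __init__(self, predict_chunk, chunk_size):
        self.predict_chunk = predict_chunk  # callable: observation -> list of actions
        self.chunk_size = chunk_size
        self.queue = deque()
        self.num_policy_calls = 0

    def select_action(self, observation):
        # Query the policy only when the previous chunk has been used up.
        if not self.queue:
            chunk = self.predict_chunk(observation)[: self.chunk_size]
            self.queue.extend(chunk)
            self.num_policy_calls += 1
        return self.queue.popleft()


def dummy_policy(observation, horizon=4):
    """Stand-in for a trained model: returns `horizon` future actions."""
    return [observation + step for step in range(horizon)]


if __name__ == "__main__":
    executor = ChunkedExecutor(dummy_policy, chunk_size=4)
    actions = [executor.select_action(obs) for obs in range(0, 80, 10)]
    print(actions)
    print("policy calls:", executor.num_policy_calls)
```

Eight control steps need only two policy calls here, and the actions inside each chunk come from a single prediction, which is where the smoother motion comes from.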

2. Vision-Language-Action (VLA) Models

This is the most capable group, combining vision, language, and action:

Model	Params	Requirements	Notes
Pi0	3B	GPU (8GB+ VRAM)	Physical Intelligence, fast inference
Pi0-FAST	3B	GPU	Autoregressive VLA, faster than the original Pi0
Pi0.5	-	GPU	Next generation from Physical Intelligence
SmolVLA	~0.45B	GPU (4GB+ VRAM)	Lightweight, suited to edge devices
GR00T N1.5	3B	CUDA GPU (FlashAttention)	NVIDIA, requires an RTX GPU
XVLA	-	GPU	Cross-embodiment VLA

Important: GR00T N1.5 strictly requires a CUDA GPU because it depends on FlashAttention. Pi0 and SmolVLA are more flexible about hardware.

3. Real-Time Chunking (RTC)

RTC is a new technique in v0.5.0 that enables responsive inference, so the robot reacts faster to changes in its environment. RTC is compatible with:

Pi0, Pi0.5, Pi0-FAST
SmolVLA
Diffusion Policy
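A rough sketch of the bookkeeping behind this kind of asynchronous chunk execution, under stated assumptions: a new chunk is computed while the old one keeps executing, and the new chunk is indexed from the step at which inference started. Actions that run during the inference delay are already committed, so only the tail of the new chunk is usable. The function names are hypothetical. The actual RTC method additionally constrains the new chunk to agree with the committed actions, which is not shown here.

```python
import math


def delay_in_steps(inference_latency_ms, control_fps):
    """Number of control steps that elapse while one chunk is being computed."""
    return math.ceil(inference_latency_ms * control_fps / 1000)


def splice_chunks(old_chunk, new_chunk, start_index, delay):
    """Return the executed sequence when inference starts at `start_index` of the old chunk.

    Assumes new_chunk[i] targets the same timestep as old_chunk[start_index + i].
    """
    committed = old_chunk[: start_index + delay]  # executed before or during inference
    fresh = new_chunk[delay:]                     # the first `delay` entries arrive too late
    return committed + fresh


if __name__ == "__main__":
    old = [f"a{i}" for i in range(8)]
    new = [f"n{i}" for i in range(8)]
    d = delay_in_steps(inference_latency_ms=120, control_fps=30)
    print("delay steps:", d)
    print(splice_chunks(old, new, start_index=2, delay=d))
```

The longer inference takes relative to the control rate, the more of each new chunk is already stale on arrival, which is why low-latency models benefit most.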

Supported Hardware -- from a $100 arm to humanoids

LeRobot v0.5.0 greatly expands the list of supported hardware:

Robot Arms

Hardware	Price	DOF	Notes
SO-100	~$100	6	Cheapest option, 3D printed, servo motors
SO-101	~$150	6	Improved successor to the SO-100
Koch v1.1	~$300	6	Dynamixel servos, more precise
ALOHA	~$20,000	2x 7 DOF	Bi-manual, a research standard
WidowX	~$3,500	6	Trossen Robotics, ROS 2 compatible

Mobile Robots and Humanoids

Hardware	Type	New in v0.5?
Unitree G1	Humanoid (23/29 DOF)	Yes -- the biggest hardware addition
OMX Robot	Mobile manipulation	Yes
Earth Rover	Outdoor mobile robot	Yes
Reachy 2	Upper body humanoid	Yes
SoArm	Compact arm	Yes

Unitree G1 support is the highlight of v0.5.0: for the first time LeRobot supports a full humanoid, in both the 23 DOF and 29 DOF configurations.

Getting started with LeRobot -- step by step

Step 1: Installation

# Python 3.10+ required
pip install lerobot

# Or install from source (recommended if you want the latest version)
git clone https://github.com/huggingface/lerobot.git
cd lerobot
pip install -e ".[dev]"
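After installing, a short standard-library check can confirm that the interpreter meets the Python 3.10+ requirement and that the package is importable. This is a sketch with hypothetical helper names. It only tests for the presence of a top-level `lerobot` module, not for a specific version.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 10)  # requirement stated in the install step


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


def package_is_installed(name):
    """True if a top-level package called `name` can be found on the import path."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("lerobot installed:", package_is_installed("lerobot"))
```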

Step 2: Try it in simulation

You don't need a real robot; you can start with simulated environments:

# Train an ACT policy on ALOHA sim
python lerobot/scripts/train.py \
  --policy.type=act \
  --dataset.repo_id=lerobot/aloha_sim_insertion_human \
  --output_dir=outputs/act_aloha_sim

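Step 3: Load and run a pretrained model

from lerobot.common.policies.act.modeling_act import ACTPolicy

# Load a pretrained model from the HF Hub
policy = ACTPolicy.from_pretrained("lerobot/act_aloha_sim_insertion_human")
policy.eval()

# Inference: `observation` is a dict of batched tensors from your env or robot
# (not defined here), keyed like the dataset frames, e.g. "observation.state"
action = policy.select_action(observation)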

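Step 4: Collect data on a real robot

If you have an SO-100 ($100) or another robot arm:

# Connect and calibrate the robot
python lerobot/scripts/control_robot.py \
  --robot.type=so100 \
  --control.type=calibrate

# Teleoperate and record a dataset.
# Record options live under --control.* in this script; flag names may differ
# in your installed version. The task description below is a placeholder.
python lerobot/scripts/control_robot.py \
  --robot.type=so100 \
  --control.type=record \
  --control.repo_id=${HF_USER}/my_first_dataset \
  --control.single_task="Pick up the cube" \
  --control.fps=30 \
  --control.num_episodes=50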
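Before a recording session it helps to estimate how large the dataset will be and how long the session will take. The sketch below uses the 30 fps and 50 episodes from the command above. The 30-second episode length and 30-second reset time are assumptions chosen for illustration, not values from the source.

```python
def expected_frames(num_episodes, episode_time_s, fps):
    """Total number of recorded frames across all episodes."""
    return num_episodes * episode_time_s * fps


def session_minutes(num_episodes, episode_time_s, reset_time_s):
    """Wall-clock recording time in minutes, including resets between episodes."""
    return num_episodes * (episode_time_s + reset_time_s) / 60


if __name__ == "__main__":
    # 50 episodes at 30 fps (from the command); 30 s episodes and resets are assumed.
    print("frames:", expected_frames(num_episodes=50, episode_time_s=30, fps=30))
    print("minutes:", session_minutes(num_episodes=50, episode_time_s=30, reset_time_s=30))
```

Under these assumptions the session yields 45,000 frames and takes about 50 minutes of operator time.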

Step 5: Train a model on your data

# Train ACT on your own dataset.
# train.py measures training length in optimizer steps (--steps), not epochs.
python lerobot/scripts/train.py \
  --policy.type=act \
  --dataset.repo_id=${HF_USER}/my_first_dataset \
  --output_dir=outputs/my_act_model \
  --steps=100000
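If you think in epochs but need a step count, the conversion is simple arithmetic. This sketch assumes one optimizer step per batch and that the last partial batch still counts as a step. The 45,000 frames and the batch size of 8 in the example are illustrative assumptions.

```python
import math


def epochs_to_steps(num_frames, batch_size, num_epochs):
    """Optimizer steps needed to pass over the dataset `num_epochs` times."""
    steps_per_epoch = math.ceil(num_frames / batch_size)
    return steps_per_epoch * num_epochs


def steps_to_epochs(num_frames, batch_size, num_steps):
    """Approximate number of dataset passes covered by `num_steps` optimizer steps."""
    return num_steps * batch_size / num_frames


if __name__ == "__main__":
    print(epochs_to_steps(num_frames=45_000, batch_size=8, num_epochs=100))
    print(steps_to_epochs(num_frames=45_000, batch_size=8, num_steps=100_000))
```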

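Step 6: Deploy on the robot

# Run the policy on the real robot. In this script that is a `record` run with
# a policy attached (`replay` only plays back a previously recorded episode).
# The eval repo name and task description are placeholders.
python lerobot/scripts/control_robot.py \
  --robot.type=so100 \
  --control.type=record \
  --control.fps=30 \
  --control.single_task="Pick up the cube" \
  --control.repo_id=${HF_USER}/eval_my_act_model \
  --control.policy.path=outputs/my_act_model/checkpoints/last/pretrained_model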
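Conceptually, deployment is a fixed-rate loop: read an observation, ask the policy for an action, send it to the motors, then wait out the rest of the control period. The sketch below illustrates that loop with stand-in callables. `run_control_loop` and its parameters are hypothetical and not LeRobot's API.

```python
import time


def run_control_loop(select_action, read_observation, send_action, fps, num_steps,
                     clock=time.perf_counter, sleep=time.sleep):
    """Run `num_steps` control steps at `fps` and return how many steps overran the period."""
    period = 1.0 / fps
    overruns = 0
    for _ in range(num_steps):
        start = clock()
        observation = read_observation()
        action = select_action(observation)
        send_action(action)
        remaining = period - (clock() - start)
        if remaining > 0:
            sleep(remaining)  # wait so the loop holds the target rate
        else:
            overruns += 1     # inference or I/O took longer than one period
    return overruns


if __name__ == "__main__":
    sent = []
    overruns = run_control_loop(
        select_action=lambda obs: [x * 0.5 for x in obs],  # stand-in policy
        read_observation=lambda: [1.0, 2.0],               # stand-in sensor read
        send_action=sent.append,                           # stand-in motor command
        fps=50,
        num_steps=5,
    )
    print("actions sent:", len(sent), "overruns:", overruns)
```

A non-zero overrun count is a hint that the policy is too slow for the chosen control rate on the current hardware.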

Training VLA models with LeRobot

VLA models (Pi0, SmolVLA) need more compute than ACT but give better results on complex tasks:

# Train SmolVLA (the lightest of the VLAs)
python lerobot/scripts/train.py \
  --policy.type=smolvla \
  --dataset.repo_id=lerobot/aloha_sim_insertion_human \
  --output_dir=outputs/smolvla_aloha

# Train Pi0-FAST
python lerobot/scripts/train.py \
  --policy.type=pi0fast \
  --dataset.repo_id=lerobot/aloha_sim_insertion_human \
  --output_dir=outputs/pi0fast_aloha

GPU requirements for VLAs:

SmolVLA: 4GB+ VRAM (can run on an RTX 3060)
Pi0/Pi0-FAST: 8GB+ VRAM (RTX 3070+)
GR00T N1.5: 16GB+ VRAM + CUDA (RTX 4080+)
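VRAM figures like these can be sanity-checked with weights-only arithmetic: parameter count times bytes per parameter. The sketch below gives only a lower bound, since activations, the CUDA context, and (for training) gradients and optimizer state come on top. The helper name and the dtype table are illustrative.

```python
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weight_memory_gib(num_params, dtype="bfloat16"):
    """Memory needed just to hold the model weights, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    # A 3B-parameter model, as listed for Pi0 in this article.
    for dtype in ("float32", "bfloat16"):
        print(dtype, round(weight_memory_gib(3_000_000_000, dtype), 2), "GiB")
```

For a 3B-parameter model this comes to roughly 5.6 GiB in bfloat16 and about twice that in float32, which is consistent with an 8GB card being the practical floor for inference.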

LeRobot v0.5.0 -- What's new?

1. Humanoid support (Unitree G1)

For the first time, LeRobot supports a full humanoid robot. You can:

Teleoperate the Unitree G1 through the LeRobot interface
Collect manipulation and locomotion data
Train VLA policies for whole-body control

2. 10x faster training

v0.5.0 optimizes the data-loading pipeline and the training loop, reaching 10 times the speed of v0.4.0 on the same hardware.

3. Isaac Lab Arena integration

Isaac Lab Arena is a framework for scalable policy evaluation in simulation. You can:

Run thousands of parallel environments
Evaluate policies on many tasks simultaneously
Generate synthetic training data

4. Consolidated SO-100/SO-101 codebase

SO-100 and SO-101 now share a single codebase, including bi-manual setups, which makes things simpler for users.

Cost of getting started

Setup	Cost	Requirements
Simulation only	$0	Laptop with a GPU (or Google Colab)
SO-100 arm	~$100	3D printer + servo motors + camera
SO-100 bi-manual	~$250	2x SO-100 + 2 cameras
Koch v1.1	~$300	Dynamixel servos + camera
GPU cloud	$0.50/hr	Lambda, Vast.ai for training

With $100 for an SO-100 and the free tier of Google Colab for training, you can start with robot learning without a large investment.
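If you rent a cloud GPU instead of using a free tier, the budget is the hardware price plus GPU hours times the hourly rate. A small sketch using the prices from the table; the 10 and 20 GPU hours are assumed figures for illustration only.

```python
def total_cost_usd(hardware_usd, gpu_hours, gpu_rate_usd_per_hr=0.50):
    """Hardware price plus rented GPU time, in US dollars."""
    return hardware_usd + gpu_hours * gpu_rate_usd_per_hr


if __name__ == "__main__":
    # SO-100 (~$100) plus an assumed 10 hours of cloud training at $0.50/hr.
    print(total_cost_usd(hardware_usd=100, gpu_hours=10))
    # Bi-manual SO-100 (~$250) plus an assumed 20 hours.
    print(total_cost_usd(hardware_usd=250, gpu_hours=20))
```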

Tips and best practices

1. Start with ACT, not a VLA

ACT is simpler, trains faster, and performs well on most manipulation tasks. Move to a VLA only when:

The task needs language conditioning ("pick up the red cup")
You need to generalize across many objects or scenes
You have enough GPU capacity for both training and inference

2. Data quality matters more than data quantity

50 high-quality episodes beat 500 poor ones. When recording data:

Keep a consistent speed while teleoperating
Keep the lighting stable, with no changes between episodes
Keep the camera angle fixed
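One cheap proxy for the "consistent speed" advice is episode length: episodes that are much longer or shorter than the rest often contain hesitations or aborted attempts. The sketch below flags such episodes for manual review. `flag_outlier_episodes` and the 50% tolerance are hypothetical choices, not a LeRobot feature.

```python
from statistics import median


def flag_outlier_episodes(episode_lengths, tolerance=0.5):
    """Return indices of episodes whose length deviates from the median by more than
    `tolerance` (as a fraction of the median)."""
    mid = median(episode_lengths)
    return [
        index
        for index, length in enumerate(episode_lengths)
        if abs(length - mid) > tolerance * mid
    ]


if __name__ == "__main__":
    lengths_in_frames = [300, 310, 295, 900, 305, 120]
    print("review episodes:", flag_outlier_episodes(lengths_in_frames))
```

Flagged episodes are candidates for a second look, not automatic deletion; a long episode can still be a good demonstration.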

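3. Use the Hugging Face Hub

Upload your dataset to the Hugging Face Hub to:

Share it with the community
Get version control for your data
Make results easy to reproduce

# Upload a dataset to the HF Hub (the repo type defaults to "model", so set it explicitly)
huggingface-cli upload ${HF_USER}/my_dataset ./outputs/my_dataset --repo-type dataset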

4. Evaluate in sim first

Use Isaac Lab Arena or a MuJoCo simulation to evaluate a policy before running it on a real robot. This saves time and avoids damaging the robot.
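Whatever simulator you use, the evaluation itself boils down to running many episodes and reporting the success rate. The sketch below shows that loop on a toy 1-D environment so it runs anywhere. `ToyReachEnv`, `success_rate`, and `proportional_policy` are hypothetical stand-ins for a real simulator and a trained policy.

```python
import random


class ToyReachEnv:
    """1-D stand-in for a simulator: succeed by driving the position to within 0.05 of zero."""

    def __init__(self, rng):
        self.rng = rng
        self.position = 0.0

    def reset(self):
        self.position = self.rng.uniform(-1.0, 1.0)
        return self.position

    def step(self, action):
        self.position += action
        success = abs(self.position) < 0.05
        return self.position, success


def success_rate(policy, num_episodes, max_steps, seed=0):
    """Fraction of episodes in which the policy succeeds within `max_steps` steps."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(num_episodes):
        env = ToyReachEnv(rng)
        observation = env.reset()
        for _ in range(max_steps):
            observation, success = env.step(policy(observation))
            if success:
                successes += 1
                break
    return successes / num_episodes


def proportional_policy(observation):
    """Moves halfway toward the target at every step."""
    return -0.5 * observation


if __name__ == "__main__":
    print("success rate:", success_rate(proportional_policy, num_episodes=20, max_steps=5))
```

Fixing the seed makes runs comparable across checkpoints, so a drop in success rate points at the policy rather than at different start states.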

Community and resources

GitHub: github.com/huggingface/lerobot -- source code, issues, contributions
HF Hub: huggingface.co/lerobot -- pretrained models and datasets
Discord: Hugging Face Discord, channel #lerobot
Docs: huggingface.co/docs/lerobot -- official documentation

Conclusion

LeRobot has lowered the barrier to entry for robot learning to an unprecedented level. With $100 of hardware and a free framework, anyone can start training AI for robots.

v0.5.0 takes LeRobot from a "toy project" to a production-grade framework, with humanoid support, VLA models, Isaac Lab integration, and a 10x speedup. If you want to get into robot learning, there has never been a better time.