
PEFT/LoRA Fine-tuning & VLA Deployment

Fine-tune large VLAs with LoRA on consumer GPUs, deploy to real robots with Real-Time Chunking — a production-ready workflow.

Nguyễn Anh Tuấn · April 11, 2026 · 11 min read

Introduction: From Research to Product

Throughout the VLA & LeRobot Mastery series, we have covered the entire journey: from understanding the framework, collecting data, training models like SmolVLA and Pi0-FAST, to real-robot RL with HIL-SERL. But everything has been in the "running on a dev machine, checking results in the terminal" stage.

This final post will close the series with a production-ready workflow — techniques for taking VLAs from lab to reality:

  1. PEFT/LoRA fine-tuning — training large models on consumer GPUs
  2. Real-Time Chunking — smooth deployment on real robots
  3. Async inference, streaming encoding — production performance optimization
  4. Plugin system & EnvHub — extending LeRobot for custom projects

Production deployment workflow

Part 1: PEFT/LoRA — Train Smarter, Not Harder

The Problem: VLAs Are Too Large to Fine-tune

Modern VLA models are substantial in size:

| Model | Parameters | Full Fine-tune VRAM | Notes |
|---|---|---|---|
| Pi0 | ~3B | ~24GB | Requires RTX 4090 or A100 |
| Pi0-FAST | ~3B | ~24GB | Similar to Pi0 |
| SmolVLA | ~500M | ~8GB | Smaller but still significant |

Full fine-tuning — updating all parameters — requires expensive GPUs and long training times. For most practical applications, you only need to adapt the model to a specific task, not rewrite everything it has already learned.

LoRA: A Brilliantly Simple Idea

LoRA (Low-Rank Adaptation) solves this problem with an elegant idea: instead of updating the large weight matrix W (size d x d), we add two small matrices A (d x r) and B (r x d), where r << d.

Output = W * x + (A * B) * x
         ^         ^
      Frozen     Trainable
                 (very small)

With rank r = 8 and d = 4096, the trainable parameters come to d × r + r × d = 65,536 per adapted matrix — versus d × d = 16,777,216 for the full matrix, i.e. about 0.4%.

LoRA is applied to attention layers (Q, K, V, O projections) — where most of the model's "knowledge" resides. Everything else is completely frozen.
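The idea fits in a few lines of PyTorch. This is an illustrative sketch, not LeRobot's actual implementation — the class name `LoRALinear`, the init scales, and the `alpha / rank` scaling convention are assumptions (the scaling follows the original LoRA paper):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer W plus a trainable low-rank update A @ B."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                        # W is frozen
        d_in, d_out = base.in_features, base.out_features
        self.A = nn.Parameter(torch.randn(d_in, rank) * 0.01)  # (d x r)
        self.B = nn.Parameter(torch.zeros(rank, d_out))        # (r x d), zero-init
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Output = W*x + (A*B)*x, scaled by alpha/rank
        return self.base(x) + (x @ self.A @ self.B) * self.scale
```

Zero-initializing B means the adapter starts as an exact no-op, so training begins from the pretrained model's behavior and drifts away only as the gradients demand.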

Enabling PEFT in LeRobot v0.5

LeRobot v0.5 integrates PEFT natively — just add a flag:

lerobot-train \
    --policy.type=pi0 \
    --policy.peft_config.use_peft=true \
    --dataset.repo_id=your-username/pickup-dataset \
    --policy.device=cuda

That is all. LeRobot automatically:

  1. Loads pretrained weights for Pi0
  2. Freezes the entire model
  3. Adds LoRA adapters to attention layers
  4. Trains only the LoRA parameters

Customizing the LoRA Config

For more fine-grained control:

lerobot-train \
    --policy.type=pi0 \
    --policy.peft_config.use_peft=true \
    --policy.peft_config.lora_rank=16 \
    --policy.peft_config.lora_alpha=32 \
    --policy.peft_config.target_modules="q_proj,v_proj,k_proj,o_proj" \
    --policy.peft_config.lora_dropout=0.05 \
    --dataset.repo_id=your-username/pickup-dataset \
    --training.batch_size=16 \
    --training.num_epochs=100 \
    --policy.device=cuda

Parameter explanations:

| Parameter | Default | Meaning |
|---|---|---|
| lora_rank | 8 | Rank of the LoRA matrices. Higher = more expressive but uses more VRAM |
| lora_alpha | 16 | Scaling factor. Typically set to 2x the rank |
| target_modules | q,v projections | Which layers get LoRA. Adding k,o improves results but costs more |
| lora_dropout | 0.0 | Regularization. 0.05 helps prevent overfitting on small datasets |

Comparison: PEFT vs Full Fine-tuning

Here are benchmark results on a pick-and-place task with Pi0:

| Metric | Full Fine-tune | LoRA (r=8) | LoRA (r=16) |
|---|---|---|---|
| VRAM usage | 24GB | 6GB | 8GB |
| Trainable params | 3B (100%) | 15M (0.5%) | 30M (1%) |
| Time per epoch | 45 min | 12 min | 15 min |
| Success rate | 92% | 89% | 91% |
| Minimum GPU | RTX 4090 | RTX 3060 | RTX 3070 |

The key takeaway: LoRA r=16 achieves a 91% success rate — only 1% below full fine-tuning while requiring 3x less VRAM and being 3x faster. This is the sweet spot for most applications.

PEFT for SmolVLA

SmolVLA is already small (~500M params), but PEFT still pays off when VRAM is tight — for example on edge devices:

lerobot-train \
    --policy.type=smolvla \
    --policy.peft_config.use_peft=true \
    --policy.peft_config.lora_rank=8 \
    --dataset.repo_id=your-username/dataset \
    --policy.device=cuda

SmolVLA + LoRA requires only ~3GB VRAM — it runs on a Jetson Orin Nano!

Part 2: Deploying to Real Robots — Real-Time Chunking

The Problem with Naive Deployment

When deploying a policy to a real robot in the simplest way:

while True:
    obs = robot.get_observation()
    action = policy.predict(obs)  # Inference takes 100-200ms
    robot.execute(action)

You encounter two problems:

  1. Latency: Each prediction takes 100-200ms, making the robot react slowly
  2. Jerky motion: Each action chunk (sequence of actions) starts "from scratch", causing rough transitions

Real-Time Chunking Solves Both

Real-Time Chunking (RTC) is a technique that continuously blends old predictions that are being executed with new predictions that were just computed.

Instead of:

Predict -> Execute all -> Predict -> Execute all -> ...

RTC does:

Predict chunk 1 -> Start executing
                    -> Predict chunk 2 (while executing chunk 1)
                    -> Blend chunk 1 remaining + chunk 2 start
                    -> Continue executing blended actions
                    -> Predict chunk 3...

Result: the robot moves continuously and smoothly, with no "jerks" between chunks.

Enabling RTC in LeRobot

lerobot-record \
    --policy.path=your-username/pi0-pickup-lora \
    --policy.rtc_config.enabled=true \
    --robot.type=so100_follower \
    --robot.port=/dev/ttyACM0 \
    --cameras.top.port=/dev/video0

Just --policy.rtc_config.enabled=true — LeRobot handles the rest.

RTC is compatible with any policy that predicts action chunks — including Pi0, Pi0-FAST, and SmolVLA.

How RTC Works Under the Hood

Time ->      t0    t1    t2    t3    t4    t5    t6
             |     |     |     |     |     |     |
Chunk 1:     [a1   a2    a3    a4    a5]
Chunk 2:           [b1   b2    b3    b4    b5]
Chunk 3:                 [c1   c2    c3    c4    c5]
             |     |     |     |     |     |     |
Executed:    a1   blend  blend blend blend ...
                  (a2,b1)(a3,b2,c1)

At each timestep, RTC takes a weighted average of all available predictions for that timestep. Newer predictions receive higher weights (because they are based on the most recent observation).

Blending formula:

action(t) = sum(w_i * chunk_i(t)) / sum(w_i)

Where w_i decreases for older chunks. This is essentially an exponential moving average over the action space.
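The blending formula can be sketched with NumPy. This is an illustration of the weighting scheme described above, not LeRobot's API — the function name `blend_actions` and the `decay` parameterization are my assumptions:

```python
import numpy as np

def blend_actions(predictions, ages, decay=0.5):
    """Weighted average of overlapping chunk predictions for one timestep.

    predictions: (n_chunks, action_dim) array of candidate actions for timestep t.
    ages: age of each chunk in control steps (0 = newest prediction).
    Newer chunks get exponentially higher weight: w_i = decay ** age_i.
    """
    predictions = np.asarray(predictions, dtype=float)
    w = decay ** np.asarray(ages, dtype=float)
    # action(t) = sum(w_i * chunk_i(t)) / sum(w_i)
    return (w[:, None] * predictions).sum(axis=0) / w.sum()
```

With `decay=0.5`, a chunk that is one step old contributes half the weight of the freshest chunk, which is exactly the exponential-moving-average behavior described above.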

Part 3: Production Performance Optimization

Async Inference for SmolVLA

SmolVLA supports asynchronous inference — separating image processing (vision encoder) and generation (action decoder) into a parallel pipeline.

In synchronous mode:

[Vision encode: 50ms] -> [Action decode: 100ms] -> Total: 150ms

In async mode:

Frame 1: [Vision encode: 50ms] -> [Action decode: 100ms]
Frame 2:           [Vision encode: 50ms] -> [Action decode: 100ms]
                                   ^
                        Runs in parallel with frame 1 decode

Result: 2x throughput, ~30% latency reduction.
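The two-stage overlap can be sketched with a thread and a bounded queue. This is a toy model of the pipelining pattern, not SmolVLA's actual async implementation; `pipelined_inference` and its arguments are made-up names:

```python
import queue
import threading

def pipelined_inference(frames, encode, decode):
    """Run encode(frame i+1) in a background thread while decode(feature i)
    runs in the main thread, overlapping the two stages."""
    q = queue.Queue(maxsize=1)  # hand off one encoded feature at a time

    def encoder():
        for frame in frames:
            q.put(encode(frame))
        q.put(None)  # sentinel: no more frames

    threading.Thread(target=encoder, daemon=True).start()
    actions = []
    while (feat := q.get()) is not None:
        actions.append(decode(feat))
    return actions
```

Because the encoder thread is already working on frame i+1 while the main thread decodes frame i, steady-state throughput is bounded by the slower stage (100ms) rather than the sum of both (150ms).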

Enable it:

lerobot-record \
    --policy.path=your-username/smolvla-model \
    --policy.async_inference=true \
    --policy.rtc_config.enabled=true \
    --robot.type=so100_follower

Streaming Video Encoding

LeRobot v0.5 adds streaming video encoding — encoding video continuously during data collection instead of waiting until the end of each episode.

Before v0.5:

Record episode (30s) -> Wait for encoding (10-15s) -> Next episode

With streaming encoding:

Record episode (30s) -> Immediately start next episode
                       ^
              Encoding runs in background, zero wait

This is especially important for HIL-SERL — where the robot needs continuous, uninterrupted data collection.

Streaming encoding is enabled by default in v0.5. No additional configuration needed.
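The pattern behind streaming encoding is a fire-and-forget background worker: the record loop hands frames off and returns immediately. A minimal sketch — `StreamingEncoder` is an invented name for illustration, not LeRobot's class:

```python
import queue
import threading

class StreamingEncoder:
    """Encodes frames on a background thread so the record loop never blocks."""

    def __init__(self, encode_fn):
        self.encode_fn = encode_fn
        self.queue = queue.Queue()
        self.encoded = []
        self.worker = threading.Thread(target=self._drain, daemon=True)
        self.worker.start()

    def _drain(self):
        # Consume frames until the sentinel arrives
        while (frame := self.queue.get()) is not None:
            self.encoded.append(self.encode_fn(frame))

    def add(self, frame):
        self.queue.put(frame)  # returns immediately; encoding happens later

    def finish(self):
        self.queue.put(None)   # sentinel stops the worker
        self.worker.join()
        return self.encoded
```

The key property is that `add()` costs one queue push regardless of how expensive encoding is, so the 10-15s post-episode wait disappears from the operator's critical path.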

Optimized deployment pipeline

Part 4: Complete Production Workflow

The 5-Step Workflow

Here is a production-ready process you can apply to any manipulation task:

Step 1: Collect Data

# Teleop with leader arm or gamepad
lerobot-record \
    --robot.type=so100_follower \
    --teleop.type=so100_leader \
    --dataset.repo_id=your-username/task-v1 \
    --fps=30 \
    --num_episodes=50

Step 2: Fine-tune with LoRA

# LoRA fine-tune Pi0 — only needs RTX 3060
lerobot-train \
    --policy.type=pi0 \
    --policy.peft_config.use_peft=true \
    --policy.peft_config.lora_rank=16 \
    --dataset.repo_id=your-username/task-v1 \
    --training.batch_size=8 \
    --training.num_epochs=100 \
    --training.save_freq=10 \
    --wandb.enable=true \
    --wandb.project=lerobot-production \
    --output_dir=./checkpoints/task-v1 \
    --policy.device=cuda

Step 3: Evaluate Offline

# Run evaluation on test episodes
lerobot-eval \
    --policy.path=./checkpoints/task-v1/best \
    --dataset.repo_id=your-username/task-v1-test \
    --output_dir=./eval_results

Check the reported metrics (e.g. action prediction error and success rate on held-out episodes) before going anywhere near the real robot.

Step 4: Deploy with RTC

# Deploy to real robot
lerobot-record \
    --policy.path=./checkpoints/task-v1/best \
    --policy.rtc_config.enabled=true \
    --robot.type=so100_follower \
    --robot.port=/dev/ttyACM0 \
    --cameras.top.port=/dev/video0

Step 5: Iterate

If the policy is not good enough on the real robot:

Option A: Collect more data on difficult scenarios and continue fine-tuning

# Collect 20 more episodes on difficult cases
lerobot-record \
    --dataset.repo_id=your-username/task-v1-hard-cases \
    --num_episodes=20

# Continue fine-tuning from previous checkpoint
lerobot-train \
    --policy.path=./checkpoints/task-v1/best \
    --policy.peft_config.use_peft=true \
    --dataset.repo_id=your-username/task-v1-hard-cases \
    --training.num_epochs=50

Option B: Use HIL-SERL for RL-based improvement

# Switch to RL fine-tuning
python -m lerobot.rl.learner --config_path rl_config.json
python -m lerobot.rl.actor --config_path rl_config.json

Versioning Models on HuggingFace Hub

LeRobot integrates deeply with HuggingFace Hub. After training, push your model:

# Push model to HuggingFace Hub
huggingface-cli upload your-username/pi0-pickup-v1 ./checkpoints/task-v1/best

# Anyone can use it immediately
lerobot-record \
    --policy.path=your-username/pi0-pickup-v1 \
    --policy.rtc_config.enabled=true

Each version is a repository on the Hub — you can track history, compare versions, and roll back when needed.

Monitoring with Weights & Biases

Track training in real-time:

lerobot-train \
    --wandb.enable=true \
    --wandb.project=lerobot-production \
    --wandb.name=pi0-pickup-lora-r16-v2

W&B logs your loss curves, learning rate schedule, and evaluation metrics as training runs, so you can catch divergence or overfitting early.

Part 5: Extending LeRobot — Plugins & EnvHub

3rd-Party Policy Plugins

LeRobot v0.5 introduces a plugin system — allowing you to register custom policies as pip packages.

For example, you develop a new policy called MyCustomPolicy. Instead of forking LeRobot, you create a separate pip package:

# my_policy_package/policy.py
from lerobot.common.policies.base import BasePolicy

class MyCustomPolicy(BasePolicy):
    def __init__(self, config):
        super().__init__(config)
        # Custom architecture here
    
    def forward(self, batch):
        # Custom forward pass
        pass
    
    def predict_action(self, observation):
        # Custom inference
        pass
# my_policy_package/setup.py (or pyproject.toml)
# Register entry point
entry_points={
    "lerobot.policies": [
        "my_custom=my_policy_package.policy:MyCustomPolicy"
    ]
}

After installing the package:

pip install my-policy-package

# Use it directly in LeRobot
lerobot-train --policy.type=my_custom --dataset.repo_id=...

The plugin system is powerful because you never have to fork LeRobot: custom policies live in their own packages, version independently, and install with a single pip command.

EnvHub: Load Simulation Environments from HuggingFace

EnvHub allows you to load gym environments directly from HuggingFace Hub — no separate installation needed:

# Load environment from Hub
lerobot-train \
    --env.type=hub \
    --env.repo_id=lerobot/simxarm \
    --policy.type=diffusion

LeRobot will:

  1. Download the environment package from the Hub
  2. Automatically install dependencies
  3. Initialize the environment
  4. Start training

This is an important step toward democratizing robot learning — anyone can create and share simulation environments, just like sharing datasets on HuggingFace.

A growing catalog of community environments is already available on the Hub.

Series Recap: From Zero to Production

Looking back at the entire VLA & LeRobot Mastery series, we have covered a complete journey:

| Phase | Post | Key Content |
|---|---|---|
| Foundation | Post 1: Framework | LeRobot architecture, dataset format, policy zoo |
| New features | Post 11: v0.5 Overview | SmolVLA, HIL-SERL, PEFT, RTC |
| Training | Post 12: SmolVLA | Training a compact VLA |
| Real-world RL | Post 14: HIL-SERL | RL on real robots with human interventions |
| Production | Post 15 (this post) | LoRA, deployment, optimization, plugins |

Roadmap Ahead

LeRobot is evolving rapidly. Here is what to look forward to:

  1. Multi-task training — one policy for multiple tasks, switching via language commands
  2. Humanoid support — expanding from single arm to whole-body control
  3. Better sim-to-real — stronger transfer learning from simulation to real robots
  4. Larger pretrained models — foundation models for robotics, similar to GPT for NLP
  5. Edge deployment — running on Jetson, Raspberry Pi, FPGA

Final Advice

If you are just starting with LeRobot, here is the path I recommend:

  1. Week 1: Read post 1, set up LeRobot, run basic tutorials
  2. Week 2: Collect your first dataset with SO-100, train SmolVLA
  3. Week 3: Fine-tune Pi0 with LoRA, deploy with RTC
  4. Week 4: Try HIL-SERL if the policy is not good enough
  5. Week 5+: Optimize, add tasks, share on the Hub

Robot learning is in its "ChatGPT moment" — powerful tools are becoming accessible to everyone. LeRobot v0.5 is a major step forward in that journey.

Start building. Your robot is waiting.

