Collecting Data for OpenArm: From Unboxing to 50 Box-Grasping Episodes
You already understand the theory behind SimpleVLA-RL — a framework that uses Reinforcement Learning to improve VLA models in simulation. You also know why OpenArm is the ideal hardware platform for manipulation experiments. Now it is time for the most critical hands-on step: data collection.
In robotics, data is fuel. Even the most powerful AI model is useless without quality data to learn from. This post walks you through every step — from plugging in CAN bus wires to having 50 box-grasping episodes ready for upload to HuggingFace Hub. If you are a beginner, do not worry — everything is explained in detail, with practical tips that only come from real-world experience.
Hardware Checklist: What Do You Need?
Before starting, verify you have all the required equipment:
| Equipment | Description | Estimated Price |
|---|---|---|
| OpenArm follower | Robot arm that performs the task — 6 DOF with Damiao motors | ~$3,500 (single arm) |
| OpenArm leader | Control arm for teleoperation — same configuration | Included in bimanual kit (~$6,500) |
| CAN bus USB adapter | Connects robot to computer via CAN protocol | ~$20-50 (CANable, PCAN-USB) |
| Linux PC with GPU | Ubuntu 22.04+, NVIDIA GPU (minimum RTX 3060) | Your existing machine |
| USB camera | Standard webcam, 640x480 or higher, 30fps | ~$30-100 (Logitech C920+) |
| Carton boxes | Various sizes (small, medium, large) | Free — use shipping boxes |
| Workbench | Flat, stable surface with enough room for robot operation | Your existing desk |
Important note: If you only have one arm (no leader), you can still collect data using gamepad or keyboard teleoperation. However, leader-follower teleoperation produces significantly better data quality because the motions are more natural and fluid.
Step 1: CAN Bus Setup — Connecting Robot to Computer
CAN bus (Controller Area Network) is the communication protocol OpenArm uses between the computer and Damiao motors. If you have worked with automotive or industrial systems, you are already familiar with CAN. If not — think of it as a "data highway" connecting the computer to each robot joint.
Wiring
- Connect CAN USB adapter #1 to a USB port, then wire CAN-H and CAN-L to the follower arm
- Connect CAN USB adapter #2 to a different USB port, then wire CAN to the leader arm
- Ensure there is a 120 ohm termination resistor at the end of each bus (many adapters have one built in — check the datasheet)
Interface Configuration
LeRobot provides a built-in CAN setup script. Run these commands:
# Setup CAN interfaces (can0 for follower, can1 for leader)
lerobot-setup-can --mode=setup --interfaces=can0,can1
This command will:
- Load the `gs_usb` (or `can_raw`) kernel module
- Set the bitrate appropriate for Damiao motors (typically 1 Mbps)
- Bring up the `can0` and `can1` interfaces
Verify Connection
# Test whether CAN bus is working
lerobot-setup-can --mode=test --interfaces=can0,can1
If successful, you will see output like:
[OK] can0: Detected 6 motors (Damiao DM-J4340-2EC)
[OK] can1: Detected 6 motors (Damiao DM-J4340-2EC)
All interfaces ready!
Common troubleshooting:
- "No device found": Check USB cables, try a different USB port
- "Permission denied": Add your user to the
dialoutgroup:sudo usermod -aG dialout $USER, then logout and log back in - "Timeout waiting for motors": Check robot power supply — motors must be powered on before CAN communication works
- Only detecting 5/6 motors: Check the CAN bus connector at the missing joint — the wire may be loose
Step 2: Install LeRobot with Damiao Motor Support
LeRobot is HuggingFace's open-source framework for robot manipulation. The latest version (v0.5+) supports OpenArm and Damiao motors natively.
# Install LeRobot from PyPI
pip install lerobot
# Install the Damiao motor driver (optional extra)
pip install "lerobot[damiao]"
Why `[damiao]`? Damiao motors use a proprietary CAN protocol different from Dynamixel (used on the SO-100/Koch). LeRobot separates motor drivers into optional dependencies to keep the main package lightweight. The `[damiao]` extra installs the pydamiao library for CAN bus communication with the motors.
Verify successful installation:
python -c "import lerobot; print(lerobot.__version__)"
# Output: 0.5.x or newer
python -c "from lerobot.common.robots.openarm import OpenArmFollower; print('OpenArm OK')"
# Output: OpenArm OK
Step 3: Calibrate Follower and Leader
Calibration is an extremely important step that many people skip or rush through. Calibration determines the "zero" position and movement limits of each joint. If calibration is wrong, the robot will move differently from what you intend — and an AI model trained on bad data will produce terrible results.
Calibrate the Follower (Task Arm)
lerobot-calibrate \
--robot.type=openarm_follower \
--robot.port=can0 \
--robot.side=right \
--robot.id=my_follower
The script will ask you to move the robot to reference positions. For OpenArm, the typical procedure is:
- Home position: Move the robot to a vertical pose, all joints at 0 degrees
- Joint limits: Move each joint to its min/max limits
- Gripper calibration: Open fully, then close fully
Calibration results are saved to ~/.cache/lerobot/calibration/my_follower.json. This file contains angle offsets and joint limits — keep this file and you will not need to recalibrate unless you physically disassemble the robot.
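Conceptually, each joint's calibration boils down to a zero offset plus min/max limits that every raw reading is corrected against. The exact schema of the JSON file depends on your LeRobot version, so the sketch below uses a hypothetical structure and made-up values purely to illustrate the idea:

```python
# Hypothetical calibration record: per-joint zero offset and limits, in radians.
# The real my_follower.json schema may differ by LeRobot version.
CALIB = {
    "shoulder_pan": {"offset": 0.012, "min": -1.571, "max": 1.571},
    "elbow_flex": {"offset": -0.034, "min": -2.094, "max": 2.094},
}

def apply_calibration(joint: str, raw_angle: float) -> float:
    """Subtract the joint's zero offset, then clamp to its calibrated limits."""
    c = CALIB[joint]
    corrected = raw_angle - c["offset"]
    return max(c["min"], min(c["max"], corrected))

# A raw reading past the limit gets clamped rather than passed to the policy.
print(apply_calibration("shoulder_pan", 2.0))
```

This is why a bad offset is so damaging: every state and action in your dataset inherits the same systematic error.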
Calibrate the Leader (Control Arm)
lerobot-calibrate \
--teleop.type=openarm_leader \
--teleop.port=can1 \
--teleop.id=my_leader
The process is similar to the follower. The key difference: the leader typically has gravity compensation disabled (motors do not hold joints) so the human operator can move freely and directly feel the required forces.
Calibration tips from real-world experience:
- Calibrate on a flat, fixed surface — if the table vibrates, results will be noisy
- Use a protractor or digital level for more accurate home positioning
- Calibrate both arms in the same session — ensures consistent coordinate systems
- After calibration, test by moving the leader. The follower should mirror exactly. If there is a small offset, recalibrate
Step 4: Test Teleoperation — Practice Before Recording
Teleoperation is the process where you (the human) control the follower through the leader. This is your "practice run" before officially recording data.
lerobot-teleoperate \
--robot.type=openarm_follower \
--robot.port=can0 \
--robot.side=right \
--robot.id=my_follower \
--teleop.type=openarm_leader \
--teleop.port=can1 \
--teleop.id=my_leader
When you run this command, the follower will mirror every movement of the leader in real time. You move the leader by hand, and the follower follows.
Why Practice Matters
Collecting data for robot learning is not like recording a video. Each episode must be:
- Consistent: Same task, performed with a similar pattern (not wildly different between episodes)
- Smooth: Fluid motion without jerky movements (jittery motion creates action noise)
- Complete: Start from a ready position, perform the task, end at a success position
Spend 15-20 minutes practicing the box-grasping motion before recording:
- Approach the box from above (top grasp) — easiest for parallel grippers
- Approach the box from the side (side grasp) — harder but more versatile
- Try different box sizes
- Practice: open gripper, lower down, close gripper, lift up, hold steady
Important tip: Keep movement speed moderate. Too fast and the follower cannot keep up (latency ~20-50ms at 30fps). Too slow and episodes become long, wasting disk space and training time.
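If your hands are shaky, one generic trick (not a LeRobot feature, just a common teleoperation pattern) is to low-pass the leader's joint commands before they reach the follower. A minimal exponential moving-average smoother looks like this:

```python
class ActionSmoother:
    """Exponential moving average over joint commands.

    alpha close to 1.0 means little smoothing (low lag); small alpha
    means heavy smoothing (more lag). alpha=1.0 disables it entirely.
    """

    def __init__(self, alpha: float = 0.3):
        self.alpha = alpha
        self._state = None

    def step(self, action: list[float]) -> list[float]:
        if self._state is None:
            self._state = list(action)  # first command passes through unchanged
        else:
            self._state = [
                self.alpha * a + (1 - self.alpha) * s
                for a, s in zip(action, self._state)
            ]
        return list(self._state)

smoother = ActionSmoother(alpha=0.5)
print(smoother.step([0.0, 0.0]))  # first command: unchanged
print(smoother.step([1.0, 1.0]))  # sudden jump is halved
```

Note the trade-off: smoothing adds lag on top of the existing 20-50 ms teleoperation latency, so keep alpha fairly high and rely mainly on practicing fluid motions.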
Step 5: Record the Dataset — 50 Box-Grasping Episodes
This is the main event. LeRobot provides the lerobot-record script that automates the entire process: recording joint positions plus camera frames and packaging them into a standardized dataset.
lerobot-record \
--robot.type=openarm_follower \
--robot.port=can0 \
--robot.side=right \
--robot.id=my_follower \
--robot.cameras="{ top: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
--teleop.type=openarm_leader \
--teleop.port=can1 \
--teleop.id=my_leader \
--dataset.repo_id=username/openarm-box-grasp \
--dataset.single_task="Grasp the carton box and lift it" \
--dataset.num_episodes=50 \
--dataset.fps=30
Let us break down each parameter:
- `--robot.cameras`: Camera configuration. `top` is the camera name (you can choose any name). `index_or_path: 0` is the default webcam. 640x480 at 30fps is sufficient for most tasks.
- `--dataset.repo_id`: Dataset name on HuggingFace Hub. Format: `username/dataset-name`. Replace `username` with your HuggingFace username.
- `--dataset.single_task`: A brief English description of the task. This is the language instruction the VLA model will receive during inference.
- `--dataset.num_episodes`: Number of episodes to collect. 50 is the minimum for fine-tuning — more (100-200) will produce better results.
- `--dataset.fps`: Sampling frequency. 30fps means 30 (state, action, image) tuples recorded per second.
Workflow for Each Episode
When running lerobot-record, the terminal displays a control interface:
1. Press Enter to start a new episode
2. Perform the task: use the leader to control the follower, grasping a carton box and lifting it
3. Press S to end the episode (success) or F to mark it as failed (will be discarded)
4. Reset the box to a new position, then return to step 1
Each episode typically lasts 5-15 seconds (150-450 frames at 30fps). Total time for 50 episodes: approximately 30-45 minutes including reset time.
Smart Allocation of 50 Episodes
Do not grasp the same box in the same position 50 times — the model will overfit. Instead, diversify:
| Variation | Episodes | Details |
|---|---|---|
| Small box (10-15cm) | 15 | 5 positions x 3 approach angles |
| Medium box (15-25cm) | 20 | 5 positions x 4 approach angles |
| Large box (25-35cm) | 10 | 5 positions x 2 approach angles |
| Failure recovery | 5 | Intentionally miss, then correct and grasp successfully |
Why include recovery episodes? In practice, the robot will sometimes miss a grasp. If training data only contains "perfect grasps," the model will not know how to handle a miss — leading to failure cascades. Having 5-10% recovery episodes teaches the model self-correction.
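The allocation table above can be expanded into a shuffled recording plan, so you interleave box sizes instead of doing all 15 small-box episodes back to back (a hypothetical helper, not part of LeRobot):

```python
import random

# Allocation from the table above: (variation label, episode count).
ALLOCATION = [
    ("small box", 15),
    ("medium box", 20),
    ("large box", 10),
    ("failure recovery", 5),
]

def make_plan(allocation, seed=0):
    """Expand the allocation into one label per episode and shuffle the order."""
    plan = [label for label, count in allocation for _ in range(count)]
    random.Random(seed).shuffle(plan)  # fixed seed so the plan is reproducible
    return plan

plan = make_plan(ALLOCATION)
print(len(plan), "episodes, first five:", plan[:5])
```

Print the plan before a session and tick episodes off as you record; it also doubles as a starting point for the metadata notes discussed below.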
Data Quality Tips — What Makes the Difference
Data collection is an art, not just science. Here are tips from real-world experience that documentation rarely mentions:
1. Consistent Lighting
The camera is the robot's "eyes." If lighting changes between episodes (clouds blocking sun, lights turning on/off), the model will be confused. Solutions:
- Use fixed LED lights instead of natural light
- Avoid shadows on the workspace — use diffused lighting
- If using natural light, collect data at the same time of day consistently
2. Clean Background
The workspace should be simple, with minimal distracting objects. The robot is learning to grasp carton boxes — if the table also has water bottles, phones, books... the model may attend to irrelevant objects. Keep only carton boxes plus the table surface.
3. Fixed Camera Angle
Do not move the camera between episodes. Mount it securely or use a tripod. If the camera shifts even a few millimeters, pixel-level features change and the model needs more data to generalize.
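A crude way to detect camera drift between sessions is to keep a reference snapshot of the empty workspace and compare each new session's first frame against it (a sketch with synthetic data; any alert threshold you pick is arbitrary and scene-dependent):

```python
import numpy as np

def camera_shift_score(ref: np.ndarray, frame: np.ndarray) -> float:
    """Mean absolute per-pixel difference; a sudden jump suggests the camera moved."""
    # Cast to float32 first so uint8 subtraction cannot wrap around.
    return float(np.abs(ref.astype(np.float32) - frame.astype(np.float32)).mean())

# Synthetic demo: an identical frame scores 0, a shifted copy scores higher.
ref = np.tile(np.arange(640, dtype=np.uint8), (480, 1))
print(camera_shift_score(ref, ref))
shifted = np.roll(ref, 5, axis=1)  # simulate a 5-pixel horizontal camera shift
print("shift score:", camera_shift_score(ref, shifted))
```

In practice, capture the reference frame right after mounting the camera and re-check it whenever you resume recording.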
4. Record Metadata Notes
Create a simple notes file:
Episodes 1-15: Small red box, center to right positions
Episodes 16-35: Medium brown box, left to center positions
Episodes 36-45: Large white box, random positions
Episodes 46-50: Recovery (intentional miss → correct → grasp)
This metadata is invaluable when debugging model performance later — knowing which episodes contain which variations helps analyze failure modes.
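A few lines of Python can turn those notes into an episode-to-description lookup for failure analysis later (a sketch that assumes the exact `Episodes N-M: description` format shown above):

```python
import re

NOTES = """\
Episodes 1-15: Small red box, center to right positions
Episodes 16-35: Medium brown box, left to center positions
Episodes 36-45: Large white box, random positions
Episodes 46-50: Recovery (intentional miss -> correct -> grasp)
"""

def parse_notes(text):
    """Map each episode number to its free-text description."""
    lookup = {}
    for match in re.finditer(r"Episodes (\d+)-(\d+): (.+)", text):
        start, end, desc = int(match[1]), int(match[2]), match[3]
        for ep in range(start, end + 1):
            lookup[ep] = desc
    return lookup

notes = parse_notes(NOTES)
print(notes[40])
```

When a trained policy fails mostly on, say, episodes 36-45, this lookup immediately tells you the problem is the large white box, not grasping in general.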
Using Community Data from HuggingFace Hub
One of the biggest advantages of LeRobot is the community data ecosystem. You do not necessarily have to collect everything from scratch.
Browse Available Datasets
Visit HuggingFace Hub and search with the keyword lerobot. You will find hundreds of community datasets:
- `lerobot/so100_pick_place` — pick-and-place dataset with the SO-100 arm
- `lerobot/koch_push_cube` — cube pushing with the Koch arm
- And many more for diverse tasks
Cross-Embodiment Transfer Learning
This is the strength of VLA models like SmolVLA. SmolVLA is pretrained on data from multiple robot types (SO-100, Koch, Franka) — it has already learned general manipulation priors: how to approach objects, how to open/close grippers, how to lift objects stably.
When you fine-tune SmolVLA on 50 OpenArm episodes, it does not have to learn from scratch. Instead:
- Pretrained knowledge (from thousands of episodes on SO-100/Koch) provides general manipulation "sense"
- Fine-tune data (your 50 OpenArm episodes) teaches the model how to adapt to OpenArm's specific kinematics
- Result: high data efficiency — 50 episodes are enough to reach 70-85% success rate
In contrast, if you train ACT from scratch (no pretrained model), 50 episodes only yield about 40-60% success rate. You would need 200-500 episodes to match the same performance.
Download and Explore Community Datasets
# Download a community dataset
python -c "
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset
ds = LeRobotDataset('lerobot/so100_pick_place')
print(f'Dataset: {len(ds)} frames, {ds.num_episodes} episodes')
print(f'Tasks: {ds.tasks}')
"
You can use community datasets to pretrain and then fine-tune on your own OpenArm data — this approach is called staged training and typically produces better results than direct training.
Data Format: LeRobotDataset
LeRobot stores data in the standardized LeRobotDataset format, designed and optimized for robot learning:
Directory Structure
openarm-box-grasp/
├── meta/
│ ├── info.json # Metadata: fps, robot type, task description
│ ├── episodes.jsonl # Per-episode info (timestamps, success/fail)
│ └── tasks.jsonl # List of task descriptions
├── data/
│ ├── chunk-000/
│ │ ├── episode_000000.parquet # Joint states + actions
│ │ ├── episode_000001.parquet
│ │ └── ...
├── videos/
│ ├── chunk-000/
│ │ ├── top/
│ │ │ ├── episode_000000.mp4 # Camera frames
│ │ │ ├── episode_000001.mp4
│ │ │ └── ...
What Do the Parquet Files Contain?
Each Parquet file contains a table with these columns:
| Column | Description | Shape |
|---|---|---|
| `state` | Current joint positions | (6,) for 6-DOF |
| `action` | Target joint positions (command) | (6,) |
| `timestamp` | Time within the episode | scalar |
| `episode_index` | Episode index | scalar |
| `frame_index` | Frame index within episode | scalar |
| `task_index` | Task index | scalar |
Automatic Upload to HuggingFace Hub
After collection finishes, lerobot-record automatically uploads the dataset to HuggingFace Hub. You need to log in first:
huggingface-cli login
# Enter your token from https://huggingface.co/settings/tokens
The dataset will be available at: https://huggingface.co/datasets/username/openarm-box-grasp
You can visualize the dataset directly on the Hub — watch episode videos, check action distributions, and spot outliers.
Dataset Quality Check
After collection, verify quality before training:
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset
import numpy as np
ds = LeRobotDataset("username/openarm-box-grasp")
print(f"Total episodes: {ds.num_episodes}")
print(f"Total frames: {len(ds)}")
print(f"Average episode length: {len(ds) / ds.num_episodes:.0f} frames")
# Check action range
actions = np.array([ds[i]["action"].numpy() for i in range(len(ds))])
print(f"Action range: [{actions.min():.3f}, {actions.max():.3f}]")
print(f"Action std: {actions.std(axis=0)}")
# Check for episodes that are too short or too long
# (episode_data_index maps each episode to its [from, to) frame range)
ep_from = ds.episode_data_index["from"]
ep_to = ds.episode_data_index["to"]
for ep_idx in range(ds.num_episodes):
    ep_len = int(ep_to[ep_idx] - ep_from[ep_idx])
    if ep_len < 30:  # < 1 second at 30fps
        print(f"WARNING: Episode {ep_idx} too short ({ep_len} frames)")
    if ep_len > 900:  # > 30 seconds
        print(f"WARNING: Episode {ep_idx} too long ({ep_len} frames)")
Red flags to watch for:
- Episode too short (<1s): You may have pressed end too early
- Episode too long (>30s): You may have forgotten to press end or moved too slowly
- Action std near 0 for a joint: That joint did not move — check calibration
- Action range outside limits: Calibration error
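The "action std near 0" red flag is easy to automate. A small helper (plain NumPy, not a LeRobot API) flags joints whose commands barely change across the whole dataset, demonstrated here on synthetic data:

```python
import numpy as np

def find_frozen_joints(actions: np.ndarray, tol: float = 1e-4) -> list[int]:
    """Return indices of joints whose action std is below tol.

    A frozen joint usually means a calibration problem or a joint
    that simply never moved during teleoperation.
    """
    stds = actions.std(axis=0)
    return [i for i, s in enumerate(stds) if s < tol]

# Synthetic demo: 1500 frames, 6 joints, joint 3 stuck at a constant command.
rng = np.random.default_rng(0)
actions = rng.normal(size=(1500, 6))
actions[:, 3] = 0.25
print("frozen joints:", find_frozen_joints(actions))
```

Run this on the `actions` array from the quality-check script above; any joint it reports deserves a recalibration before you spend GPU hours on training.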
Summary and Next Steps
You have completed the most important part of the pipeline: collecting 50 box-grasping episodes on OpenArm. Here is a recap of what was accomplished:
- Set up CAN bus connecting the robot to the computer
- Installed LeRobot with Damiao motor support
- Calibrated both follower and leader arms
- Practiced teleoperation
- Recorded 50 diverse episodes (varying sizes, positions, approach angles)
- Verified dataset quality
In the next post — SimpleVLA-RL (8): Training & Deploying on OpenArm — we will use this dataset to train 3 different policies (ACT, SmolVLA, Pi0-FAST), deploy on the real robot, and improve performance with HIL-SERL. That is when the data you just collected truly shows its value.
If you want a deeper understanding of the VLA training process, read part 3 of this series. And if you are interested in how HIL-SERL improves policies on real robots, that post provides a comprehensive overview.
Related Posts
- SimpleVLA-RL (1): Framework Overview — Understand the complete pipeline from SFT to RL on VLA models
- SmolVLA Training with LeRobot — Detailed guide to fine-tuning SmolVLA 450M on robot data
- HIL-SERL: RL on Real Robots — Improving policies with human-in-the-loop reinforcement learning