
SimpleVLA-RL (7): Collecting Data for OpenArm

Step-by-step guide to setting up OpenArm, calibrating, teleoperating, and collecting 50 box-grasping episodes with LeRobot.

Nguyễn Anh Tuấn · April 11, 2026 · 14 min read

Collecting Data for OpenArm: From Unboxing to 50 Box-Grasping Episodes

You already understand the theory behind SimpleVLA-RL — a framework that uses Reinforcement Learning to improve VLA models in simulation. You also know why OpenArm is the ideal hardware platform for manipulation experiments. Now it is time for the most critical hands-on step: data collection.

In robotics, data is fuel. Even the most powerful AI model is useless without quality data to learn from. This post walks you through every step — from plugging in CAN bus wires to having 50 box-grasping episodes ready for upload to HuggingFace Hub. If you are a beginner, do not worry — everything is explained in detail, with practical tips that only come from real-world experience.

Hardware Checklist: What Do You Need?

Before starting, verify you have all the required equipment:

| Equipment | Description | Estimated Price |
| --- | --- | --- |
| OpenArm follower | Robot arm that performs the task — 6 DOF with Damiao motors | ~$3,500 (single arm) |
| OpenArm leader | Control arm for teleoperation — same configuration | Included in bimanual kit (~$6,500) |
| CAN bus USB adapter | Connects robot to computer via CAN protocol | ~$20-50 (CANable, PCAN-USB) |
| Linux PC with GPU | Ubuntu 22.04+, NVIDIA GPU (minimum RTX 3060) | Your existing machine |
| USB camera | Standard webcam, 640x480 or higher, 30fps | ~$30-100 (Logitech C920+) |
| Carton boxes | Various sizes (small, medium, large) | Free — use shipping boxes |
| Workbench | Flat, stable surface with enough room for robot operation | Your existing desk |

Important note: If you only have one arm (no leader), you can still collect data using gamepad or keyboard teleoperation. However, leader-follower teleoperation produces significantly better data quality because the motions are more natural and fluid.

OpenArm robot arm setup on workbench

Step 1: CAN Bus Setup — Connecting Robot to Computer

CAN bus (Controller Area Network) is the communication protocol OpenArm uses between the computer and Damiao motors. If you have worked with automotive or industrial systems, you are already familiar with CAN. If not — think of it as a "data highway" connecting the computer to each robot joint.

Wiring

  1. Connect CAN USB adapter #1 to a USB port, then wire CAN-H and CAN-L to the follower arm
  2. Connect CAN USB adapter #2 to a different USB port, then wire CAN to the leader arm
  3. Ensure there is a 120 ohm termination resistor at the end of each bus (many adapters have one built in — check the datasheet)
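To build intuition for why a single CAN bus comfortably carries six motors, here is a rough load estimate. All figures are assumptions for illustration (a 1 Mbps bitrate, roughly 130 bits per classic CAN frame with an 8-byte payload, and one command plus one status frame per motor per control tick); check your motor and adapter datasheets for real numbers.

```python
# Rough CAN bus load estimate for one 6-motor arm.
# All constants are assumptions for illustration, not measured values.
BITRATE = 1_000_000   # bits/s (assumed typical for Damiao-class motors)
FRAME_BITS = 130      # approx. worst-case classic CAN frame, 8-byte payload
MOTORS = 6

for control_hz in (100, 500):
    # One command frame out + one status frame back per motor per tick.
    frames_per_s = MOTORS * 2 * control_hz
    load = frames_per_s * FRAME_BITS / BITRATE
    print(f"{control_hz} Hz control loop -> ~{load:.0%} bus load")
```

Even at a 500 Hz control loop the bus stays under capacity, which is why one adapter per arm is enough.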

Interface Configuration

LeRobot provides a built-in CAN setup script. Run these commands:

# Setup CAN interfaces (can0 for follower, can1 for leader)
lerobot-setup-can --mode=setup --interfaces=can0,can1

This command brings up both interfaces and configures them for the motors (typically setting the CAN bitrate and enabling the SocketCAN links for can0 and can1).

Verify Connection

# Test whether CAN bus is working
lerobot-setup-can --mode=test --interfaces=can0,can1

If successful, you will see output like:

[OK] can0: Detected 6 motors (Damiao DM-J4340-2EC)
[OK] can1: Detected 6 motors (Damiao DM-J4340-2EC)
All interfaces ready!

Common troubleshooting:

  - No motors detected: check the CAN-H/CAN-L wiring and the 120 ohm termination resistor
  - Interface not found: confirm the adapter enumerates (ip link show should list can0/can1)
  - Intermittent errors or dropped frames: check for loose connectors and verify the bitrate matches the motor configuration

Step 2: Install LeRobot with Damiao Motor Support

LeRobot is HuggingFace's open-source framework for robot manipulation. The latest version (v0.5+) supports OpenArm and Damiao motors natively.

# Install LeRobot from PyPI
pip install lerobot

# Install Damiao motor driver support (the extra on top of the PyPI package;
# the editable form -e ".[damiao]" is only for a source checkout)
pip install "lerobot[damiao]"

Why .[damiao]? Damiao motors use a proprietary CAN protocol different from Dynamixel (used on SO-100/Koch). LeRobot separates motor drivers into optional dependencies to keep the main package lightweight. The [damiao] flag installs the pydamiao library for CAN bus communication with the motors.

Verify successful installation:

python -c "import lerobot; print(lerobot.__version__)"
# Output: 0.5.x or newer

python -c "from lerobot.common.robots.openarm import OpenArmFollower; print('OpenArm OK')"
# Output: OpenArm OK

Step 3: Calibrate Follower and Leader

Calibration is an extremely important step that many people skip or rush through. Calibration determines the "zero" position and movement limits of each joint. If calibration is wrong, the robot will move differently from what you intend — and an AI model trained on bad data will produce terrible results.

Calibrate the Follower (Task Arm)

lerobot-calibrate \
  --robot.type=openarm_follower \
  --robot.port=can0 \
  --robot.side=right \
  --robot.id=my_follower

The script will ask you to move the robot to reference positions. For OpenArm, the typical procedure is:

  1. Home position: Move the robot to a vertical pose, all joints at 0 degrees
  2. Joint limits: Move each joint to its min/max limits
  3. Gripper calibration: Open fully, then close fully

Calibration results are saved to ~/.cache/lerobot/calibration/my_follower.json. This file contains angle offsets and joint limits — keep this file and you will not need to recalibrate unless you physically disassemble the robot.
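Since bad calibration silently poisons every episode you record, it can be worth sanity-checking the saved file before collecting data. The field names below (joints, offset, min, max) are hypothetical, since the real schema of the LeRobot calibration JSON may differ; adapt the keys to what your file actually contains.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical calibration structure -- the real file's schema may differ.
calib = {
    "joints": {
        f"joint_{i}": {"offset": 0.0, "min": -3.14, "max": 3.14}
        for i in range(6)
    }
}

def check_calibration(path: Path) -> list[str]:
    """Return a list of warnings for suspicious calibration entries."""
    data = json.loads(path.read_text())
    warnings = []
    for name, j in data["joints"].items():
        if j["min"] >= j["max"]:
            warnings.append(f"{name}: min >= max")
        if not (j["min"] <= j["offset"] <= j["max"]):
            warnings.append(f"{name}: offset outside limits")
    return warnings

with tempfile.TemporaryDirectory() as d:
    path = Path(d) / "my_follower.json"
    path.write_text(json.dumps(calib))
    print(check_calibration(path))  # [] -> no warnings
```

A joint whose offset falls outside its own limits, or whose limits are inverted, is a strong sign the calibration procedure went wrong and should be redone.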

Calibrate the Leader (Control Arm)

lerobot-calibrate \
  --teleop.type=openarm_leader \
  --teleop.port=can1 \
  --teleop.id=my_leader

The process is similar to the follower. The key difference: the leader's motors are typically passive (they do not hold the joints in position), so the human operator can move the arm freely and directly feel the required forces.

Robot joint calibration and testing process

Calibration tips from real-world experience:

  - Move each joint slowly and smoothly when recording limits; a jerky motion can register a wrong extreme
  - Calibrate with the gripper empty and the arm free of cables that restrict motion
  - Back up the calibration JSON in ~/.cache/lerobot/calibration/; you only need to recalibrate after physical disassembly

Step 4: Test Teleoperation — Practice Before Recording

Teleoperation is the process where you (the human) control the follower through the leader. This is your "practice run" before officially recording data.

lerobot-teleoperate \
  --robot.type=openarm_follower \
  --robot.port=can0 \
  --robot.side=right \
  --robot.id=my_follower \
  --teleop.type=openarm_leader \
  --teleop.port=can1 \
  --teleop.id=my_leader

When you run this command, the follower will mirror every movement of the leader in real time. You move the leader by hand, and the follower follows.

Why Practice Matters

Collecting data for robot learning is not like recording a video. Each episode must be:

  - Complete: it starts from a rest pose and ends only after the box is grasped and lifted stably
  - Smooth: no jerky corrections or long pauses mid-motion
  - Consistent: the same overall strategy across episodes, so the model sees a coherent behavior to imitate
Spend 15-20 minutes practicing the box-grasping motion before recording:

Important tip: Keep movement speed moderate. Too fast and the follower cannot keep up (latency ~20-50ms at 30fps). Too slow and episodes become long, wasting disk space and training time.
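A rough calculation makes this speed trade-off concrete. The 20-50 ms latency figure comes from the text above; the end-effector speeds and the linear trailing model are my own illustrative assumptions.

```python
# At 30 fps, each frame covers ~33 ms, so a 20-50 ms control latency
# means the follower lags the leader by roughly 1-1.5 frames.
FPS = 30
frame_ms = 1000 / FPS

for latency_ms in (20, 50):
    lag_frames = latency_ms / frame_ms
    # If the leader's end-effector moves at `speed` m/s, the follower
    # trails by latency * speed meters (illustrative, not measured).
    for speed in (0.1, 0.5):  # m/s: moderate vs. fast
        trail_cm = latency_ms / 1000 * speed * 100
        print(f"{latency_ms} ms at {speed} m/s -> "
              f"lag {lag_frames:.1f} frames, trail {trail_cm:.1f} cm")
```

At moderate speed the follower trails by millimeters; at fast speed and high latency the gap grows to centimeters, enough to miss a grasp.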

Step 5: Record the Dataset — 50 Box-Grasping Episodes

This is the main event. LeRobot provides the lerobot-record script that automates the entire process: recording joint positions plus camera frames and packaging them into a standardized dataset.

lerobot-record \
  --robot.type=openarm_follower \
  --robot.port=can0 \
  --robot.side=right \
  --robot.id=my_follower \
  --robot.cameras="{ top: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
  --teleop.type=openarm_leader \
  --teleop.port=can1 \
  --teleop.id=my_leader \
  --dataset.repo_id=username/openarm-box-grasp \
  --dataset.single_task="Grasp the carton box and lift it" \
  --dataset.num_episodes=50 \
  --dataset.fps=30

Let us break down each parameter:

  - --robot.* and --teleop.*: the same follower/leader configuration used during teleoperation
  - --robot.cameras: one OpenCV camera named top (device index 0) at 640x480, 30fps
  - --dataset.repo_id: the HuggingFace Hub repository (username/dataset-name) the dataset is saved and uploaded to
  - --dataset.single_task: a natural-language task description stored with every episode; language-conditioned VLA models use it as the instruction
  - --dataset.num_episodes: how many episodes to record (50 here)
  - --dataset.fps: the recording rate; keep it equal to the camera fps

Workflow for Each Episode

When running lerobot-record, the terminal displays a control interface:

  1. Press Enter to start a new episode
  2. Perform the task: Use the leader to control the follower, grasping a carton box and lifting it
  3. Press S to end the episode (success) or Press F to mark it as failed (will be discarded)
  4. Reset the box to a new position, then return to step 1

Each episode typically lasts 5-15 seconds (150-450 frames at 30fps). Total time for 50 episodes: approximately 30-45 minutes including reset time.
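The episode counts above are easy to verify with back-of-envelope arithmetic. The ~25 s reset time per episode is my own guess to make the total land in the 30-45 minute range quoted in the text.

```python
# Back-of-envelope check of the episode figures quoted above.
FPS = 30
EPISODES = 50

for secs in (5, 15):
    print(f"{secs} s episode = {secs * FPS} frames")  # 150 / 450 frames

# Total recording time, assuming ~10 s per episode plus ~25 s to
# reset the box between episodes (the reset time is an assumption).
avg_episode_s, reset_s = 10, 25
total_min = EPISODES * (avg_episode_s + reset_s) / 60
print(f"~{total_min:.0f} minutes for {EPISODES} episodes")
```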

Smart Allocation of 50 Episodes

Do not grasp the same box in the same position 50 times — the model will overfit. Instead, diversify:

| Variation | Episodes | Details |
| --- | --- | --- |
| Small box (10-15cm) | 15 | 5 positions x 3 approach angles |
| Medium box (15-25cm) | 20 | 5 positions x 4 approach angles |
| Large box (25-35cm) | 10 | 5 positions x 2 approach angles |
| Failure recovery | 5 | Intentionally miss, then correct and grasp successfully |

Why include recovery episodes? In practice, the robot will sometimes miss a grasp. If training data only contains "perfect grasps," the model will not know how to handle a miss — leading to failure cascades. Having 5-10% recovery episodes teaches the model self-correction.
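It is worth encoding the allocation plan in a few lines of Python before you start, so you can verify the totals and the recovery fraction instead of counting on paper. The dictionary below simply mirrors the table above.

```python
# Sanity-check the episode allocation plan from the table above.
plan = {
    "small box": 15,   # 5 positions x 3 approach angles
    "medium box": 20,  # 5 positions x 4 approach angles
    "large box": 10,   # 5 positions x 2 approach angles
    "recovery": 5,     # intentional miss, then correct and grasp
}

total = sum(plan.values())
recovery_frac = plan["recovery"] / total
print(f"{total} episodes, {recovery_frac:.0%} recovery")

assert total == 50
# Recovery should land in the 5-10% band recommended above.
assert 0.05 <= recovery_frac <= 0.10
```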

Data Quality Tips — What Makes the Difference

Data collection is an art, not just science. Here are tips from real-world experience that documentation rarely mentions:

1. Consistent Lighting

The camera is the robot's "eyes." If lighting changes between episodes (clouds blocking the sun, lights turning on or off), the model will be confused. Solutions: close the curtains, rely on consistent artificial lighting, and record all episodes in a single session.

2. Clean Background

The workspace should be simple, with minimal distracting objects. The robot is learning to grasp carton boxes — if the table also has water bottles, phones, books... the model may attend to irrelevant objects. Keep only carton boxes plus the table surface.

3. Fixed Camera Angle

Do not move the camera between episodes. Mount it securely or use a tripod. If the camera shifts even a few millimeters, pixel-level features change and the model needs more data to generalize.

4. Record Metadata Notes

Create a simple notes file:

Episodes 1-15: Small red box, center to right positions
Episodes 16-35: Medium brown box, left to center positions
Episodes 36-45: Large white box, random positions
Episodes 46-50: Recovery (intentional miss → correct → grasp)

This metadata is invaluable when debugging model performance later — knowing which episodes contain which variations helps analyze failure modes.
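Because the notes file follows a regular "Episodes X-Y: description" pattern, a few lines of Python can turn it into an episode-to-variation lookup for later debugging. This parser is a sketch I wrote for that format, not part of LeRobot.

```python
import re

# Parse the free-form notes file into an episode -> description map,
# so you can look up what varied in any given episode when debugging.
NOTES = """\
Episodes 1-15: Small red box, center to right positions
Episodes 16-35: Medium brown box, left to center positions
Episodes 36-45: Large white box, random positions
Episodes 46-50: Recovery (intentional miss -> correct -> grasp)
"""

def parse_notes(text: str) -> dict[int, str]:
    mapping = {}
    for line in text.splitlines():
        m = re.match(r"Episodes (\d+)-(\d+): (.+)", line)
        if m:
            lo, hi, desc = int(m.group(1)), int(m.group(2)), m.group(3)
            for ep in range(lo, hi + 1):
                mapping[ep] = desc
    return mapping

episode_notes = parse_notes(NOTES)
print(episode_notes[40])  # Large white box, random positions
```

When a trained policy fails mostly on, say, large boxes, this mapping lets you immediately see which recorded episodes cover that variation.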

Clean workspace with good lighting for robot data collection

Using Community Data from HuggingFace Hub

One of the biggest advantages of LeRobot is the community data ecosystem. You do not necessarily have to collect everything from scratch.

Browse Available Datasets

Visit HuggingFace Hub and search with the keyword lerobot. You will find hundreds of community datasets covering pick-and-place, stacking, and other manipulation tasks on platforms such as SO-100 and Koch.

Cross-Embodiment Transfer Learning

This is the strength of VLA models like SmolVLA. SmolVLA is pretrained on data from multiple robot types (SO-100, Koch, Franka) — it has already learned general manipulation priors: how to approach objects, how to open/close grippers, how to lift objects stably.

When you fine-tune SmolVLA on 50 OpenArm episodes, it does not have to learn from scratch. Instead:

  1. Pretrained knowledge (from thousands of episodes on SO-100/Koch) provides general manipulation "sense"
  2. Fine-tune data (your 50 OpenArm episodes) teaches the model how to adapt to OpenArm's specific kinematics
  3. Result: high data efficiency — 50 episodes are enough to reach 70-85% success rate

In contrast, if you train ACT from scratch (no pretrained model), 50 episodes only yield about 40-60% success rate. You would need 200-500 episodes to match the same performance.

Download and Explore Community Datasets

# Download a community dataset
python -c "
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset
ds = LeRobotDataset('lerobot/so100_pick_place')
print(f'Dataset: {len(ds)} frames, {ds.num_episodes} episodes')
print(f'Tasks: {ds.tasks}')
"

You can use community datasets to pretrain and then fine-tune on your own OpenArm data — this approach is called staged training and typically produces better results than direct training.

Data Format: LeRobotDataset

LeRobot stores data in the standardized LeRobotDataset format, designed and optimized for robot learning:

Directory Structure

openarm-box-grasp/
├── meta/
│   ├── info.json          # Metadata: fps, robot type, task description
│   ├── episodes.jsonl     # Per-episode info (timestamps, success/fail)
│   └── tasks.jsonl        # List of task descriptions
├── data/
│   ├── chunk-000/
│   │   ├── episode_000000.parquet  # Joint states + actions
│   │   ├── episode_000001.parquet
│   │   └── ...
├── videos/
│   ├── chunk-000/
│   │   ├── top/
│   │   │   ├── episode_000000.mp4  # Camera frames
│   │   │   ├── episode_000001.mp4
│   │   │   └── ...

What Do the Parquet Files Contain?

Each Parquet file contains a table with these columns:

| Column | Description | Shape |
| --- | --- | --- |
| state | Current joint positions | (6,) for 6-DOF |
| action | Target joint positions (command) | (6,) |
| timestamp | Time within the episode | scalar |
| episode_index | Episode index | scalar |
| frame_index | Frame index within episode | scalar |
| task_index | Task index | scalar |
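To make the schema concrete, here is a synthetic frame in that per-frame layout. This is a plain-Python mock of the columns listed above, not LeRobot's actual serialization code.

```python
# Build one synthetic frame matching the per-frame schema above:
# six joint values for state/action plus bookkeeping scalars.
def make_frame(episode_index: int, frame_index: int, fps: int = 30) -> dict:
    return {
        "state": [0.0] * 6,       # current joint positions (6-DOF)
        "action": [0.0] * 6,      # commanded joint positions
        "timestamp": frame_index / fps,
        "episode_index": episode_index,
        "frame_index": frame_index,
        "task_index": 0,
    }

frame = make_frame(episode_index=3, frame_index=90)
print(len(frame["state"]), len(frame["action"]), frame["timestamp"])  # 6 6 3.0
```

During training, the policy learns to map state (plus camera pixels) to action; the index columns let the loader reassemble frames into episodes.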

Automatic Upload to HuggingFace Hub

After collection finishes, lerobot-record automatically uploads the dataset to HuggingFace Hub. You need to log in first:

huggingface-cli login
# Enter your token from https://huggingface.co/settings/tokens

The dataset will be available at: https://huggingface.co/datasets/username/openarm-box-grasp

You can visualize the dataset directly on the Hub — watch episode videos, check action distributions, and spot outliers.

Dataset Quality Check

After collection, verify quality before training:

from lerobot.common.datasets.lerobot_dataset import LeRobotDataset
import numpy as np

ds = LeRobotDataset("username/openarm-box-grasp")

print(f"Total episodes: {ds.num_episodes}")
print(f"Total frames: {len(ds)}")
print(f"Average episode length: {len(ds) / ds.num_episodes:.0f} frames")

# Check action range
actions = np.array([ds[i]["action"].numpy() for i in range(len(ds))])
print(f"Action range: [{actions.min():.3f}, {actions.max():.3f}]")
print(f"Action std: {actions.std(axis=0)}")

# Check for episodes that are too short or too long
# (episode_data_index holds the frame range of each episode)
ep_from = ds.episode_data_index["from"]
ep_to = ds.episode_data_index["to"]
for ep_idx in range(ds.num_episodes):
    ep_len = int(ep_to[ep_idx] - ep_from[ep_idx])
    if ep_len < 30:  # < 1 second at 30fps
        print(f"WARNING: Episode {ep_idx} too short ({ep_len} frames)")
    if ep_len > 900:  # > 30 seconds
        print(f"WARNING: Episode {ep_idx} too long ({ep_len} frames)")

Red flags to watch for:

  - Action std near zero on any joint: that joint barely moved; check for a stuck motor or a calibration problem
  - Action values outside the expected joint limits: the calibration offsets are likely wrong
  - Many very short episodes: often accidental key presses during recording; re-record or drop them

Summary and Next Steps

You have completed the most important part of the pipeline: collecting 50 box-grasping episodes on OpenArm. Here is a recap of what was accomplished:

  1. Set up CAN bus connecting the robot to the computer
  2. Installed LeRobot with Damiao motor support
  3. Calibrated both follower and leader arms
  4. Practiced teleoperation
  5. Recorded 50 diverse episodes (varying sizes, positions, approach angles)
  6. Verified dataset quality

In the next post — SimpleVLA-RL (8): Training & Deploying on OpenArm — we will use this dataset to train 3 different policies (ACT, SmolVLA, Pi0-FAST), deploy on the real robot, and improve performance with HIL-SERL. That is when the data you just collected truly shows its value.

If you want a deeper understanding of the VLA training process, read part 3 of this series. And if you are interested in how HIL-SERL improves policies on real robots, that post provides a comprehensive overview.

