A planner is only as good as its world model
IK can run with only the robot model. Motion planning needs a world model. For a tabletop arm, start with cuboids for the table, bin, and fixtures. When people, unknown objects, or fast-changing scenes enter the workspace, depth sensing becomes necessary. cuRoboV2 includes GPU-native perception and mapping directions, but deployment still depends on timestamping, transforms, filtering, and snapshots.
1. Start with manual obstacles
Before connecting RealSense or ZED, create a static scene:
world:
  cuboid:
    table:
      dims: [1.4, 0.8, 0.06]
      pose: [0.55, 0.0, -0.03, 1.0, 0.0, 0.0, 0.0]
    safety_wall:
      dims: [0.04, 1.2, 0.8]
      pose: [0.85, 0.0, 0.4, 1.0, 0.0, 0.0, 0.0]
If planning is not stable with a static scene, adding depth data will only make debugging harder.
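A cheap sanity check before any planning run is to compute where each cuboid actually sits. The sketch below is a minimal illustration, not a cuRobo API: `cuboid_bounds` is a hypothetical helper, and it assumes `dims` are full edge lengths in meters and `pose` is `[x, y, z, qw, qx, qy, qz]` with an identity rotation, as in the scene above.

```python
from typing import Dict, List, Tuple

# Same values as the static scene above, written as a plain dict.
WORLD = {
    "cuboid": {
        "table": {"dims": [1.4, 0.8, 0.06],
                  "pose": [0.55, 0.0, -0.03, 1.0, 0.0, 0.0, 0.0]},
        "safety_wall": {"dims": [0.04, 1.2, 0.8],
                        "pose": [0.85, 0.0, 0.4, 1.0, 0.0, 0.0, 0.0]},
    }
}

Vec3 = Tuple[float, ...]


def cuboid_bounds(dims: List[float], pose: List[float]) -> Tuple[Vec3, Vec3]:
    """Return (min_corner, max_corner) for an unrotated cuboid.

    Assumes pose = [x, y, z, qw, qx, qy, qz] and an identity quaternion.
    """
    if [round(q, 6) for q in pose[3:]] != [1.0, 0.0, 0.0, 0.0]:
        raise ValueError("this sketch only handles identity rotation")
    lo = tuple(c - d / 2.0 for c, d in zip(pose[:3], dims))
    hi = tuple(c + d / 2.0 for c, d in zip(pose[:3], dims))
    return lo, hi


bounds: Dict[str, Tuple[Vec3, Vec3]] = {
    name: cuboid_bounds(c["dims"], c["pose"])
    for name, c in WORLD["cuboid"].items()
}
table_top_z = bounds["table"][1][2]          # top surface of the table
wall_bottom_z = bounds["safety_wall"][0][2]  # bottom edge of the wall
```

With these numbers the table top and the wall bottom both land at z = 0, which suggests the base frame origin is meant to sit on the table surface. If your robot base is mounted elsewhere, the poses need adjusting.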
2. Depth pipeline
camera depth
-> depth filter
-> point cloud in camera frame
-> TF to robot base
-> voxel / TSDF / ESDF
-> planning snapshot
Every depth frame needs a timestamp, and the TF lookup should use the transform closest to that timestamp. If the lookup returns a transform 100 ms newer than the depth frame, obstacles will appear displaced whenever the robot or camera is moving.
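To put a number on it: the apparent shift is roughly relative speed times timestamp mismatch, so a camera moving at 0.5 m/s with a 100 ms mismatch displaces obstacles by about 5 cm. The sketch below shows one way to enforce the match. `TransformBuffer` is a hypothetical stand-in for whatever transform store you use (it is not the tf2 API), and the 20 ms tolerance is an assumed value.

```python
import bisect
from typing import List, Optional, Tuple


class TransformBuffer:
    """Hypothetical time-indexed buffer of camera-to-base transforms."""

    def __init__(self) -> None:
        self._stamps: List[float] = []   # seconds, kept sorted
        self._values: List[object] = []  # any transform representation

    def add(self, stamp: float, transform: object) -> None:
        i = bisect.bisect(self._stamps, stamp)
        self._stamps.insert(i, stamp)
        self._values.insert(i, transform)

    def lookup(self, stamp: float,
               tolerance: float) -> Optional[Tuple[float, object]]:
        """Return (stamp, transform) nearest to `stamp`, or None if too far."""
        if not self._stamps:
            return None
        i = bisect.bisect_left(self._stamps, stamp)
        candidates = [j for j in (i - 1, i) if 0 <= j < len(self._stamps)]
        best = min(candidates, key=lambda j: abs(self._stamps[j] - stamp))
        if abs(self._stamps[best] - stamp) > tolerance:
            return None
        return self._stamps[best], self._values[best]


def apparent_shift(relative_speed_mps: float, mismatch_s: float) -> float:
    """Approximate obstacle displacement caused by a timestamp mismatch."""
    return relative_speed_mps * mismatch_s


buf = TransformBuffer()
for t in (0.00, 0.05, 0.10, 0.20):
    buf.add(t, f"T_base_camera@{t:.2f}")

match = buf.lookup(0.11, tolerance=0.02)  # nearest is the 0.10 s transform
stale = buf.lookup(0.15, tolerance=0.02)  # nearest is 0.05 s away: rejected
shift = apparent_shift(0.5, 0.100)        # about 0.05 m
```

Dropping a frame whose transform is too far away in time is usually safer than planning against a shifted obstacle. Interpolating between the two neighboring transforms is a common refinement that this sketch leaves out.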
3. Snapshot, not live mutation
The planner should receive an immutable snapshot:
scene = scene_buffer.get_latest(max_age=0.10)
result = planner.plan_pose(start, goal, scene)
Do not let a mapping thread mutate the scene object while the optimizer is reading it. Use copy-on-write or double buffering.
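One simple way to get this behavior is to publish immutable snapshots and swap a single reference under a lock, which is a lightweight relative of double buffering. The sketch below is an assumption about how `scene_buffer` could look, not the planner's actual interface: `SceneSnapshot`, `SceneBuffer`, and the `now` argument (added to make staleness testable) are all hypothetical.

```python
import threading
import time
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SceneSnapshot:
    """Immutable scene: fields cannot be reassigned after construction."""
    stamp: float  # seconds, monotonic clock
    occupied_voxels: Tuple[Tuple[int, int, int], ...]  # tuple, not list


class SceneBuffer:
    """Hypothetical buffer: the mapper publishes, the planner reads."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._latest: Optional[SceneSnapshot] = None

    def publish(self, snapshot: SceneSnapshot) -> None:
        # Swap the reference; never edit a snapshot already handed out.
        with self._lock:
            self._latest = snapshot

    def get_latest(self, max_age: float,
                   now: Optional[float] = None) -> Optional[SceneSnapshot]:
        """Return the newest snapshot, or None if missing or too old."""
        now = time.monotonic() if now is None else now
        with self._lock:
            snap = self._latest
        if snap is None or now - snap.stamp > max_age:
            return None
        return snap


scene_buffer = SceneBuffer()
scene_buffer.publish(SceneSnapshot(stamp=10.00, occupied_voxels=((1, 2, 3),)))

fresh = scene_buffer.get_latest(max_age=0.10, now=10.05)  # 50 ms old: returned
stale = scene_buffer.get_latest(max_age=0.10, now=10.25)  # 250 ms old: None
```

The planner should treat a `None` result as "do not plan" rather than falling back to an older scene. Because the snapshot is frozen and holds tuples, the mapping thread has to build a new object for every update, so the optimizer never sees a half-written scene.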
4. Practical filters
| Filter | Reason |
|---|---|
| crop workspace | discard unreachable points |
| floor/table removal | reduce false collisions |
| temporal median | reduce depth noise |
| inflation margin | absorb calibration error |
| unknown zone | avoid planning through unseen space |
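Three of these filters are small enough to sketch in plain Python. This is an illustrative, unoptimized version with hypothetical function names and made-up workspace limits; a real pipeline would run the same logic on GPU arrays.

```python
from statistics import median
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def crop_workspace(points: Sequence[Point], lo: Point, hi: Point) -> List[Point]:
    """Keep only points inside the axis-aligned box [lo, hi] (base frame, m)."""
    return [p for p in points
            if all(l <= v <= h for v, l, h in zip(p, lo, hi))]


def temporal_median(depth_frames: Sequence[Sequence[float]]) -> List[float]:
    """Per-pixel median over the last few depth frames (flattened, meters)."""
    return [median(pixel_history) for pixel_history in zip(*depth_frames)]


def inflate(dims: Sequence[float], margin: float) -> List[float]:
    """Grow a cuboid by `margin` on every face to absorb calibration error."""
    return [d + 2.0 * margin for d in dims]


# Assumed workspace limits in the robot base frame, in meters.
points = [(0.4, 0.1, 0.2), (2.5, 0.0, 0.3), (0.6, -0.2, -0.5)]
kept = crop_workspace(points, lo=(0.0, -0.6, 0.0), hi=(1.2, 0.6, 1.0))

# Three frames of a two-pixel image; the 0.00 reading is a single dropout.
frames = [[0.80, 1.20], [0.81, 1.21], [0.00, 1.19]]
smoothed = temporal_median(frames)

inflated = inflate([0.10, 0.10, 0.10], margin=0.02)
```

Here only the first point survives the crop, the median suppresses the single dropout without averaging it into the result, and a 2 cm margin turns a 10 cm box into a 14 cm one.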
On Jetson, keep perception deterministic and lightweight. Do not spend all GPU budget on mapping and starve planning.
5. Calibration checklist
- The camera_link to base_link transform is measured and versioned.
- Depth scale is in meters.
- Point cloud table matches the cuboid table in the viewer.
- A physical 10 cm box appears as roughly 10 cm in the planning scene.
- Scene age is published in diagnostics.
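Two of these checks are easy to automate. The sketch below is a rough illustration under stated assumptions: the depth scale of 0.001 m per raw unit is a placeholder that must be read from your camera driver, the box points are hypothetical, and the 1 cm tolerance is arbitrary.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def raw_depth_to_meters(raw: int, depth_scale: float) -> float:
    """Convert a raw sensor depth value to meters.

    depth_scale is meters per raw unit; 0.001 (millimeters) is an assumed
    value here and must be read from the actual camera driver.
    """
    return raw * depth_scale


def extent(points: Sequence[Point]) -> Tuple[float, ...]:
    """Axis-aligned size (x, y, z) of a point cluster, in meters."""
    return tuple(max(p[i] for p in points) - min(p[i] for p in points)
                 for i in range(3))


def size_ok(measured: float, expected: float, tolerance: float) -> bool:
    """True if a measured length is within tolerance of the expected one."""
    return abs(measured - expected) <= tolerance


depth_m = raw_depth_to_meters(1250, depth_scale=0.001)  # 1.25 m

# Hypothetical corner points of the segmented test box, in the base frame.
box_points = [(0.50, 0.00, 0.00), (0.602, 0.00, 0.00),
              (0.50, 0.097, 0.00), (0.50, 0.00, 0.101)]
size = extent(box_points)
checks = [size_ok(s, expected=0.10, tolerance=0.01) for s in size]
```

If the box comes out at about 100 instead of 0.10, the depth scale is being applied in millimeters somewhere. If only one axis is off, the extrinsic transform is the more likely suspect.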
6. When to use ESDF/TSDF
Cuboids and meshes are enough for fixed fixtures. ESDF/TSDF maps help when the scene changes and obstacles are not simple shapes. A safe rollout path:
- Static cuboids.
- Dynamic cuboids from perception.
- Point-cloud voxel collision.
- TSDF/ESDF for complex workspaces.
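For intuition about what an ESDF stores, here is a brute-force 2D version: every cell holds the signed distance to the nearest obstacle boundary, positive in free space and negative inside obstacles. It is a teaching sketch only, measuring distances between cell centers on a tiny grid. It is not how cuRobo or any production mapper computes the field, and `brute_force_esdf` and `is_safe` are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple


def brute_force_esdf(occupancy: List[List[int]],
                     voxel_size: float) -> List[List[float]]:
    """Signed distance per cell: positive in free space, negative in obstacles.

    O(N^2) reference version for tiny grids; real mappers use incremental
    or GPU distance transforms.
    """
    cells = [(r, c) for r, row in enumerate(occupancy)
             for c, _ in enumerate(row)]
    occupied = [(r, c) for r, c in cells if occupancy[r][c]]
    free = [(r, c) for r, c in cells if not occupancy[r][c]]

    def nearest(r: int, c: int, targets: Sequence[Tuple[int, int]]) -> float:
        return min(math.hypot(r - tr, c - tc) for tr, tc in targets) * voxel_size

    field = [[0.0] * len(row) for row in occupancy]
    for r, c in cells:
        if occupancy[r][c]:
            field[r][c] = -nearest(r, c, free) if free else -math.inf
        else:
            field[r][c] = nearest(r, c, occupied) if occupied else math.inf
    return field


grid = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
esdf = brute_force_esdf(grid, voxel_size=0.05)


def is_safe(row: int, col: int, clearance: float) -> bool:
    """A cell is usable if it is at least `clearance` meters from an obstacle."""
    return esdf[row][col] >= clearance
```

The payoff over cuboids is that a collision check becomes a single lookup: `is_safe(0, 0, 0.06)` passes because that corner is about 7 cm from the obstacle, while `is_safe(1, 0, 0.06)` fails because that cell is only 5 cm away.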
Conclusion
Scene mapping is where many sim-successful systems fail on hardware. Keep snapshots explicit, timestamps tight, and obstacles simple first. Next we move to Unitree G1 arm-only planning, where base stability becomes a new constraint.