
IROS 2026: Navigation and Manipulation Take Center Stage

Analysis of notable papers on autonomous navigation and manipulation — preparing for IROS 2026 Pittsburgh.

Nguyễn Anh Tuấn · April 2, 2026 · 3 min read

IROS 2026 Pittsburgh — Navigation and Manipulation Converge

IROS 2026 (Sept 27 - Oct 1, Pittsburgh) continues the field's strong industry presence, with autonomous navigation and robot manipulation as its top themes. The boundary between the two is blurring: robots increasingly need both skills simultaneously in complex environments.

Navigation Advances

  1. Hybrid Motion Planning with Deep RL: Combines classical planning (A*, RRT) with learned local components for better generalization in dynamic environments.

  2. Human-like Navigation using VLM Reasoning: Robots learn social norms and human-like behavior through Vision-Language Model reasoning, not just collision avoidance.

  3. Decentralized Multi-Robot Coordination: Multi-robot navigation in GPS-denied, communication-limited environments using hierarchical topological sharing.

Manipulation Advances

  1. DexUMI: Human Hand as Universal Interface: Using hand tracking (MediaPipe) directly as teleoperation interface for collecting dexterous manipulation data.

  2. Sim-to-Real for Long-Horizon Pick-and-Place: Fully autonomous long-horizon tasks with robust perception pipeline, adaptive grasp planning, and error recovery.

  3. Dexterous Manipulation via Imitation Learning: Comprehensive survey of state-of-the-art methods covering data collection, learning approaches, and sim-to-real transfer.

Notable Papers

Hybrid Motion Planning (arXiv:2512.24651)

Classical planners (A*, RRT) ensure completeness but are slow. RL planners react fast but lack global reasoning. This paper combines both:

  • Global planner: ensures robot reaches destination
  • RL local planner: handles dynamic obstacles in real-time
  • Smart switching based on uncertainty estimation

Result: 15% improvement in success rate vs pure RL, 25% faster than pure classical planning.
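The switching idea can be sketched in a few lines. This is a hypothetical illustration, not the paper's implementation: `astar_plan`, `rl_policy`, and the uncertainty threshold are placeholder names and values.

```python
import math
import random

def astar_plan(start, goal, steps=10):
    """Stand-in for a classical global planner: waypoints toward the goal."""
    return [
        (start[0] + (goal[0] - start[0]) * t / steps,
         start[1] + (goal[1] - start[1]) * t / steps)
        for t in range(1, steps + 1)
    ]

def rl_policy(pos, waypoint):
    """Stand-in for a learned local planner: a noisy step toward the waypoint."""
    dx, dy = waypoint[0] - pos[0], waypoint[1] - pos[1]
    norm = math.hypot(dx, dy) or 1.0
    jitter = random.uniform(-0.05, 0.05)
    return (pos[0] + 0.5 * dx / norm + jitter,
            pos[1] + 0.5 * dy / norm + jitter)

def hybrid_step(pos, waypoint, uncertainty, threshold=0.3):
    """Trust the RL local planner only when uncertainty is low;
    otherwise fall back to following the global plan directly."""
    if uncertainty < threshold:
        return rl_policy(pos, waypoint), "rl"
    return waypoint, "classical"
```

The key design choice is that the classical plan is always available as a fallback, so the learned component can never strand the robot.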

Human-like Navigation (arXiv:2509.21189)

Robots typically treat people as obstacles. This paper enables social navigation:

  • Vision-Language Model (VLM) reasoning about context
  • Robot understands doors, corridors, crowded areas
  • Chooses appropriate behavior (wait, yield, detour) vs just avoiding

Practical impact: Essential for service robots in hospitals, hotels, shopping centers.
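The behavior-selection step reduces to mapping scene understanding onto discrete actions. A minimal sketch, assuming the VLM's output has been distilled into a set of scene labels (the labels and rules below are illustrative, not from the paper):

```python
def choose_behavior(scene_labels: set) -> str:
    """Map VLM-style scene labels to a social navigation behavior."""
    if "doorway" in scene_labels and "person_approaching" in scene_labels:
        return "yield"    # let the person pass through first
    if "crowd" in scene_labels:
        return "detour"   # route around dense groups
    if "queue" in scene_labels:
        return "wait"     # do not cut through a waiting line
    return "proceed"      # default: plain collision avoidance
```

In the actual system the VLM reasons over the image directly; the point here is only that the output space is behaviors (wait, yield, detour), not just velocity commands.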

DexUMI Paper (arXiv:2505.21864)

Problem: Collecting dexterous manipulation data requires skilled operators, is expensive and slow.

Solution: Use human hand tracking directly as manipulation interface

  • Operator moves their bare hand; the system captures it via camera
  • Retargets to any robot hand automatically
  • 86% average success rate, transfers across platforms

Takeaway: Faster, cheaper, more natural data collection than traditional teleoperation.
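The retargeting step can be illustrated with a toy mapping from a tracked fingertip to a robot joint angle. The linear mapping and distance limits below are assumptions for illustration, not DexUMI's actual method:

```python
import math

def retarget_finger(fingertip, palm, open_dist=0.09, closed_dist=0.02,
                    joint_min=0.0, joint_max=1.6):
    """Linearly map fingertip-to-palm distance (meters, e.g. from
    MediaPipe landmarks) onto a robot finger joint angle (radians)."""
    d = math.dist(fingertip, palm)
    # Normalize: fully open hand -> 0, fully closed -> 1.
    closure = (open_dist - d) / (open_dist - closed_dist)
    closure = min(1.0, max(0.0, closure))
    return joint_min + closure * (joint_max - joint_min)
```

Because the mapping is defined per finger in normalized closure space, the same tracked hand can drive robot hands with different kinematics, which is the property that lets data transfer across platforms.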

Convergence: Mobile Manipulation

Navigation and manipulation blur together at IROS 2026. Robots must both move to locations and interact with the environment once there. This mobile manipulation is the real-world requirement for service robots.

Foundation Models Everywhere

From navigation (VLM reasoning), manipulation (VLA models), perception (foundation detection models), to multi-robot (LLM task planning) — foundation models appear throughout. This represents a paradigm shift.

Safety is Mandatory

Papers on safe RL, formal verification, and human-aware planning increased significantly. Safety analysis is now required by reviewers — no longer optional.

For Vietnamese Engineers

Warehouse automation, e-commerce logistics demand both navigation (autonomous movement) and manipulation (picking items). IROS papers directly address this convergence.

Action items:

  1. Study hybrid motion planning for warehouse AMRs
  2. Implement imitation learning pipelines for manipulation
  3. Use foundation models for rapid prototyping
  4. Add safety constraints to RL policies before deployment
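Action item 4 can be as simple as wrapping the policy output in a safety layer. A minimal sketch, with illustrative placeholder limits (not from any of the papers above):

```python
def safe_action(policy_velocity: float, obstacle_dist: float,
                v_max: float = 1.0, stop_dist: float = 0.3,
                slow_dist: float = 1.0) -> float:
    """Clamp an RL policy's forward velocity by distance to the
    nearest obstacle: hard stop inside the margin, linear ramp
    until the slow-down distance, velocity limit everywhere."""
    if obstacle_dist <= stop_dist:
        return 0.0                      # hard stop inside the safety margin
    if obstacle_dist < slow_dist:
        # Linear ramp from 0 at stop_dist up to v_max at slow_dist.
        scale = (obstacle_dist - stop_dist) / (slow_dist - stop_dist)
        return min(policy_velocity, v_max) * scale
    return min(policy_velocity, v_max)  # unconstrained region
```

A shield like this sits outside the learned policy, so it enforces the constraint regardless of what the network outputs.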


Nguyễn Anh Tuấn

Robotics & AI Engineer. Building VnRobo — sharing knowledge about robot learning, VLA models, and automation.
