ICRA 2026 Coming to Vienna
The IEEE International Conference on Robotics and Automation (ICRA) 2026 will be held June 1-5 in Vienna, Austria — the first time ICRA comes to Europe's musical capital. As the world's largest robotics conference, ICRA 2026 promises to attract thousands of researchers and engineers with programs including plenary sessions, workshops, tutorials, robot competitions, and industry exhibits.
This year's General Chair is Prof. Markus Vincze of TU Wien, a leading European researcher in robot perception and manipulation. Notable workshops include "From Data to Decisions: VLA Pipelines for Real Robots", featuring 10,000+ hours of real robot data; a Field Robotics Workshop focused on autonomous systems in agriculture and construction; and a Robot Perception Workshop for unstructured environments.
In this article, I summarize 5 outstanding papers that Vietnamese robotics engineers should read before the conference — each representing an important research direction.
1. GR00T N1: Foundation Model for Humanoid Robots
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots — NVIDIA et al., 2025
GR00T N1 is NVIDIA's bold move in the foundation models race for robotics. It's a Vision-Language-Action (VLA) model with a dual-system architecture: System 2 (vision-language module) interprets the environment through vision and language instructions, while System 1 (diffusion transformer module) generates smooth motor actions in real-time. Both modules are tightly coupled and trained end-to-end.
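To make the dual-system idea concrete, here is a minimal sketch of a two-rate control loop in which a slow module refreshes a plan occasionally and a fast module emits an action every step. This is not NVIDIA's code: the function names, the `Plan` structure, and the replanning interval are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Plan:
    """Hypothetical latent plan produced by the slow vision-language module."""
    instruction: str
    step_created: int


def system2_plan(instruction: str, step: int) -> Plan:
    # Stand-in for the slow vision-language module (System 2).
    return Plan(instruction=instruction, step_created=step)


def system1_act(plan: Plan, step: int) -> List[float]:
    # Stand-in for the fast action module (System 1). A real model would
    # generate motor actions conditioned on the plan; here we only return
    # the age of the plan so the two rates are visible in the output.
    age = step - plan.step_created
    return [float(age), 0.0]


def run_episode(
    instruction: str, total_steps: int, replan_every: int
) -> Tuple[List[List[float]], int]:
    """Call System 1 every step and System 2 every `replan_every` steps."""
    plan: Optional[Plan] = None
    replans = 0
    actions: List[List[float]] = []
    for step in range(total_steps):
        if step % replan_every == 0:  # assumed schedule, not from the paper
            plan = system2_plan(instruction, step)
            replans += 1
        actions.append(system1_act(plan, step))
    return actions, replans
```

Calling `run_episode("pick up the cup", total_steps=10, replan_every=5)` produces ten actions from only two plans, which is the point of the split: the expensive reasoning runs rarely, the cheap action generation runs constantly.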
The most striking feature is cross-embodiment support: GR00T N1 works on platforms ranging from tabletop robot arms to dexterous humanoid robots. The model was successfully deployed on the Fourier GR-1 humanoid for language-conditioned bimanual manipulation tasks. Training data includes real-robot trajectories, human videos, and synthetic datasets, a heterogeneous-data strategy that is becoming the new standard.
Why it matters: NVIDIA open-sourced the model checkpoint (2B parameters), training data, and simulation benchmarks on GitHub and HuggingFace. This means labs and startups in Vietnam can start experimenting immediately without massive infrastructure. The updated GR00T N1.6 with performance improvements is also available.
Takeaway for Engineers
GR00T N1 is the best starting point if you want to try VLA models for humanoid robots. Weights, code, and data are all public — you just need a capable GPU to start fine-tuning for your use case.
2. FALCON: Force-Adaptive Humanoid Loco-Manipulation
FALCON: Learning Force-Adaptive Humanoid Loco-Manipulation — LeCAR Lab, 2025
Loco-manipulation — combining movement and object manipulation simultaneously — is one of the hardest challenges for humanoid robots. FALCON solves this with dual-agent reinforcement learning: a lower-body agent ensures stable locomotion under external forces, and an upper-body agent tracks end-effector positions with implicit adaptive force compensation.
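The decomposition can be sketched as two policies that each own a disjoint set of joints and whose outputs are merged into one whole-body command. This is only an illustration of the pattern, not FALCON's implementation: the joint names, observation keys, and gains below are hypothetical, and simple proportional rules stand in for the learned agents.

```python
from typing import Dict

# Hypothetical joint groups; real humanoids have many more joints.
LOWER_JOINTS = ["left_hip", "left_knee", "right_hip", "right_knee"]
UPPER_JOINTS = ["left_shoulder", "left_elbow", "right_shoulder", "right_elbow"]


def lower_body_agent(obs: Dict[str, float]) -> Dict[str, float]:
    # Stand-in for the learned locomotion agent: push back against base tilt.
    correction = -0.5 * obs["base_pitch"]
    return {joint: correction for joint in LOWER_JOINTS}


def upper_body_agent(obs: Dict[str, float]) -> Dict[str, float]:
    # Stand-in for the learned manipulation agent: track the end-effector target.
    error = obs["ee_target"] - obs["ee_position"]
    return {joint: 0.5 * error for joint in UPPER_JOINTS}


def whole_body_action(obs: Dict[str, float]) -> Dict[str, float]:
    """Merge both agents' outputs into a single joint command."""
    action: Dict[str, float] = {}
    action.update(lower_body_agent(obs))
    action.update(upper_body_agent(obs))
    return action
```

Because the joint sets do not overlap, each agent can be trained and debugged against its own objective while still seeing the same observation.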
The real-world results are impressive: the same training setup was deployed on multiple humanoid robots, performing tasks such as payload transport (0-20 N), cart pulling (0-100 N), and door opening (0-40 N). The decomposition method, which separates locomotion and manipulation and then coordinates them, proves more effective than end-to-end approaches.
Why it matters: FALCON demonstrates that force-adaptive control is critical for real-world deployment. In Vietnam's industrial environments, where robots must move and manipulate in imperfect spaces, the ability to adapt to external forces is mandatory.
Takeaway for Engineers
Dual-agent decomposition (lower-body + upper-body) is a design pattern worth learning. Instead of trying to train one monolithic policy for the whole body, break the problem down and coordinate specialized agents.
3. Sim-to-Real RL for Dexterous Manipulation on Humanoids
Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids — 2025
This paper provides a practical "recipe" for sim-to-real RL on humanoid robots — not just a proof-of-concept but a reproducible pipeline. The humanoid robot was trained for three challenging tasks: grasp-and-reach, box lift, and bimanual handover.
Four key contributions make the difference: (1) an automated real-to-sim tuning module that calibrates simulation parameters against the real robot, (2) a generalized reward formulation based on contact and object goals, (3) a divide-and-conquer policy distillation framework that trains sub-tasks separately and then combines them, and (4) a hybrid object representation strategy with modality-specific augmentation to bridge the reality gap.
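The first contribution, real-to-sim tuning, can be illustrated with a toy example: pick the simulator parameter whose rollout best matches a recorded real trajectory. The sketch below uses a one-parameter damping model and a plain search over candidates. Both are assumptions made for illustration; the paper's actual tuning module may work quite differently.

```python
from typing import List, Sequence


def simulate(damping: float, steps: int = 20, dt: float = 0.05) -> List[float]:
    """Toy simulator: a velocity that decays under linear damping."""
    velocity = 1.0
    trajectory = []
    for _ in range(steps):
        velocity += -damping * velocity * dt  # explicit Euler step
        trajectory.append(velocity)
    return trajectory


def trajectory_error(sim: Sequence[float], real: Sequence[float]) -> float:
    """Sum of squared differences between simulated and real trajectories."""
    return sum((s - r) ** 2 for s, r in zip(sim, real))


def tune_damping(real: Sequence[float], candidates: Sequence[float]) -> float:
    """Return the candidate damping value whose rollout best matches `real`."""
    return min(candidates, key=lambda d: trajectory_error(simulate(d), real))
```

In practice `real` would come from logged robot data and the search would cover many parameters at once, but the structure stays the same: roll out, compare, keep the best match.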
Why it matters: This isn't about a new algorithm — it's about a complete process to bring dexterous manipulation from simulation to real robots. Each component has detailed ablation studies helping you understand why every step is necessary.
Takeaway for Engineers
If you're doing sim-to-real for manipulation, this paper is the "cookbook" to read. Automated real-to-sim tuning is often overlooked but has the biggest impact on transfer performance.
4. XHugWBC: Cross-Humanoid Whole-Body Control
Scalable and General Whole-Body Control for Cross-Humanoid Locomotion — 2026
One of the biggest barriers in humanoid robotics is that every robot has a different mechanical architecture: different joint counts, proportions, and dynamics. XHugWBC addresses this with a cross-embodiment training framework in which a single policy generalizes zero-shot to multiple humanoid robots.
Three critical elements: (1) physics-consistent morphological randomization — randomize robot structure during training, (2) semantically aligned observation and action spaces — normalize inputs/outputs for diverse robots, and (3) policy architectures modeling morphological and dynamical properties. Result: the policy successfully generalizes to 7 different humanoid robots with completely different kinematic, dynamic, and morphological structures.
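One way to picture semantically aligned spaces is a fixed list of canonical joint slots that every robot maps into, with a mask marking the slots a given robot lacks. The following is a minimal sketch under that assumption. The slot names and helper functions are hypothetical and are not taken from the XHugWBC paper.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical canonical layout shared by all robots.
CANONICAL_JOINTS = [
    "hip_pitch", "knee", "ankle_pitch", "waist_yaw", "shoulder_pitch", "elbow",
]


def to_canonical(joint_values: Dict[str, float]) -> Tuple[List[float], List[int]]:
    """Map robot-specific readings into canonical slots plus a validity mask."""
    values = [joint_values.get(name, 0.0) for name in CANONICAL_JOINTS]
    mask = [1 if name in joint_values else 0 for name in CANONICAL_JOINTS]
    return values, mask


def from_canonical(
    action: Sequence[float], robot_joints: Sequence[str]
) -> Dict[str, float]:
    """Pick out the canonical action entries that this robot actually has."""
    index = {name: i for i, name in enumerate(CANONICAL_JOINTS)}
    return {name: action[index[name]] for name in robot_joints if name in index}
```

A legs-only robot fills the first three slots and gets a mask of `[1, 1, 1, 0, 0, 0]`, so the same policy input shape serves robots with different joint counts.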
Why it matters: Cross-embodiment is the future — instead of training separately for each robot, train once and deploy everywhere. This is especially relevant as the humanoid robot market explodes with many manufacturers (Unitree, Fourier, Figure, Tesla).
Takeaway for Engineers
Semantically aligned observation and action spaces are the key insight. When designing robot learning pipelines, think about how to abstract away robot-specific details from the start, so that transfer to other platforms is easier later.
5. DeepFleet: Multi-Agent Foundation Models for Mobile Robots
DeepFleet: Multi-Agent Foundation Models for Mobile Robots — 2025
Not all hot research is about humanoids — fleet management for mobile robots remains the robotics work with the largest commercial impact. DeepFleet introduces a suite of four model architectures for learning how fleet robots move in structured warehouse environments.
Among the architectures, the robot-centric model and the graph-floor model perform best; both use asynchronous robot state updates and incorporate the localized structure of robot interactions. Instead of traditional centralized planning, DeepFleet lets each robot make decisions based on local information and learned fleet dynamics, which significantly reduces communication overhead.
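The "localized structure of robot interactions" can be pictured as a graph that links only robots close enough to affect each other. Below is a small sketch of that idea on a warehouse grid. The Manhattan-distance rule, the radius, and the function names are assumptions made for illustration; this is not DeepFleet's actual model input.

```python
from typing import Dict, List, Tuple

Position = Tuple[int, int]


def build_interaction_graph(
    positions: Dict[str, Position], radius: int
) -> Dict[str, List[str]]:
    """Link robots whose Manhattan distance is at most `radius` grid cells."""
    graph: Dict[str, List[str]] = {robot_id: [] for robot_id in positions}
    ids = sorted(positions)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            ax, ay = positions[a]
            bx, by = positions[b]
            if abs(ax - bx) + abs(ay - by) <= radius:
                graph[a].append(b)
                graph[b].append(a)
    return graph


def local_observation(
    robot_id: str, positions: Dict[str, Position], graph: Dict[str, List[str]]
) -> List[Position]:
    """Return neighbor positions relative to the given robot."""
    x, y = positions[robot_id]
    return [(positions[n][0] - x, positions[n][1] - y) for n in graph[robot_id]]
```

Each robot then reasons only over its own neighborhood, so the per-robot input stays small even as the fleet grows.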
Why it matters: Warehouse automation is the largest robotics market in Vietnam today. Logistics companies like Lazada, Shopee, and Viettel Post are all investing in AMR/AGV systems. DeepFleet provides a foundation for the next generation of fleet coordination, one that is more intelligent than today's rule-based systems.
Takeaway for Engineers
Graph-based representations for fleet coordination are the right direction — they capture spatial relationships between robots that flat vector representations miss. If you're building multi-robot systems, check out Robot Fleet Management: Solutions for Robot Team Coordination for more context.
Common Trends Across ICRA 2026
Looking at these 5 papers, three trends stand out:
Foundation Models Everywhere
From GR00T N1 for humanoid manipulation to DeepFleet for fleet coordination, foundation models are spreading across all robotics domains. Common thread: train on large-scale heterogeneous data, then fine-tune or zero-shot transfer for specific tasks.
Cross-Embodiment Becomes the Standard
XHugWBC and GR00T N1 both emphasize working across multiple robot platforms. The era of "one model per robot" is ending — the future belongs to generalist policies.
Sim-to-Real Pipelines Mature
FALCON and the sim-to-real RL paper show that the pipeline from simulation to real robots is mature enough to produce reliable results. Automated tuning and domain randomization are no longer a "black art" but an engineering discipline.
Advice Before ICRA 2026
- Read papers before talks — you'll understand discussions much better with prior context
- Focus on workshops: they host the most in-depth discussions, especially this year's VLA Pipelines workshop
- Network at poster sessions — best opportunity to ask authors directly about implementation details
- Watch livestreams: ICRA usually streams plenary talks, which is very useful if you can't attend in Vienna
If you want to build solid foundations before diving into these papers, read Top Robotics Research 2024-2025 for broader research context. Those interested in humanoid robots should also check Comprehensive Humanoid Robotics Guide.