
Humanoid Robots: From Research to Real-World Applications

Overview of humanoid robots — bipedal walking, whole-body control, and platforms like Tesla Optimus, Figure 02, and Unitree H1.

Nguyễn Anh Tuấn · December 1, 2025 · 10 min read

What is a Humanoid Robot?

Over the past two years, if you've followed tech news, you've certainly seen videos of humanoid robots walking, folding clothes, or even making coffee. Humanoid robots — machines shaped like humans — are becoming the biggest tech race since smartphones.

But why do we need robots that look like humans? The answer is simple: our world is designed for humans. Stairs, door handles, keyboards, tools — everything is optimized for human bodies with two arms and two legs. A humanoid robot can operate in existing environments without requiring infrastructure changes.


Types of Humanoid Robots

Not all humanoids are created equal. Currently, there are 3 main types:

  • Full bipedal humanoid: Walks on two legs, most human-like. Examples: Tesla Optimus, Boston Dynamics Atlas, Unitree H1. Advantage: can go anywhere humans can walk. Disadvantage: balance is hard and power consumption is high.
  • Wheeled humanoid: Human upper body (torso + 2 arms) but moves via wheels. Examples: Figure 02 (factory version), 1X NEO Beta. Advantage: stable, energy-efficient, faster movement. Disadvantage: cannot climb stairs.
  • Upper-body humanoid: Only upper body, mounted on fixed platform or mobile base. Common in research and manufacturing.

Primary Applications

Humanoid robots target 4 major sectors:

  1. Manufacturing: Working on production lines alongside humans, especially repetitive or dangerous tasks
  2. Logistics & Warehouse: Pick-and-place, goods transport in warehouses
  3. Healthcare: Patient assistance, medical supply transport in hospitals
  4. Household: Long-term goal — household helpers, elderly care

Leading Humanoid Platforms 2024-2025

The humanoid race is hotter than ever. These are the most notable players.

Tesla Optimus Gen 2

Elon Musk is betting big on humanoids. Optimus Gen 2 (launched late 2024) shows significant improvements over Gen 1:

  • Height: 1.73m, Weight: 57kg (10kg lighter than Gen 1)
  • Actuators: Tesla-designed, 11 DOF per arm, walking speed 1.6 m/s
  • Approach: End-to-end neural networks — like Tesla's Full Self-Driving approach. Input is cameras + sensors, output is motor commands, no intermediate processing
  • Training: Uses Dojo supercomputer infrastructure, trained on thousands of hours of human teleoperation

Tesla's strength: they have factories for immediate deployment (Gigafactory) and massive AI training infrastructure. Weakness: hardware not as mature as robotics specialists.

Figure 02

Figure AI shocked the market with its OpenAI partnership and investments from Jeff Bezos and Microsoft. Figure 02 is the second generation:

  • Manipulation: Can grasp soft objects (clothes, food) using tactile sensing on fingers
  • AI Integration: Uses OpenAI's multimodal models to understand natural language commands and reason about environments
  • Real-world deployment: Pilot at BMW factory in Spartanburg, South Carolina
  • Speech: Can engage in natural conversation, explain actions being performed

Figure 02's demo video — chatting with a human and handing over an apple on request — went viral, and it's not CGI.

Unitree H1 and G1

While Tesla and Figure target enterprise, Unitree (China) is democratizing humanoid robotics:

  • H1: 1.8m tall, 47kg, priced around $90,000 — 5-10x cheaper than competitors
  • G1: Smaller (1.3m), priced from $16,000 — accessible to university labs
  • Open platform: ROS 2 support, open SDK, growing developer community
  • Performance: H1 achieves 3.3 m/s running speed (record for full-size humanoid)

Unitree is doing for humanoids what DJI did for drones — making cutting-edge technology accessible.

Boston Dynamics Atlas (Electric)

Boston Dynamics retired the legendary hydraulic Atlas and launched an entirely new electric Atlas in April 2024:

  • Transition: From hydraulic to electric actuators — lighter, quieter, more efficient
  • Range of motion: 360-degree rotating joints, can perform movements humans cannot
  • Commercial focus: First time Boston Dynamics targets commercialization (previously research-focused)
  • Partner: Hyundai (parent company) will be the first customer for manufacturing

1X NEO

Norwegian startup 1X Technologies takes a different approach — focusing on household applications:

  • Design: Softer appearance, less "mechanical", safer around people
  • Use cases: Cleaning, laundry folding, daily living assistance
  • Approach: Learned behaviors through imitation learning — humans demonstrate actions, robot learns
  • Backed by: OpenAI Fund, Tiger Global


Core Technical Challenges

This is where it gets interesting for engineers. Humanoid robots represent an incredibly complex integration challenge — combining mechanics, electronics, and AI at the highest level.

Bipedal Locomotion

Walking on two legs sounds simple but is one of robotics' hardest problems. Humans take ~1 year to learn to walk, and our brains handle this unconsciously with millions of neurons.

Traditional approaches:

  • ZMP (Zero Moment Point): Maintain ground pressure point within the support polygon (contact area between foot and ground). This is how Honda ASIMO walks — stable but slow and rigid.
  • Inverted Pendulum Model: Model the robot as an inverted pendulum. Linear Inverted Pendulum Model (LIPM) simplifies dynamics to 2D, sufficient for center of mass trajectory calculation.

# Simplified LIPM - calculate CoM position over time
import numpy as np

g = 9.81  # gravity
z_c = 0.8  # CoM height (meters)
omega = np.sqrt(g / z_c)  # natural frequency

def lipm_trajectory(x0, xdot0, t):
    """Linear Inverted Pendulum Model trajectory"""
    x = x0 * np.cosh(omega * t) + (xdot0 / omega) * np.sinh(omega * t)
    xdot = x0 * omega * np.sinh(omega * t) + xdot0 * np.cosh(omega * t)
    return x, xdot
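
One consequence of the closed-form solution above is worth spelling out: cosh and sinh grow exponentially, so any CoM offset from the stance foot diverges unless the robot takes a step. A self-contained sanity check (restating the model so it runs on its own):

```python
import numpy as np

g = 9.81   # gravity (m/s^2)
z_c = 0.8  # constant CoM height (m)
omega = np.sqrt(g / z_c)  # natural frequency

def lipm_trajectory(x0, xdot0, t):
    """Closed-form LIPM solution for the CoM offset from the stance foot."""
    x = x0 * np.cosh(omega * t) + (xdot0 / omega) * np.sinh(omega * t)
    xdot = x0 * omega * np.sinh(omega * t) + xdot0 * np.cosh(omega * t)
    return x, xdot

# A 2 cm initial offset grows quickly if the robot never steps:
for t in (0.0, 0.2, 0.4, 0.6):
    x, _ = lipm_trajectory(0.02, 0.0, t)
    print(f"t={t:.1f}s  x={100 * x:.1f} cm")
```

This divergence is exactly why walking controllers plan foot placements: each step moves the support point back under the CoM before the error blows up.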

Modern approach — Reinforcement Learning:

The new trend is using RL to learn walking policies end-to-end. Robots train in simulation (usually Isaac Sim or MuJoCo), then transfer to real world — a technique called sim-to-real transfer.

# Example reward function for bipedal walking (simplified)
import numpy as np

def compute_reward(robot_state):
    # Reward forward velocity
    forward_vel_reward = robot_state.linear_velocity[0] * 2.0
    
    # Penalize energy consumption
    energy_penalty = -0.01 * np.sum(np.abs(robot_state.joint_torques))
    
    # Penalize excessive body tilt
    orientation_penalty = -5.0 * np.abs(robot_state.body_roll)
    
    # Reward staying alive (not falling)
    alive_bonus = 1.0
    
    return forward_vel_reward + energy_penalty + orientation_penalty + alive_bonus

Agility Robotics (Digit) and Unitree both use RL for locomotion. Results: robots walk more naturally, recover from pushes, and adapt to uneven terrain.

Whole-Body Control

When a robot walks and manipulates simultaneously, you cannot control legs and arms independently — everything affects dynamics. Whole-body control solves this.

Task-space control: Instead of controlling individual joints, define tasks in Cartesian space (e.g., "right arm holds a cup at position xyz") and solvers find optimal joint angles.

Multi-contact dynamics: When the robot contacts multiple points (2 feet + 1 arm touching a table), calculate contact forces satisfying friction cone constraints. This is a quadratic programming (QP) problem solved in real-time at 1kHz.

# Conceptual whole-body QP formulation
# minimize: ||J_task @ qddot - task_acceleration||^2 + w * ||qddot||^2
# subject to:
#   M @ qddot + C @ qdot + g = S.T @ tau + J_contact.T @ f_contact
#   friction_cone(f_contact)  # Coulomb friction
#   joint_limits(q, qdot, qddot)
#   torque_limits(tau)
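
The unconstrained core of that objective has a closed-form solution — damped least squares — which a few lines of NumPy can illustrate. The Jacobian and weights below are toy numbers invented for this sketch; a real whole-body controller keeps all the constraints listed above and hands the problem to a dedicated QP solver at ~1 kHz:

```python
import numpy as np

# Toy setup: 3-DOF arm, 2D task-space acceleration target.
J_task = np.array([[0.3, 0.2, 0.1],
                   [0.0, 0.4, 0.2]])   # task Jacobian (2x3)
a_task = np.array([1.0, -0.5])         # desired task acceleration
w = 1e-3                               # regularization weight

# Unconstrained minimizer of ||J qdd - a||^2 + w ||qdd||^2:
#   qdd = (J^T J + w I)^-1 J^T a   (damped least squares)
n = J_task.shape[1]
qddot = np.linalg.solve(J_task.T @ J_task + w * np.eye(n),
                        J_task.T @ a_task)
print(qddot)  # joint accelerations that realize the task acceleration
```

The regularization term keeps the solution well-behaved near singular configurations; once the dynamics, friction-cone, and limit constraints are added, the problem is no longer closed-form and becomes the real-time QP described above.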

Dexterous Manipulation

Grasping objects is a major challenge. The human hand has 27 degrees of freedom and thousands of tactile receptors. Robotic hands are approaching this capability:

  • Tactile sensing: Sensors on fingertips measure force and object shape during contact. GelSight (MIT) uses cameras inside soft gel layers to "see" through contact — extremely high resolution.
  • Force control: Control force instead of position — critical when holding eggs (gentle) vs holding hammers (firm). Impedance control lets robots be "soft" when needed and "stiff" when needed.
  • In-hand manipulation: Rotating objects in hand without putting them down. OpenAI once demonstrated Shadow Hand rotating a Rubik's cube using RL — a significant milestone.
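
The impedance-control idea above boils down to a virtual spring-damper: the commanded force is F = K(x_d − x) + D(ẋ_d − ẋ), where small gains make the hand compliant and large gains make it stiff. A minimal 1-D sketch with made-up gain values:

```python
def impedance_force(x, xdot, x_des, xdot_des, K, D):
    """Virtual spring-damper: force pulling the hand toward the desired pose."""
    return K * (x_des - x) + D * (xdot_des - xdot)

# Same 1 cm position error, very different corrective force:
soft  = impedance_force(x=0.01, xdot=0.0, x_des=0.0, xdot_des=0.0,
                        K=50.0, D=5.0)      # "soft" grasp (egg)
stiff = impedance_force(x=0.01, xdot=0.0, x_des=0.0, xdot_des=0.0,
                        K=2000.0, D=50.0)   # "stiff" grasp (hammer)
print(soft, stiff)  # → -0.5 -20.0
```

Because the robot commands force rather than position, a compliant setting lets the hand yield to contact instead of crushing whatever it touches.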

Wheeled Humanoid — Pragmatic Solution

Two-legged walking isn't always optimal. In warehouses and factories:

  • Flat floors, no stairs to climb
  • Need fast, stable movement
  • Battery life matters (wheels are 3-5x more energy-efficient than bipedal walking)

Wheeled humanoids combine humanoid's flexible upper body (dual-arm manipulation) with mobile robot's efficient lower body. This may be the winning form factor for industrial applications before bipedal humanoids mature.

Software Stack

If you want to start with humanoid robotics, this is the most common software stack.

ROS 2 + MoveIt2

ROS 2 (Robot Operating System) remains the de facto standard for robotics. MoveIt2 is the motion planning framework:

# Install MoveIt2 for ROS 2 Humble
sudo apt install ros-humble-moveit

# Launch motion planning demo with humanoid robot
ros2 launch humanoid_moveit_config demo.launch.py

MoveIt2 provides:

  • Motion planning: OMPL, STOMP, Pilz industrial motion planner
  • Collision checking: FCL-based, real-time
  • Kinematics: Inverse kinematics solvers (KDL, TRAC-IK, BioIK)
  • Perception integration: Point clouds + octomap for obstacle avoidance

NVIDIA Isaac Sim — Sim-to-Real

Isaac Sim is a game-changer for humanoid development:

  • Physics: PhysX 5 engine, accurate rigid body + soft body + fluid simulation
  • Rendering: Photorealistic — critical for vision-based policies
  • Domain randomization: Automatically vary lighting, textures, physics parameters for robust policies
  • Massive parallelism: Train thousands of robot instances simultaneously on GPU

# Isaac Sim - create humanoid training environment (conceptual)
from omni.isaac.lab.envs import ManagerBasedRLEnv

# HumanoidWalkingEnvCfg is a user-written env config class (not shown here)
env_cfg = HumanoidWalkingEnvCfg()
env_cfg.scene.num_envs = 4096  # 4096 robots in parallel
env_cfg.rewards.forward_velocity.weight = 2.0
env_cfg.rewards.energy_consumption.weight = -0.01

env = ManagerBasedRLEnv(cfg=env_cfg)
# Train with PPO in a few hours → robot learns to walk
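
Domain randomization itself needs nothing Isaac-specific: per environment instance, sample physics parameters from ranges around their nominal values so the policy cannot overfit to one simulator configuration. The parameter names and ranges below are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)
num_envs = 4096

# One sampled value per parallel environment (illustrative ranges):
randomized = {
    "friction":       rng.uniform(0.5, 1.25, num_envs),   # ground friction
    "mass_scale":     rng.uniform(0.9, 1.1, num_envs),    # link-mass multiplier
    "motor_strength": rng.uniform(0.85, 1.15, num_envs),  # actuator gain
    "push_force":     rng.uniform(0.0, 50.0, num_envs),   # external pushes (N)
}

# Each of the 4096 robots trains in a slightly different world, so the
# policy must work across the whole distribution — which is what lets it
# transfer to the one real world.
print({k: (round(v.min(), 2), round(v.max(), 2)) for k, v in randomized.items()})
```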

Foundation Models for Manipulation

Instead of training models for each task, foundation models enable robots to learn manipulation from diverse data:

  • RT-2 (Google DeepMind): Vision-Language-Action model — takes image + text command, outputs robot actions. Generalizes to unseen objects.
  • Octo: Open-source generalist robot policy from UC Berkeley. Trained on Open X-Embodiment dataset (800K+ episodes from 22 robot types).
  • π₀ (Physical Intelligence): Latest foundation model trained on Internet-scale data + robot data. Can fold laundry, bus tables — complex tasks requiring reasoning.

# Use Octo for manipulation (simplified)
import jax

from octo.model.octo_model import OctoModel

model = OctoModel.load_pretrained("hf://rail-berkeley/octo-base-1.5")

# camera_image: RGB frame from the robot's primary camera
# Predict action from observation + language instruction
task = model.create_tasks(texts=["Pick up the red cup"])
action = model.sample_actions(
    observations={"image_primary": camera_image},
    tasks=task,
    rng=jax.random.PRNGKey(0)
)
# action = [dx, dy, dz, droll, dpitch, dyaw, gripper]


Vietnam Market: Opportunities and Challenges

Opportunities

Vietnam occupies an interesting position in the humanoid revolution:

  • Manufacturing hub: With Samsung, Intel, LG, Foxconn facilities in Vietnam, automation demand is huge. Humanoid robots can work on existing production lines without redesign.
  • Rising labor costs: Worker wages increase 7-10% annually, creating economic incentive for automation.
  • Logistics boom: Strong e-commerce growth (Shopee, TikTok Shop), warehouse automation is pressing.
  • Aging population: Declining birth rate means Vietnam will face a serious labor shortage post-2030. Healthcare robots will be a necessity, not a luxury.

Challenges

  • Capital investment: A humanoid robot costs $50K-500K, beyond most Vietnamese SMEs' budgets.
  • Infrastructure: Many factories are not yet digitalized and lack the IT systems needed for complex robot integration.
  • Workforce: Shortage of deep robotics expertise. Vietnamese universities lack comprehensive humanoid robotics programs.
  • Regulation: No legal framework yet for robots working alongside humans (FDI factories treat robot safety standards as mandatory requirements).

Near-term Application Potential (2025-2028)

  1. Wheeled humanoid in warehouses: Pick-and-place for e-commerce fulfillment. Fastest ROI.
  2. Upper-body humanoid in manufacturing: Mounted on mobile base, performing assembly tasks.
  3. Service robots in hospitality: Hotels, restaurants — Vietnam has a large tourism sector.

Long-term (2028+), as humanoid prices drop below $20K (Unitree is pursuing this), expect humanoid adoption in healthcare and households.

Next Steps

Humanoid robotics is at an inflection point — like smartphones in 2007. The technology is mature enough, prices are dropping fast, and use cases are becoming clear. In 5 years, humanoid robots will transition from cool demos to real production tools.


Humanoid robots are more than machines — they're the physical interface between AI and the real world. And the game is just beginning.


Nguyễn Anh Tuấn

Robotics & AI Engineer. Building VnRobo — sharing knowledge about robot learning, VLA models, and automation.
