VnRobo
AboutPricingBlogContact
🇻🇳VISign InStart Free Trial
🇻🇳VI
VnRobo logo

AI infrastructure for next-generation industrial robots.

Product

  • Features
  • Pricing
  • Knowledge Base
  • Services

Company

  • About Us
  • Blog
  • Contact

Legal

  • Privacy Policy
  • Terms of Service

© 2026 VnRobo. All rights reserved.

Made with♥in Vietnam
VnRobo
AboutPricingBlogContact
🇻🇳VISign InStart Free Trial
🇻🇳VI
  1. Home
  2. Blog
  3. Whole-Body MPC: Real-Time Full-Body Control
humanoidhumanoidmpcmujocoilqrcontrol

Whole-Body MPC: Real-Time Full-Body Control

Whole-body Model Predictive Control with MuJoCo and iLQR — from contact dynamics theory to real-time deployment on humanoid robots.

Nguyen Anh TuanFebruary 15, 20266 min readUpdated: Jun 14, 2026
Whole-Body MPC: Real-Time Full-Body Control

Why MPC for Humanoids?

In Part 2, we learned about IK, operational space control, and QP-based whole-body control. But these methods are all reactive — they only look at current state to make decisions.

Model Predictive Control (MPC) is different: it looks into the future. MPC solves an optimization problem over a horizon (e.g., 0.5-1 second), calculates the best sequence of actions, executes the first step, then repeats. This is especially important for humanoids because:

  1. Contact planning: MPC can plan when to place feet, when to lift feet
  2. Balance recovery: Look ahead to prevent balance loss
  3. Whole-body coordination: Optimize locomotion and manipulation simultaneously

The most important recent paper: "Whole-Body Model-Predictive Control of Legged Robots with MuJoCo" (arXiv:2503.04613) — from John Z. Zhang (MIT), Taylor Howell, Yuval Tassa (Google DeepMind), and Guanya Shi (CMU). This paper proves MPC with MuJoCo can run real-time on real humanoids with zero-shot sim-to-real.

Model Predictive Control planning horizon for humanoid locomotion
Model Predictive Control planning horizon for humanoid locomotion

MPC — Basic Theory

Optimization Problem

MPC solves this problem at each timestep:

minimize: sum_{t=0}^{T-1} [ l(x_t, u_t) ] + l_T(x_T)
subject to:
    x_{t+1} = f(x_t, u_t)       (dynamics)
    x_0 = x_current              (initial state)
    u_min <= u_t <= u_max        (actuator limits)
    h(x_t, u_t) >= 0             (constraints: friction, joint limits, ...)

Where:

  • x_t — state (joint positions, velocities)
  • u_t — control (joint torques or accelerations)
  • f — dynamics model
  • l — cost function (tracking error, energy, ...)
  • T — prediction horizon (lookahead steps)

Receding Horizon

MPC repeats every control cycle (e.g., every 1-10 ms):

  1. Measure current state x_0
  2. Solve optimization problem over horizon T
  3. Execute only u_0 (first step)
  4. Return to step 1

This creates a natural feedback loop — if disturbed (e.g., pushed), MPC automatically adjusts the plan.

iLQR — Optimization Algorithm for MPC

iterative Linear Quadratic Regulator (iLQR) is the most commonly used algorithm for MPC on humanoids. It's the nonlinear version of LQR.

iLQR Algorithm — Iterative Linear Quadratic Regulator

iLQR approximates nonlinear problems by:

  1. Forward pass: Simulate trajectory with current control
  2. Backward pass: Linearize dynamics around trajectory, solve Riccati for feedback gains
  3. Forward pass: Update trajectory with new gains
  4. Repeat until convergence

The beauty of iLQR is that it naturally handles contact-implicit control — the forward simulation automatically computes contact dynamics (using MuJoCo's convex optimization solver), and the backward pass derives optimal gains that respect these constraints.

import numpy as np
import mujoco

class iLQR_MPC:
    def __init__(self, model, horizon=50, dt=0.01):
        self.model = model
        self.T = horizon
        self.dt = dt
        self.nx = model.nq + model.nv  # state dimension
        self.nu = model.nu              # control dimension
        
    def rollout(self, x0, U):
        """Forward simulate with control sequence U."""
        data = mujoco.MjData(self.model)
        X = [x0.copy()]
        
        for t in range(self.T):
            data.qpos[:] = X[-1][:self.model.nq]
            data.qvel[:] = X[-1][self.model.nq:]
            data.ctrl[:] = U[t]
            mujoco.mj_step(self.model, data)
            x_next = np.concatenate([data.qpos.copy(), data.qvel.copy()])
            X.append(x_next)
        
        return X
    
    def solve(self, x0, x_ref, U_init, Q, R, Qf, max_iter=10):
        """Run iLQR MPC."""
        self.x_ref = x_ref
        U = [u.copy() for u in U_init]
        
        for iteration in range(max_iter):
            # Forward rollout
            X = self.rollout(x0, U)
            
            # Compute cost
            cost = self._compute_cost(X, U, Q, R, Qf)
            
            # Update (simplified)
            if iteration > 0 and cost >= prev_cost:
                break  # Converged
            
            prev_cost = cost
        
        return U[0], X  # Return first control and trajectory
    
    def _compute_cost(self, X, U, Q, R, Qf):
        """Compute total cost."""
        cost = 0
        for t in range(self.T):
            x_err = X[t] - self.x_ref
            cost += x_err.T @ Q @ x_err
            cost += U[t].T @ R @ U[t]
        x_final_err = X[-1] - self.x_ref
        cost += x_final_err.T @ Qf @ x_final_err
        return cost

MuJoCo MPC — Official Implementation

Paper arXiv:2503.04613 uses MuJoCo as the dynamics model directly in the iLQR loop. Key features:

  1. Finite-difference derivatives: Instead of analytical derivatives (complex with contact), use finite differences on MuJoCo simulator. Fast because MuJoCo steps are very fast.
  2. Contact-implicit: No explicit contact schedule needed — MuJoCo automatically handles contact in forward simulation.
  3. Real-time: Achieves ~100 Hz control rate on real hardware (Unitree H1, quadrupeds).

Official code: github.com/johnzhang3/mujoco_mpc_deploy

Contact Dynamics — The Biggest Challenge

Contact (between feet and ground) is the hardest part of humanoid dynamics. Three states exist:

  1. No contact: Foot in air, no contact forces
  2. Sliding contact: Foot slides on ground, friction
  3. Sticking contact: Foot adheres to ground, static friction

Friction Cone

Contact forces are bounded by friction cone:

||f_tangential|| <= mu * f_normal

In MPC, the friction cone is approximated as a pyramid to convert it to linear constraints.

Contact dynamics and friction cone in humanoid locomotion
Contact dynamics and friction cone in humanoid locomotion

Real-Time Constraints

To run MPC real-time on humanoids (100-1000 Hz control loop):

1. Reduce Horizon

Instead of 100 steps, use 20-50 with larger timesteps. Trade-off: shorter look-ahead but less accuracy.

2. Warm-starting

Use the previous solution as starting point for the current one. Since state doesn't change much between cycles, warm-start helps converge faster (usually 1-3 iLQR iterations instead of 10+).

def mpc_loop(model, x0, x_ref, Q, R, Qf):
    """MPC loop with warm-starting."""
    T = 30  # horizon
    U_prev = [np.zeros(model.nu) for _ in range(T)]
    
    while True:
        x_current = get_state()  # Read from sensors
        
        # Warm-start: shift previous solution
        U_init = U_prev[1:] + [U_prev[-1]]
        
        # Solve MPC (only 2-3 iterations)
        mpc = iLQR_MPC(model, horizon=T)
        u_opt, K = mpc.solve(x_current, x_ref, U_init, Q, R, Qf, max_iter=3)
        
        # Apply control
        apply_torque(u_opt)
        
        # Save for next warm-start
        U_prev = U_init

3. GPU Acceleration

MuJoCo MJX (JAX backend) and MJX-Warp (NVIDIA GPU) run dynamics on GPU. Especially useful for sampling-based MPC (e.g., MPPI) — sample thousands of trajectories in parallel.

MPC vs QP vs RL

Criterion QP (Part 2) MPC (iLQR) RL (Part 4)
Look-ahead No (reactive) Yes (0.5-1s horizon) Yes (learned)
Contact planning Needs explicit schedule Contact-implicit Learned
Computation ~1 ms ~5-50 ms ~0.1 ms (inference)
Robustness Medium Good (replans) Best
Domain transfer Good (model-based) Good (model-based) Needs sim-to-real
Dexterity Good Good Best
Setup effort Medium High High (training)

Hybrid MPC + RL

Latest trend: combine both:

  • RL learns rough policy (robust, fast)
  • MPC refines real-time (precise, optimal)

Example: RL for locomotion, MPC for fine manipulation. Or RL provides warm-start for MPC.

Learning Resources

  • Code: github.com/johnzhang3/mujoco_mpc_deploy — Real-world MPC deployment
  • Paper: arXiv:2503.04613 — Whole-Body MPC with MuJoCo
  • MuJoCo MPC: github.com/google-deepmind/mujoco_mpc — Official framework
  • Tutorial: REx Lab project page

Next in This Series

  • Part 2: Humanoid Control Basics: IK to Task-Space Control
  • Part 4: RL for Humanoid: From Humanoid-Gym to Sim2Real — Reinforcement learning approach
  • Part 5: Loco-Manipulation: Walking and Manipulating Together

Related Articles

  • Simulation for Robotics: MuJoCo vs Isaac Sim vs Gazebo — Simulator comparison
  • Getting Started with MuJoCo — Hands-on MuJoCo tutorial
  • RL for Bipedal Walking — RL for two-legged robots
  • Sim-to-Real Transfer — Moving from simulation to real hardware
NT

Nguyễn Anh Tuấn

Robotics & AI Engineer. Building VnRobo — sharing knowledge about robot learning, VLA models, and automation.

Khám phá VnRobo

Fleet MonitoringROS 2 IntegrationAMR Solutions
humanoid-engineering — Phần 3/6
← Humanoid Control Basics: IK to Task-Space ControlRL for Humanoid: From Humanoid-Gym to Sim2Real →

Related Posts

Deep Dive
Humanoid
humanoidcontrolinverse-kinematicsPart 2
humanoid

Humanoid control cơ bản: IK đến task-space control

Từ Inverse Kinematics đến operational space control -- lý thuyết và code Python cho điều khiển humanoid robot cân bằng và thực hiện task.

2/11/20269 min read
NT
Tutorial
MPC-RL: Huấn luyện Humanoid bằng MPC
humanoidmpcreinforcement-learning
humanoid

MPC-RL: Huấn luyện Humanoid bằng MPC

Hướng dẫn MPC-RL cho humanoid: dùng MPC làm reward trong PPO, cài repo, training, inference và đọc kết quả locomotion, push recovery.

6/13/202614 min read
NT
Deep Dive
Humanoid
humanoidcareermarketPart 6
humanoid

Tương lai Humanoid: Cơ hội cho kỹ sư robotics

Thị trường humanoid robot $38B vào 2035, xu hướng tuyển dụng, kỹ năng cần thiết và lộ trình học cho kỹ sư robotics Việt Nam.

2/27/20269 min read
NT
VnRobo logo

AI infrastructure for next-generation industrial robots.

Product

  • Features
  • Pricing
  • Knowledge Base
  • Services

Company

  • About Us
  • Blog
  • Contact

Legal

  • Privacy Policy
  • Terms of Service

© 2026 VnRobo. All rights reserved.

Made with♥in Vietnam