
Whole-Body MPC: Real-Time Full-Body Control

Whole-body Model Predictive Control with MuJoCo and iLQR — from contact dynamics theory to real-time deployment on humanoid robots.

Nguyen Anh Tuan · February 15, 2026 · 6 min read

Why MPC for Humanoids?

In Part 2, we covered IK, operational space control, and QP-based whole-body control. But these methods are all reactive — they only look at the current state to decide what to do.

Model Predictive Control (MPC) is different: it looks into the future. MPC solves an optimization problem over a horizon (e.g., 0.5-1 s), computes the best sequence of actions, executes only the first step, then repeats. This is especially important for humanoids because:

  1. Contact planning: MPC can plan when and where to place and lift each foot
  2. Balance recovery: looking ahead lets the controller avert a loss of balance before it happens
  3. Whole-body coordination: locomotion and manipulation are optimized simultaneously

The most important recent paper is "Whole-Body Model-Predictive Control of Legged Robots with MuJoCo" (arXiv:2503.04613) — from John Z. Zhang and Guanya Shi (CMU) together with Taylor Howell and Yuval Tassa (Google DeepMind). It demonstrates that MPC with MuJoCo can run in real time on real humanoids with zero-shot sim-to-real transfer.

Model Predictive Control planning horizon for humanoid locomotion

MPC — Basic Theory

Optimization Problem

MPC solves this problem at each timestep:

minimize: sum_{t=0}^{T-1} [ l(x_t, u_t) ] + l_T(x_T)
subject to:
    x_{t+1} = f(x_t, u_t)       (dynamics)
    x_0 = x_current              (initial state)
    u_min <= u_t <= u_max        (actuator limits)
    h(x_t, u_t) >= 0             (constraints: friction, joint limits, ...)

Where:

  - x_t is the state (joint positions and velocities) and u_t the control (e.g., joint torques) at step t
  - l is the running cost and l_T the terminal cost
  - f is the (discretized) dynamics
  - T is the horizon length

Receding Horizon

MPC repeats every control cycle (e.g., every 1-10 ms):

  1. Measure current state x_0
  2. Solve optimization problem over horizon T
  3. Execute only u_0 (first step)
  4. Return to step 1

This creates a natural feedback loop — if disturbed (e.g., pushed), MPC automatically adjusts the plan.

iLQR — Optimization Algorithm for MPC

The iterative Linear Quadratic Regulator (iLQR) is the most commonly used algorithm for MPC on humanoids. It is the nonlinear extension of LQR.

iLQR Algorithm — Iterative Linear Quadratic Regulator

iLQR approximates nonlinear problems by:

  1. Forward pass: Simulate trajectory with current control
  2. Backward pass: Linearize dynamics around trajectory, solve Riccati for feedback gains
  3. Forward pass: Update trajectory with new gains
  4. Repeat until convergence

The beauty of iLQR is that it naturally handles contact-implicit control — the forward simulation automatically computes contact dynamics (using MuJoCo's convex optimization solver), and the backward pass derives optimal gains that respect these constraints.

import numpy as np
import mujoco

class iLQR_MPC:
    def __init__(self, model, horizon=50):
        self.model = model
        self.T = horizon
        self.nx = model.nq + model.nv  # state dimension
        self.nu = model.nu             # control dimension
        self.data = mujoco.MjData(model)  # reusable simulation scratchpad

    def rollout(self, x0, U):
        """Forward-simulate the control sequence U from state x0.

        The simulation timestep is model.opt.timestep.
        """
        data = self.data
        X = [x0.copy()]
        for t in range(self.T):
            data.qpos[:] = X[-1][:self.model.nq]
            data.qvel[:] = X[-1][self.model.nq:]
            data.ctrl[:] = U[t]
            mujoco.mj_step(self.model, data)
            X.append(np.concatenate([data.qpos.copy(), data.qvel.copy()]))
        return X

    def solve(self, x0, x_ref, U_init, Q, R, Qf, max_iter=10):
        """Run (simplified) iLQR: only the rollout and the convergence
        check are shown; the backward pass is sketched as a comment."""
        self.x_ref = x_ref
        U = [u.copy() for u in U_init]
        prev_cost = np.inf
        for iteration in range(max_iter):
            # Forward rollout
            X = self.rollout(x0, U)

            # Compute cost
            cost = self._compute_cost(X, U, Q, R, Qf)
            if cost >= prev_cost:
                break  # no further improvement
            prev_cost = cost

            # Backward pass would go here: linearize the dynamics along
            # (X, U), run the Riccati recursion for feedback gains, then
            # re-roll the trajectory with the updated controls.

        return U[0], X  # first control and predicted trajectory

    def _compute_cost(self, X, U, Q, R, Qf):
        """Quadratic tracking cost around x_ref."""
        cost = 0.0
        for t in range(self.T):
            x_err = X[t] - self.x_ref
            cost += x_err @ Q @ x_err + U[t] @ R @ U[t]
        x_err = X[-1] - self.x_ref
        return cost + x_err @ Qf @ x_err
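The backward pass elided in the class above is a time-varying LQR (Riccati) recursion over the linearized dynamics. Here is a minimal sketch on a double integrator; in the full pipeline, A and B would come from finite-differencing MuJoCo and the gains would be recomputed every iLQR iteration:

```python
import numpy as np

def riccati_backward_pass(A, B, Q, R, Qf, T):
    """Compute time-varying feedback gains K_t for u_t = K_t @ x_t."""
    P = Qf
    gains = []
    for _ in range(T):
        # K = -(R + B'PB)^{-1} B'PA
        K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Riccati update: P = Q + A'PA + A'PB K
        P = Q + A.T @ P @ A + A.T @ P @ B @ K
        gains.append(K)
    return gains[::-1]  # gains[t] corresponds to timestep t

# Double integrator: state = (position, velocity), control = acceleration
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q, R, Qf = np.eye(2), 0.01 * np.eye(1), 10.0 * np.eye(2)

K = riccati_backward_pass(A, B, Q, R, Qf, T=50)
x = np.array([1.0, 0.0])           # start 1 m from the target
for t in range(50):
    x = A @ x + B @ (K[t] @ x)     # closed-loop step
# x is now driven close to the origin
```

In the iLQR setting, this same recursion runs around the current trajectory rather than a fixed equilibrium, so A, B (and the cost expansions) vary with t.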

MuJoCo MPC — Official Implementation

Paper arXiv:2503.04613 uses MuJoCo as the dynamics model directly in the iLQR loop. Key features:

  1. Finite-difference derivatives: instead of analytical derivatives (which are complicated by contact), differentiate the MuJoCo simulator itself with finite differences. This is practical because individual MuJoCo steps are cheap.
  2. Contact-implicit: No explicit contact schedule needed — MuJoCo automatically handles contact in forward simulation.
  3. Real-time: Achieves ~100 Hz control rate on real hardware (Unitree H1, quadrupeds).
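The finite-difference idea is easy to sketch in isolation. Below, a hypothetical damped-pendulum step stands in for one MuJoCo step, and central differences recover the linearized dynamics A = df/dx, B = df/du that the backward pass needs (MuJoCo's Python bindings provide the same computation for full models via mujoco.mjd_transitionFD):

```python
import numpy as np

def pendulum_step(x, u, dt=0.01):
    """Hypothetical damped pendulum, standing in for mj_step."""
    theta, omega = x
    alpha = -9.81 * np.sin(theta) - 0.1 * omega + u[0]
    return np.array([theta + dt * omega, omega + dt * alpha])

def linearize_fd(f, x, u, eps=1e-6):
    """Central-difference Jacobians A = df/dx, B = df/du."""
    nx, nu = x.size, u.size
    A = np.zeros((nx, nx))
    B = np.zeros((nx, nu))
    for i in range(nx):
        d = np.zeros(nx); d[i] = eps
        A[:, i] = (f(x + d, u) - f(x - d, u)) / (2 * eps)
    for i in range(nu):
        d = np.zeros(nu); d[i] = eps
        B[:, i] = (f(x, u + d) - f(x, u - d)) / (2 * eps)
    return A, B

A, B = linearize_fd(pendulum_step, np.array([0.3, 0.0]), np.array([0.0]))
# A[0, 1] = dt (position picks up velocity); B[1, 0] = dt (control enters velocity)
```

With a simulator in the loop, f would be one `mj_step` call; the cost of linearization is then 2(nx + nu) simulator steps per knot point.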

Official code: github.com/johnzhang3/mujoco_mpc_deploy

Contact Dynamics — The Biggest Challenge

Contact (between the feet and the ground) is the hardest part of humanoid dynamics. Each contact can be in one of three modes:

  1. No contact: Foot in air, no contact forces
  2. Sliding contact: Foot slides on ground, friction
  3. Sticking contact: Foot adheres to ground, static friction

Friction Cone

Contact forces are bounded by friction cone:

||f_tangential|| <= mu * f_normal

In MPC, the friction cone is approximated as a pyramid to convert it to linear constraints.
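A common choice is a four-sided pyramid: the quadratic cone constraint becomes four linear inequalities on the contact force f = (f_x, f_y, f_z). A small sketch (note this outer pyramid slightly over-approximates the cone at its corners; a conservative inscribed pyramid would scale mu by 1/sqrt(2)):

```python
import numpy as np

def friction_pyramid(mu):
    """Rows A such that A @ f <= 0 encodes |f_x| <= mu*f_z and |f_y| <= mu*f_z
    (which also implies f_z >= 0)."""
    return np.array([
        [ 1.0,  0.0, -mu],
        [-1.0,  0.0, -mu],
        [ 0.0,  1.0, -mu],
        [ 0.0, -1.0, -mu],
    ])

def satisfies_pyramid(f, mu):
    return bool(np.all(friction_pyramid(mu) @ f <= 0.0))

mu = 0.7
ok  = satisfies_pyramid(np.array([0.5, 0.0, 1.0]), mu)  # within the cone
bad = satisfies_pyramid(np.array([1.0, 0.0, 1.0]), mu)  # would slip
```

Stacking these rows for every active contact yields the linear inequality constraints h(x, u) >= 0 in the MPC formulation above.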

Contact dynamics and friction cone in humanoid locomotion

Real-Time Constraints

To run MPC real-time on humanoids (100-1000 Hz control loop):

1. Reduce Horizon

Instead of 100 steps, use 20-50 steps with larger timesteps. The trade-off: you keep roughly the same look-ahead time but at a coarser resolution, which costs prediction accuracy.

2. Warm-starting

Use the previous solution as starting point for the current one. Since state doesn't change much between cycles, warm-start helps converge faster (usually 1-3 iLQR iterations instead of 10+).

def mpc_loop(model, x_ref, Q, R, Qf):
    """MPC loop with warm-starting (get_state/apply_torque are
    placeholders for the robot's sensor and actuator interface)."""
    T = 30  # horizon
    mpc = iLQR_MPC(model, horizon=T)  # build once, reuse every cycle
    U_prev = [np.zeros(model.nu) for _ in range(T)]
    
    while True:
        x_current = get_state()  # Read from sensors
        
        # Warm-start: shift the previous solution one step,
        # repeating the last control to fill out the horizon
        U_init = U_prev[1:] + [U_prev[-1]]
        
        # Solve MPC (only a few iterations thanks to the warm start)
        u_opt, X_pred = mpc.solve(x_current, x_ref, U_init, Q, R, Qf,
                                  max_iter=3)
        
        # Apply control
        apply_torque(u_opt)
        
        # Save the shifted sequence for the next warm start (ideally,
        # solve() would return the full optimized sequence instead)
        U_prev = U_init

3. GPU Acceleration

MuJoCo MJX (JAX backend) and MJX-Warp (NVIDIA GPU) run dynamics on GPU. Especially useful for sampling-based MPC (e.g., MPPI) — sample thousands of trajectories in parallel.
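To make the sampling idea concrete, here is a minimal MPPI-style update on a toy 1-D point mass (the names and the toy dynamics are illustrative, not from the paper): perturb the nominal control sequence, score every rollout, and blend the perturbations with softmax weights on negative cost. On a GPU backend like MJX, the per-sample rollouts would run in parallel instead of in a Python loop.

```python
import numpy as np

def rollout_cost(x0, U, dt=0.1):
    """Toy point mass: state = (position, velocity), control = acceleration."""
    x, v = x0
    cost = 0.0
    for u in U[:, 0]:
        x, v = x + dt * v, v + dt * u
        cost += x * x + 0.01 * u * u   # stay at the origin, cheap control
    return cost

def mppi_update(U, x0, n_samples=256, sigma=0.5, lam=1.0, seed=0):
    """One MPPI iteration: sample, score, softmax-blend the perturbations."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n_samples,) + U.shape)
    costs = np.array([rollout_cost(x0, U + noise[k]) for k in range(n_samples)])
    w = np.exp(-(costs - costs.min()) / lam)   # low cost -> high weight
    w /= w.sum()
    return U + np.tensordot(w, noise, axes=1)  # weighted average of perturbations

x0 = np.array([1.0, 0.0])
U = np.zeros((20, 1))
before = rollout_cost(x0, U)
for i in range(10):
    U = mppi_update(U, x0, seed=i)   # fresh noise each iteration
after = rollout_cost(x0, U)          # drops as samples discover braking controls
```

Unlike iLQR, this needs no derivatives at all, which is why it pairs so well with massively parallel simulation.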

MPC vs QP vs RL

| Criterion | QP (Part 2) | MPC (iLQR) | RL (Part 4) |
|---|---|---|---|
| Look-ahead | No (reactive) | Yes (0.5-1 s horizon) | Yes (learned) |
| Contact planning | Needs explicit schedule | Contact-implicit | Learned |
| Computation | ~1 ms | ~5-50 ms | ~0.1 ms (inference) |
| Robustness | Medium | Good (replans) | Best |
| Domain transfer | Good (model-based) | Good (model-based) | Needs sim-to-real |
| Dexterity | Good | Good | Best |
| Setup effort | Medium | High | High (training) |

Hybrid MPC + RL

The latest trend is to combine the two. For example: RL for locomotion and MPC for fine manipulation, or RL providing the warm start for MPC.
