Tags: humanoid, unitree-h1, running, dynamic-locomotion, rl

Unitree H1: Running, Turning & Dynamic Motions

Extend H1 policy to running at 2+ m/s, sharp turning, lateral walking, and a multi-gait command-conditioned policy.

Nguyễn Anh Tuấn · March 28, 2026 · 8 min read

Walking is the foundation, but a truly useful humanoid needs much more. In this post, we push the Unitree H1 beyond walking to achieve running (2+ m/s), sharp turning, lateral walking, backward walking, and a command-conditioned policy that flexibly switches between gaits.

Building on the H1 basic training post, this post focuses on dynamic motions — where physics and control become truly interesting.

From Walking to Running: What Changes?

Physics of Running

Walking and running differ fundamentally in the flight phase:

| Characteristic | Walking | Running |
|---|---|---|
| Flight phase | None (at least one foot on the ground) | Yes (both feet leave the ground) |
| Ground contact | ~60% of cycle per foot | ~30-40% of cycle per foot |
| Peak force | ~1.2x body weight | ~2.5-3x body weight |
| Speed | 0-1.5 m/s | 1.5-3.3+ m/s |
| Froude number | < 1 | > 1 |
| Energy mode | Inverted pendulum | Spring-mass |
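The Froude number row can be made concrete with a few lines of Python. This is a minimal sketch; the 0.9 m effective leg length is an illustrative assumption, not an official H1 spec:

```python
def froude_number(speed: float, leg_length: float = 0.9, g: float = 9.81) -> float:
    """Fr = v^2 / (g * L). Fr < 1 suggests a walking regime, Fr > 1 running."""
    return speed ** 2 / (g * leg_length)

print(froude_number(1.0))  # ≈ 0.11 -> walking regime
print(froude_number(3.0))  # ≈ 1.02 -> past the walk/run transition
```

This is why the reward function below switches regimes around 1.5-2.0 m/s: that is roughly where Fr crosses 1 for a robot of this leg length.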
import torch
import numpy as np

class RunningRewardExtension:
    """
    Extended reward function for running gaits.
    """

    def compute_running_rewards(self, state, command):
        rewards = {}

        # 1. Extended velocity tracking range
        vel_error = torch.sum(
            torch.square(command[:, :2] - state["base_lin_vel"][:, :2]),
            dim=1
        )
        sigma = torch.where(
            torch.abs(command[:, 0]) > 1.5,
            torch.tensor(0.5, device=command.device),   # Running: larger tolerance
            torch.tensor(0.25, device=command.device),  # Walking: tight tolerance
        )
        rewards["vel_tracking"] = torch.exp(-vel_error / sigma)

        # 2. Flight phase reward
        both_feet_air = (
            (state["foot_contact"][:, 0] < 0.5) &
            (state["foot_contact"][:, 1] < 0.5)
        ).float()
        running_command = (torch.abs(command[:, 0]) > 2.0).float()
        rewards["flight_phase"] = 0.3 * running_command * both_feet_air

        # 3. Ground reaction force symmetry (per env, vertical component;
        # the original used .max(), which collapses the batch to a scalar)
        left_grf = state["contact_forces"][:, 0, 2]
        right_grf = state["contact_forces"][:, 1, 2]
        grf_symmetry = 1.0 - torch.abs(left_grf - right_grf) / (
            left_grf + right_grf + 1e-6
        )
        rewards["grf_symmetry"] = 0.1 * grf_symmetry

        # 4. Knee bend during stance (spring-like energy storage)
        stance_mask = (state["foot_contact"] > 0.5).float()
        knee_bend = torch.abs(state["joint_pos"][:, [3, 8]])  # left/right knee indices
        running_knee_reward = stance_mask * torch.clamp(
            knee_bend - 0.2, min=0.0
        )
        rewards["knee_bend"] = 0.1 * torch.sum(running_knee_reward, dim=1)

        return rewards
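The speed-dependent tracking tolerance above can be distilled into a plain-Python sketch. For clarity this uses a single velocity component, whereas the class sums the squared error over x and y; the sigmas 0.25 and 0.5 are the same values used in the class:

```python
import math

def velocity_tracking_reward(cmd_vx: float, actual_vx: float) -> float:
    """Gaussian tracking reward with a wider tolerance for running commands,
    mirroring the torch.where sigma selection in RunningRewardExtension."""
    sigma = 0.5 if abs(cmd_vx) > 1.5 else 0.25
    return math.exp(-((cmd_vx - actual_vx) ** 2) / sigma)

# The same 0.3 m/s error is forgiven more at running speed than at walking speed
walk_r = velocity_tracking_reward(1.0, 0.7)
run_r = velocity_tracking_reward(2.5, 2.2)
```

Widening the tolerance at high speed keeps the reward gradient informative: a running policy inevitably has larger instantaneous velocity error, and a tight sigma would flatten the reward to near zero.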

(Figure: dynamic running motion)

Command-Conditioned Multi-Gait Policy

Instead of training multiple separate policies, we train a single policy that can walk, run, turn, and move laterally — depending on the velocity command.

Extended Command Space

class ExtendedVelocityCommand:
    """
    Extended velocity command for multi-gait policy.
    """

    def __init__(self):
        self.ranges = {
            "lin_vel_x": (-0.5, 3.5),    # Backward to Running
            "lin_vel_y": (-0.8, 0.8),     # Lateral walking
            "ang_vel_z": (-1.5, 1.5),     # Sharp turning
        }
        self.curriculum_factor = 0.0

    def sample_command(self, num_envs, device="cuda"):
        """Sample random velocity commands."""
        factor = self.curriculum_factor

        commands = torch.zeros(num_envs, 3, device=device)

        # Upper bound blends from 1.0 m/s (early training) to 3.5 m/s (full curriculum)
        commands[:, 0] = torch.empty(num_envs, device=device).uniform_(
            self.ranges["lin_vel_x"][0] * factor,
            self.ranges["lin_vel_x"][1] * factor + (1 - factor) * 1.0,
        )

        commands[:, 1] = torch.empty(num_envs, device=device).uniform_(
            self.ranges["lin_vel_y"][0] * factor,
            self.ranges["lin_vel_y"][1] * factor,
        )

        commands[:, 2] = torch.empty(num_envs, device=device).uniform_(
            self.ranges["ang_vel_z"][0] * factor,
            self.ranges["ang_vel_z"][1] * factor,
        )

        # 20% chance of zero command (standing)
        zero_mask = torch.rand(num_envs, device=device) < 0.2
        commands[zero_mask] = 0.0

        return commands

    def update_curriculum(self, iteration, total_iterations=10000):
        """Gradually expand command range."""
        self.curriculum_factor = min(iteration / (total_iterations * 0.5), 1.0)

Multi-Gait Reward Function

class MultiGaitReward:
    """
    Speed-adaptive reward function.
    Low speed -> walking rewards. High speed -> running rewards.
    """

    def compute(self, state, action, prev_action, command):
        rewards = {}
        cmd_speed = torch.abs(command[:, 0])

        # === Velocity tracking (universal) ===
        vel_error = torch.sum(
            torch.square(command[:, :2] - state["base_lin_vel"][:, :2]),
            dim=1
        )
        rewards["vel_tracking"] = 1.5 * torch.exp(-vel_error / 0.25)

        # === Yaw tracking ===
        yaw_error = torch.square(command[:, 2] - state["base_ang_vel"][:, 2])
        rewards["yaw_tracking"] = 0.8 * torch.exp(-yaw_error / 0.25)

        # === Speed-adaptive rewards ===
        left_c = state["foot_contact"][:, 0]
        right_c = state["foot_contact"][:, 1]
        both_air = (1 - left_c) * (1 - right_c)

        # Walking (< 1.5 m/s): penalize both feet in air
        walk_mask = (cmd_speed < 1.5).float()
        rewards["walk_contact"] = -0.5 * walk_mask * both_air

        # Running (> 2.0 m/s): reward flight phase
        run_mask = (cmd_speed > 2.0).float()
        rewards["run_flight"] = 0.3 * run_mask * both_air

        # === Adaptive foot clearance ===
        target_clearance = torch.where(
            cmd_speed > 2.0,
            torch.tensor(0.12, device=cmd_speed.device),  # higher swing for running
            torch.tensor(0.08, device=cmd_speed.device),
        )
        swing = state["foot_contact"] < 0.5
        clearance = torch.where(
            swing,
            torch.clamp(state["foot_height"] - target_clearance.unsqueeze(1), min=0.0),
            torch.zeros_like(state["foot_height"]),
        )
        rewards["clearance"] = 0.3 * torch.sum(clearance, dim=1)

        # === Adaptive feet air time ===
        target_air = torch.where(
            cmd_speed > 2.0,
            torch.tensor(0.25, device=cmd_speed.device),  # shorter, faster steps
            torch.tensor(0.35, device=cmd_speed.device),
        )
        air_error = torch.abs(state["feet_air_time"] - target_air.unsqueeze(1))
        rewards["air_time"] = 0.2 * torch.sum(
            torch.exp(-air_error / 0.1), dim=1
        )

        # === Turning rewards ===
        cmd_yaw = torch.abs(command[:, 2])
        turning_mask = (cmd_yaw > 0.5).float()
        vel_x_error_turning = torch.abs(
            state["base_lin_vel"][:, 0] - command[:, 0] * 0.7
        )
        rewards["turning_vel"] = 0.3 * turning_mask * torch.exp(
            -vel_x_error_turning
        )

        # === Regularization ===
        rewards["action_rate"] = -0.01 * torch.sum(
            torch.square(action - prev_action), dim=1
        )
        rewards["torque"] = -3e-5 * torch.sum(
            torch.square(state["torques"]), dim=1
        )
        rewards["upright"] = -1.5 * torch.sum(
            torch.square(state["projected_gravity"][:, :2]), dim=1
        )
        rewards["termination"] = -200.0 * state["terminated"].float()

        total = sum(rewards.values())
        return total, rewards
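Note the deliberate gap between the two contact masks: commands in the 1.5-2.0 m/s band trigger neither the flight-phase penalty nor the flight-phase bonus. A minimal sketch of that band logic:

```python
def gait_masks(cmd_speed: float) -> tuple[float, float]:
    """Walk/run masks as in MultiGaitReward. Between 1.5 and 2.0 m/s both
    masks are zero, so the policy chooses its own walk-to-run transition point."""
    walk_mask = 1.0 if cmd_speed < 1.5 else 0.0
    run_mask = 1.0 if cmd_speed > 2.0 else 0.0
    return walk_mask, run_mask

print(gait_masks(1.0))   # (1.0, 0.0) -> flight phase penalized
print(gait_masks(1.75))  # (0.0, 0.0) -> neutral transition band
print(gait_masks(2.5))   # (0.0, 1.0) -> flight phase rewarded
```

Leaving the transition band unconstrained avoids forcing an arbitrary switch speed; in practice the learned transition tends to land where it is energetically cheapest.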

Sharp Turning

Turning is challenging for humanoids because it requires coordinated hip yaw and lateral weight shift simultaneously:

class TurningAnalysis:
    """Analyze turning performance."""

    def evaluate_turning(self, env, policy, yaw_rates=(0.5, 1.0, 1.5)):
        """Evaluate turning at multiple yaw rates."""

        results = {}
        for target_yaw in yaw_rates:
            # Fixed command: 0.5 m/s forward while yawing at target_yaw.
            # The env must be configured to hold this command fixed
            # (e.g. via its command manager) rather than resampling it.
            command = torch.tensor([[0.5, 0.0, target_yaw]])

            actual_yaws = []
            turning_radii = []
            obs = env.reset()

            for step in range(500):
                action = policy(obs)
                obs, _, done, info = env.step(action)

                actual_yaw = info["base_ang_vel"][:, 2].mean().item()
                actual_yaws.append(actual_yaw)

                linear_vel = info["base_lin_vel"][:, 0].mean().item()
                if abs(actual_yaw) > 0.01:
                    radius = abs(linear_vel / actual_yaw)
                    turning_radii.append(radius)

            results[target_yaw] = {
                "tracking_error": abs(
                    np.mean(actual_yaws[-100:]) - target_yaw
                ),
                "avg_turning_radius": np.mean(turning_radii[-100:])
                    if turning_radii else float('inf'),
                "stability": np.std(actual_yaws[-100:]),
            }

        print(f"{'Target (rad/s)':<16} {'Error':>8} {'Radius (m)':>11} {'Stability':>10}")
        for yaw, r in results.items():
            print(f"{yaw:<16.1f} {r['tracking_error']:>8.3f} "
                  f"{r['avg_turning_radius']:>10.2f}m {r['stability']:>10.4f}")

        return results
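The radii reported by evaluate_turning can be sanity-checked against the kinematic relation r = v / ω, which holds when both commands are tracked exactly on a circular arc:

```python
def kinematic_turning_radius(forward_speed: float, yaw_rate: float) -> float:
    """Expected turning radius (m) when v (m/s) and omega (rad/s) track perfectly."""
    return abs(forward_speed / yaw_rate)

# For the 0.5 m/s forward command used in the evaluation above:
print(kinematic_turning_radius(0.5, 0.5))  # 2.0x slower yaw -> 1.0 m arc
print(kinematic_turning_radius(0.5, 1.5))  # ~0.33 m -> a tight pivot
```

A measured radius well above this value indicates yaw-rate undertracking; well below it indicates the policy is bleeding forward speed to turn.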

Adversarial Motion Prior (AMP)

To achieve natural-looking running gaits, we can use AMP — using motion capture data as a prior:

class AMPReward:
    """
    Adversarial Motion Prior for natural-looking gaits.
    Discriminator distinguishes policy motion vs reference motion.
    """

    def __init__(self, reference_motions, obs_dim):
        import torch.nn as nn

        self.discriminator = nn.Sequential(
            nn.Linear(obs_dim * 2, 1024),
            nn.ReLU(),
            nn.Linear(1024, 512),
            nn.ReLU(),
            nn.Linear(512, 1),
        )

        self.reference_motions = reference_motions
        self.optimizer = torch.optim.Adam(
            self.discriminator.parameters(), lr=1e-5
        )

    def compute_style_reward(self, current_obs, next_obs):
        """
        Compute AMP style reward.
        High reward = policy motion looks like reference.
        Uses the least-squares form from the AMP paper, consistent with
        the LSGAN discriminator loss below: r = max(0, 1 - 0.25 (d - 1)^2).
        """
        transition = torch.cat([current_obs, next_obs], dim=1)

        with torch.no_grad():
            d_output = self.discriminator(transition)

        style_reward = torch.clamp(
            1.0 - 0.25 * torch.square(d_output - 1.0), min=0.0
        ).squeeze(-1)

        return style_reward

    def update_discriminator(self, policy_transitions, reference_transitions):
        """Train discriminator: policy = fake, reference = real."""
        real_output = self.discriminator(reference_transitions)
        fake_output = self.discriminator(policy_transitions.detach())

        real_loss = torch.mean(torch.square(real_output - 1))
        fake_loss = torch.mean(torch.square(fake_output))
        loss = 0.5 * (real_loss + fake_loss)

        # Gradient penalty on interpolated samples (detach the policy branch
        # so the penalty only trains the discriminator)
        alpha = torch.rand(reference_transitions.shape[0], 1,
                           device=reference_transitions.device)
        interp = (alpha * reference_transitions
                  + (1 - alpha) * policy_transitions.detach())
        interp.requires_grad_(True)
        interp_output = self.discriminator(interp)
        grad = torch.autograd.grad(
            interp_output, interp,
            grad_outputs=torch.ones_like(interp_output),
            create_graph=True
        )[0]
        gp = torch.mean(torch.square(grad.norm(dim=1) - 1))
        loss += 10.0 * gp

        self.optimizer.zero_grad()
        loss.backward()
        self.optimizer.step()

        return loss.item()
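In training, the style reward is blended with the task (velocity-tracking) reward at every step. The 50/50 split below is an illustrative starting point, not a value prescribed by the AMP paper:

```python
def blended_reward(task_reward: float, style_reward: float,
                   w_task: float = 0.5, w_style: float = 0.5) -> float:
    """Per-step PPO reward: task tracking plus AMP style term."""
    return w_task * task_reward + w_style * style_reward

# A step that tracks velocity well but moves unnaturally
# still gets pulled down by the style term
r = blended_reward(task_reward=1.2, style_reward=0.1)
```

In practice w_style is tuned so the style term shapes the gait without overpowering command tracking; too large and the policy imitates the mocap clip instead of following commands.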

(Figure: robot dynamic motion)

Emergent Behaviors

With a multi-gait policy, the robot often develops emergent behaviors — not directly designed but naturally appearing:

| Behavior | Trigger condition | Explanation |
|---|---|---|
| Arm swing | Running > 2 m/s | Policy uses arm momentum for balance |
| Head bob | Walking 0.5-1.0 m/s | Natural vertical oscillation |
| Foot rotation | Sharp turning | Pivot foot rotates to reduce friction |
| Stance widening | Lateral walking | Wider stance for stability |
| Deep knee bend | Running acceleration | Energy storage during stance phase |

Full Training Config

# Multi-gait training — ~3h on RTX 4090
python source/standalone/workflows/rsl_rl/train.py \
    --task=Isaac-Velocity-Rough-H1-MultiGait-v0 \
    --num_envs=4096 \
    --max_iterations=15000 \
    --headless \
    --logger wandb \
    --wandb_project h1-multigait

# Evaluate multi-gait
python source/standalone/workflows/rsl_rl/play.py \
    --task=Isaac-Velocity-Rough-H1-MultiGait-v0 \
    --num_envs=4 \
    --checkpoint=logs/h1_multigait/model_15000.pt

For more on humanoid control methods, see Humanoid Control Methods. For parkour with legged robots, see Parkour Learning.

Summary

In this post, we extended the H1 policy to dynamic motions:

  1. Running gaits with flight phase, GRF management, and elastic knee energy
  2. Command-conditioned policy for walk/run/turn/lateral in a single policy
  3. Sharp turning with coordinated hip yaw and lateral weight shift
  4. AMP for natural-looking gaits using motion capture references
  5. Emergent behaviors like arm swing and stance widening

Next post — Unitree H1-2: Enhanced Locomotion — explores the H1-2 with new hardware and loco-manipulation basics.

References

  1. AMP: Adversarial Motion Priors for Stylized Physics-Based Character Animation — Peng et al., SIGGRAPH 2021
  2. Expressive Whole-Body Control for Humanoid Robots — Cheng et al., RSS 2024
  3. Walk These Ways: Tuning Robot Control for Generalization — Margolis & Agrawal, CoRL 2022
  4. Learning Humanoid Locomotion with Transformers — Radosavovic et al., 2024

Nguyễn Anh Tuấn

Robotics & AI Engineer. Building VnRobo — sharing knowledge about robot learning, VLA models, and automation.
