
Domain Randomization: The Key to Sim-to-Real Transfer

Theory and practice of domain randomization — visual, dynamics, sensor randomization with successful case studies.

Nguyen Anh Tuan · March 26, 2026 · 9 min read

What is Domain Randomization and Why Does It Work?

Domain Randomization (DR) is a technique for training policies on many randomized variations of a simulation rather than a single fixed configuration. The core idea is simple: if a model has seen enough variation in sim, then reality is just one more variation.

Instead of trying to make simulation match reality exactly (which is nearly impossible), DR accepts that sim will never be perfect. But if you train on a wide enough distribution, the policy learns to generalize across it, including to reality.

The original paper by Tobin et al. (2017) -- Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World -- showed that, with sufficient randomization, an object detector trained entirely in simulation could localize real objects accurately enough to support grasping.

Why Does DR Work? A Bayesian Perspective

From a Bayesian viewpoint, DR trains the policy against a prior distribution over environment parameters, so its expected real-world performance is a marginalization over that distribution:

P(policy works in real world) ≈ ∫ P(policy works | env_params) · P(env_params) d(env_params)

When the randomization range is wide enough, P(env_params) covers the real-world parameters, and the policy becomes robust by construction: reality is just one more sample from the distribution it was trained on.
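The integral above can be estimated by Monte Carlo: sample environment parameters from the randomization distribution and average the policy's success. A minimal sketch, where the success model is a made-up stand-in for a real policy (a Gaussian bump peaking at friction 0.8):

```python
import numpy as np

rng = np.random.default_rng(0)

def success_prob(friction: float) -> float:
    """Toy stand-in for P(policy works | env_params):
    this fictional policy handles friction near 0.8 best."""
    return float(np.exp(-4.0 * (friction - 0.8) ** 2))

# Sample env params from the randomization distribution P(env_params)
frictions = rng.uniform(0.3, 1.5, size=10_000)

# Monte Carlo estimate of the marginal success probability
estimate = np.mean([success_prob(f) for f in frictions])
print(f"Estimated P(success) ~ {estimate:.3f}")
```

Widening the sampling range lowers this estimate in sim, but it also widens the set of real-world conditions the policy can survive -- which is exactly the trade-off DR makes.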

[Figure: domain randomization visualization for robot training]

3 Types of Domain Randomization

1. Visual Domain Randomization

Visual DR randomizes everything related to what the camera sees. This is the most important type of DR for vision-based policies.

Parameters to randomize:

| Parameter | Reference range | Rationale |
| --- | --- | --- |
| Light intensity | 200-5000 lux | Real-world lighting changes continuously |
| Light color temperature | 3000-8000 K | From tungsten to daylight |
| Light position | Entire workspace | Shadow patterns change |
| Camera FOV | 55-75 degrees | Lens variations |
| Camera noise sigma | 0.01-0.05 | Sensor noise |
| Texture colors | Full RGB range | Surface appearance varies |
| Background | Random images | Distractor backgrounds |
| Distractor objects | 0-10 objects | Clutter in real environments |

An Isaac Lab-style YAML config (the schema here is illustrative, not Isaac Lab's built-in format):

# config/visual_randomization.yaml
# Isaac Lab visual domain randomization config

visual_randomization:
  lighting:
    intensity:
      distribution: "uniform"
      range: [300.0, 3000.0]
    color_temperature:
      distribution: "uniform"
      range: [3000.0, 7000.0]
    num_random_lights:
      distribution: "uniform_int"
      range: [1, 5]
    position:
      distribution: "uniform"
      range: [[-2.0, -2.0, 1.0], [2.0, 2.0, 3.0]]

  camera:
    fov:
      distribution: "uniform"
      range: [55.0, 75.0]
    position_noise:
      distribution: "gaussian"
      mean: 0.0
      std: 0.02
    rotation_noise_deg:
      distribution: "gaussian"
      mean: 0.0
      std: 3.0

  textures:
    table_color:
      distribution: "uniform"
      range: [[0.2, 0.2, 0.2], [0.9, 0.9, 0.9]]
    object_color:
      distribution: "uniform"
      range: [[0.1, 0.1, 0.1], [1.0, 1.0, 1.0]]
    randomize_background: true

  distractors:
    num_objects:
      distribution: "uniform_int"
      range: [0, 5]
    scale:
      distribution: "uniform"
      range: [0.02, 0.1]
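A config like this becomes concrete per-episode values through a small sampler. The sketch below mirrors the YAML structure as an inline Python dict (Isaac Lab ships its own event/randomization managers; this is a generic standalone version):

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_param(spec: dict):
    """Sample one value from a {distribution, range/mean/std} spec."""
    dist = spec["distribution"]
    if dist == "uniform":
        lo, hi = spec["range"]
        return rng.uniform(lo, hi)  # broadcasts over vector ranges too
    if dist == "uniform_int":
        lo, hi = spec["range"]
        return int(rng.integers(lo, hi, endpoint=True))  # inclusive bounds
    if dist == "gaussian":
        return rng.normal(spec["mean"], spec["std"])
    raise ValueError(f"unknown distribution: {dist}")

# Mirrors the `lighting` section of the YAML above
lighting = {
    "intensity": {"distribution": "uniform", "range": [300.0, 3000.0]},
    "num_random_lights": {"distribution": "uniform_int", "range": [1, 5]},
    "position": {
        "distribution": "uniform",
        "range": [[-2.0, -2.0, 1.0], [2.0, 2.0, 3.0]],
    },
}

sampled = {name: sample_param(spec) for name, spec in lighting.items()}
```

Calling the sampler once per episode reset gives each parallel environment its own lighting draw.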

2. Dynamics Domain Randomization

Dynamics DR randomizes physics parameters -- this is the most important type for control policies since it directly affects robot behavior.

"""
Dynamics Domain Randomization implementation.
Compatible with Isaac Lab ManagerBasedRLEnv.
"""
import numpy as np
from dataclasses import dataclass, field


@dataclass
class DynamicsRandomizationConfig:
    """Dynamics DR config with validated ranges."""

    # Object properties
    mass_scale_range: tuple = (0.5, 2.0)
    friction_range: tuple = (0.3, 1.5)
    restitution_range: tuple = (0.0, 0.5)

    # Robot joint properties
    joint_damping_scale: tuple = (0.8, 1.2)
    joint_friction_scale: tuple = (0.5, 2.0)
    actuator_strength_scale: tuple = (0.85, 1.15)

    # Control latency and noise
    action_delay_frames: tuple = (0, 3)
    action_noise_std: float = 0.02
    observation_noise_std: float = 0.01

    # Environment
    gravity_noise: float = 0.05  # m/s^2 variation around 9.81


@dataclass
class RandomizedParams:
    """Container for randomized parameters."""

    mass_scale: float = 1.0
    friction: float = 0.8
    restitution: float = 0.1
    damping_scale: float = 1.0
    joint_friction_scale: float = 1.0
    actuator_scale: float = 1.0
    action_delay: int = 0
    gravity: np.ndarray = field(
        default_factory=lambda: np.array([0.0, 0.0, -9.81])
    )


def sample_dynamics_params(
    config: DynamicsRandomizationConfig,
    rng: np.random.Generator | None = None,
) -> RandomizedParams:
    """Sample a set of parameters from DR config."""
    if rng is None:
        rng = np.random.default_rng()

    gravity_z = -9.81 + rng.normal(0, config.gravity_noise)

    return RandomizedParams(
        mass_scale=rng.uniform(*config.mass_scale_range),
        friction=rng.uniform(*config.friction_range),
        restitution=rng.uniform(*config.restitution_range),
        damping_scale=rng.uniform(*config.joint_damping_scale),
        joint_friction_scale=rng.uniform(*config.joint_friction_scale),
        actuator_scale=rng.uniform(*config.actuator_strength_scale),
        action_delay=int(rng.integers(*config.action_delay_frames, endpoint=True)),
        gravity=np.array([0.0, 0.0, gravity_z]),
    )


def apply_dynamics_params(env, params: RandomizedParams):
    """Apply randomized parameters to the env (illustrative API, not Isaac Lab's)."""
    # Object physics
    env.scene.target_object.set_mass(
        env.scene.target_object.default_mass * params.mass_scale
    )
    env.scene.target_object.set_friction(params.friction)
    env.scene.target_object.set_restitution(params.restitution)

    # Robot joints
    for joint in env.robot.joints:
        joint.set_damping(joint.default_damping * params.damping_scale)
        joint.set_friction(
            joint.default_friction * params.joint_friction_scale
        )
        joint.set_max_force(
            joint.default_max_force * params.actuator_scale
        )

    # Gravity
    env.sim.set_gravity(params.gravity)

    # Action delay buffer
    env.action_delay = params.action_delay

    return env
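The sampled `action_delay` is only stored on the env above; actually delaying actions needs a small FIFO buffer. A minimal sketch using `collections.deque` (how it plugs into the env's step function is up to your framework):

```python
from collections import deque
import numpy as np

class ActionDelayBuffer:
    """Returns the action issued `delay` frames ago."""

    def __init__(self, delay: int, action_dim: int):
        # Pre-fill with zeros so the first `delay` steps emit no-ops
        self.buffer = deque(
            [np.zeros(action_dim) for _ in range(delay)],
            maxlen=delay + 1,
        )

    def step(self, action: np.ndarray) -> np.ndarray:
        """Push the newest action, pop the oldest (delayed) one."""
        self.buffer.append(action)
        return self.buffer.popleft()

# With delay=2, actions reach the robot two frames late
buf = ActionDelayBuffer(delay=2, action_dim=3)
out0 = buf.step(np.ones(3))       # zeros: still draining the pre-fill
out1 = buf.step(np.ones(3) * 2)   # zeros again
out2 = buf.step(np.ones(3) * 3)   # now the first action (all ones) comes out
```

With `delay=0` the buffer is a pass-through, so the same code path handles both delayed and undelayed episodes.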

3. Sensor Domain Randomization

Sensor DR simulates real sensor imperfections -- noise, delay, dropout, bias. It is often overlooked, yet it matters greatly for sim-to-real transfer.

"""
Sensor Domain Randomization.
Wrap observation pipeline to add realistic noise.
"""
import numpy as np
from dataclasses import dataclass


@dataclass
class SensorRandomizationConfig:
    """Config for sensor noise and imperfections."""

    # Joint encoder noise
    joint_position_noise_std: float = 0.005    # radians
    joint_velocity_noise_std: float = 0.05     # rad/s
    joint_position_bias_range: tuple = (-0.01, 0.01)

    # IMU noise (based on MPU-6050 datasheet)
    imu_accel_noise_std: float = 0.1           # m/s^2
    imu_gyro_noise_std: float = 0.01           # rad/s
    imu_bias_instability: float = 0.001        # random walk

    # Force/torque sensor
    ft_noise_std: float = 0.5                  # Newton
    ft_dropout_prob: float = 0.01              # 1% chance of signal loss

    # Communication delay
    observation_delay_frames: tuple = (0, 2)


class SensorRandomizer:
    """Apply realistic sensor noise to observations."""

    def __init__(
        self,
        config: SensorRandomizationConfig,
        seed: int = 42,
    ):
        self.config = config
        self.rng = np.random.default_rng(seed)
        self.joint_bias = None
        self.imu_bias = None
        self._reset_biases()

    def reset(self):
        """Resample biases; call at each episode boundary."""
        self._reset_biases()

    def _reset_biases(self):
        """Sample fresh systematic biases."""
        self.joint_bias = self.rng.uniform(
            *self.config.joint_position_bias_range, size=7  # assumes a 7-DoF arm
        )
        self.imu_bias = self.rng.normal(0, 0.01, size=6)

    def add_joint_noise(
        self, joint_pos: np.ndarray, joint_vel: np.ndarray
    ) -> tuple[np.ndarray, np.ndarray]:
        """Add noise to joint encoder readings."""
        noisy_pos = (
            joint_pos
            + self.rng.normal(
                0, self.config.joint_position_noise_std,
                size=joint_pos.shape,
            )
            + self.joint_bias[: len(joint_pos)]
        )
        noisy_vel = joint_vel + self.rng.normal(
            0, self.config.joint_velocity_noise_std,
            size=joint_vel.shape,
        )
        return noisy_pos, noisy_vel

    def add_imu_noise(
        self, accel: np.ndarray, gyro: np.ndarray
    ) -> tuple[np.ndarray, np.ndarray]:
        """Add noise to IMU readings with bias drift."""
        noisy_accel = accel + self.rng.normal(
            0, self.config.imu_accel_noise_std, size=3
        )
        noisy_gyro = gyro + self.rng.normal(
            0, self.config.imu_gyro_noise_std, size=3
        )
        # Bias instability (random walk)
        self.imu_bias[:3] += self.rng.normal(
            0, self.config.imu_bias_instability, size=3
        )
        self.imu_bias[3:] += self.rng.normal(
            0, self.config.imu_bias_instability, size=3
        )
        return noisy_accel + self.imu_bias[:3], noisy_gyro + self.imu_bias[3:]

    def add_ft_noise(self, force_torque: np.ndarray) -> np.ndarray:
        """Add noise and dropout to force/torque sensor."""
        noisy_ft = force_torque + self.rng.normal(
            0, self.config.ft_noise_std, size=force_torque.shape
        )
        if self.rng.random() < self.config.ft_dropout_prob:
            return np.zeros_like(force_torque)
        return noisy_ft

    def randomize_observation(self, obs: dict) -> dict:
        """Apply all sensor noise to observation dict."""
        noisy_obs = {}

        if "joint_pos" in obs and "joint_vel" in obs:
            noisy_obs["joint_pos"], noisy_obs["joint_vel"] = (
                self.add_joint_noise(obs["joint_pos"], obs["joint_vel"])
            )

        if "imu_accel" in obs and "imu_gyro" in obs:
            noisy_obs["imu_accel"], noisy_obs["imu_gyro"] = (
                self.add_imu_noise(obs["imu_accel"], obs["imu_gyro"])
            )

        if "force_torque" in obs:
            noisy_obs["force_torque"] = self.add_ft_noise(
                obs["force_torque"]
            )

        # Copy over non-randomized keys
        for key in obs:
            if key not in noisy_obs:
                noisy_obs[key] = obs[key]

        return noisy_obs

[Figure: sensor noise visualization in robotics simulation]

Successful Case Studies

OpenAI Rubik's Cube (2019)

Paper: Solving Rubik's Cube with a Robot Hand

This is perhaps the best-known demonstration of domain randomization. OpenAI trained the Shadow Dexterous Hand entirely in simulation using Automatic Domain Randomization (ADR) -- a technique that automatically widens the randomization ranges once the policy masters the current ones.

Results: trained entirely in simulation, the hand manipulated a physical Rubik's Cube, reporting success on roughly 60% of average scrambles and 20% of maximally difficult ones.

ADR workflow:

Step 1: Start with narrow range -- friction in [0.9, 1.1]
Step 2: Train policy until success rate > 80%
Step 3: Expand range -- friction in [0.7, 1.3]
Step 4: Repeat until max range
Result: friction in [0.1, 2.0] -- a policy robust enough that real-world conditions fall well inside its training distribution
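The expansion loop above can be sketched in a few lines. The 80% threshold and the friction ranges follow the workflow; `evaluate_policy` is a toy stand-in for an actual train-and-evaluate cycle:

```python
def evaluate_policy(friction_range: tuple[float, float]) -> float:
    """Toy stand-in for 'train, then measure success rate':
    success degrades gently as the range widens."""
    width = friction_range[1] - friction_range[0]
    return 1.0 - 0.08 * width

# Automatic Domain Randomization: widen the range only after mastery
low, high = 0.9, 1.1          # Step 1: start with a narrow range
min_low, max_high = 0.1, 2.0  # expansion limits
step = 0.1

for _ in range(100):
    if evaluate_policy((low, high)) > 0.8:   # Step 2: mastered current range?
        low = max(min_low, low - step)       # Step 3: expand both ends
        high = min(max_high, high + step)
    if (low, high) == (min_low, max_high):   # Step 4: stop at max range
        break
```

OpenAI's actual ADR tracks per-parameter performance thresholds and expands each boundary independently; this sketch collapses that to a single range for clarity.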

Agility Digit Humanoid Walking (2024)

Paper: Real-World Humanoid Locomotion with Reinforcement Learning (Radosavovic et al., UC Berkeley)

UC Berkeley researchers trained a locomotion policy for Agility Robotics' Digit humanoid entirely in simulation with extensive dynamics randomization, then deployed it zero-shot: the robot walked over varied outdoor terrain without real-world fine-tuning.

ANYmal Parkour (2023)

Paper: ANYmal Parkour: Learning Agile Navigation for Quadrupedal Robots (Hoeller et al., ETH Zurich)

ETH Zurich trained the ANYmal quadruped to traverse extreme terrain -- jumping over gaps, climbing onto boxes, crouching under obstacles -- using policies trained in simulation with terrain and dynamics randomization.

Metrics and Evaluation

| Metric | Description | Target |
| --- | --- | --- |
| Sim success rate (no DR) | Baseline performance | >95% |
| Sim success rate (full DR) | Performance under full randomization | >85% |
| Sim-to-real success rate | Performance on the real robot | >80% |
| Sim-to-real gap | Difference between sim and real success rates | <15% |
| Adaptation time | Episodes needed to adapt on the real robot | <10 episodes |

Anti-patterns: Common Mistakes

1. Over-randomization

Randomizing too widely prevents the policy from learning anything. For example, randomizing friction from 0.01 to 10.0 is nonsensical: real-world friction coefficients mostly fall between 0.2 and 1.5.

Solution: use System Identification to find nominal values, then randomize ±30-50% around them.
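In code, "randomize ±30-50% around System Identification nominals" is a one-liner per parameter. The nominal values below are made-up examples, not measurements:

```python
import numpy as np

rng = np.random.default_rng(0)

# Nominal values estimated by System Identification (example numbers)
nominal = {"friction": 0.7, "mass": 0.45, "joint_damping": 1.2}

def randomize_around_nominal(nominal: dict, spread: float = 0.4) -> dict:
    """Sample each parameter uniformly within ±spread of its nominal value."""
    return {
        name: rng.uniform(value * (1 - spread), value * (1 + spread))
        for name, value in nominal.items()
    }

sampled = randomize_around_nominal(nominal)
```

Keeping the spread relative to the nominal (rather than an absolute range) means a well-identified parameter gets a proportionally tight distribution for free.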

2. Correlated Parameters

In reality, many parameters are correlated: an object's material determines both its density and its surface friction (metal vs. plastic vs. rubber). Independent randomization creates combinations that never occur in the real world.

# WRONG: randomize independently
mass = np.random.uniform(0.1, 5.0)
friction = np.random.uniform(0.1, 2.0)

# RIGHT: randomize with correlation by material
material = np.random.choice(["plastic", "metal", "rubber", "wood"])
material_params = {
    "plastic": {"mass_range": (0.05, 0.5), "friction_range": (0.3, 0.6)},
    "metal":   {"mass_range": (0.5, 5.0),  "friction_range": (0.2, 0.5)},
    "rubber":  {"mass_range": (0.1, 1.0),  "friction_range": (0.8, 1.5)},
    "wood":    {"mass_range": (0.1, 2.0),  "friction_range": (0.4, 0.8)},
}
params = material_params[material]
mass = np.random.uniform(*params["mass_range"])
friction = np.random.uniform(*params["friction_range"])

3. No Curriculum

Throwing the policy into full randomization from the start often prevents convergence. Start with a narrow range and widen it gradually as training progresses.
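Unlike ADR's success-gated expansion, the simplest curriculum just interpolates the range from narrow to full over a fixed training budget. A linear schedule is a common choice, sketched here with example friction ranges:

```python
def curriculum_range(
    step: int,
    total_steps: int,
    narrow: tuple[float, float],
    full: tuple[float, float],
) -> tuple[float, float]:
    """Linearly widen a randomization range as training progresses."""
    t = min(step / total_steps, 1.0)  # training progress in [0, 1]
    lo = narrow[0] + t * (full[0] - narrow[0])
    hi = narrow[1] + t * (full[1] - narrow[1])
    return lo, hi

# Friction range widens from [0.9, 1.1] to [0.3, 1.5] over 1M steps
start = curriculum_range(0, 1_000_000, (0.9, 1.1), (0.3, 1.5))
mid = curriculum_range(500_000, 1_000_000, (0.9, 1.1), (0.3, 1.5))
end = curriculum_range(1_000_000, 1_000_000, (0.9, 1.1), (0.3, 1.5))
```

The schedule shape (linear vs. exponential vs. success-gated) matters less than the principle: the policy should first learn the task, then learn to be robust.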

4. Only Randomize Visuals, Ignore Dynamics

Many people focus only on visual DR because it is easier to implement. But for control tasks, dynamics DR matters much more: a policy failing because friction differs by 20% is far more common than one failing because the lighting looks different.

[Figure: robot arm in a manufacturing environment]

References

Key papers on domain randomization:

  1. Tobin et al., 2017 -- Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World -- the foundational paper
  2. OpenAI et al., 2019 -- Solving Rubik's Cube with a Robot Hand -- introduced Automatic Domain Randomization
  3. Mehta et al., 2020 -- Active Domain Randomization -- smarter selection of randomization parameters
  4. Muratore et al., 2022 -- Robot Learning from Randomized Simulations: A Review -- comprehensive survey
