
Robot Grasping 101: Analytical to Learning-Based

From traditional force closure to deep learning grasp detection -- an overview of modern grasping methods and how to choose the right approach for your project.

Nguyen Anh Tuan -- February 6, 2026 -- 8 min read

Why is Grasping the Hardest Problem in Manipulation?

For humans, picking up a coffee cup is unconscious -- eyes see, hand reaches, fingers close. For robots, it requires tightly coupling perception, planning, and control.

A robot arm must answer three questions before grasping: (1) where on the object should it make contact? (2) which gripper orientation is stable? (3) how much force is enough to hold without crushing? Each question is a separate problem, and the methods for answering them have evolved from analytical (geometry and physics calculations) to learning-based (learning from data) over the past two decades.
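To make the three questions concrete, here is a minimal grasp representation -- a hypothetical sketch with field names of my own choosing, but close to what most planners carry around:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Grasp:
    position: np.ndarray     # (3,) where on the object to contact (question 1)
    orientation: np.ndarray  # (3, 3) gripper rotation, i.e. which pose is stable (question 2)
    force: float             # commanded grip force in newtons (question 3)
    width: float = 0.08      # jaw opening in meters (illustrative parallel-jaw default)
    quality: float = 0.0     # score assigned later by the grasp planner
```

Analytical methods fill in `quality` from physics; learning-based methods predict it from data.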

This is Part 1 of the Robot Manipulation Masterclass series -- I'll go from classical theory to the latest deep learning models, with comparison tables to help you choose the right approach for your project.

Robot gripper contacting object -- fundamental manipulation problem

Analytical Grasping: Physical Foundations

Force Closure -- Necessary and Sufficient Condition

The most important concept in analytical grasping is force closure: a grasp has force closure when its contact forces can resist any external wrench (force + moment) applied to the object.

Simple analogy: when you hold a ball with two fingers (thumb and index), friction at the two contact points creates a wrench space large enough to hold the ball steady without slipping, even if you tilt your hand or someone pushes it gently.

Mathematical condition for force closure:

A grasp G with n contact points has force closure if and only if
the convex hull of its primitive wrenches contains the origin
of the 6D wrench space (3 force + 3 torque components) in its interior.

Steps to compute:

  1. Define contact model: frictionless point contact, point contact with friction (PCWF), or soft finger
  2. Compute friction cone: at each contact, create cone of allowable forces based on friction coefficient (mu)
  3. Map to wrench space: each contact force → wrench = [force; torque] via cross product with contact position
  4. Check force closure: does convex hull of all wrenches contain origin?

GraspIt! -- Classical Grasp Simulator

GraspIt! is a classical grasp planning simulator developed at Columbia University since 2004, still widely used for analytical grasping research.

GraspIt! provides multi-fingered hand models, contact detection, and built-in grasp quality metrics (including the epsilon metric covered below). To build it from source on Ubuntu:

# Install GraspIt! (Ubuntu)
sudo apt-get install libqt5-dev libsoqt520-dev libcoin-dev
git clone https://github.com/graspit-simulator/graspit.git
cd graspit && mkdir build && cd build
cmake .. && make -j$(nproc)

Limitations of the analytical approach: it needs accurate 3D object models and precise friction coefficients, and it doesn't scale to unknown objects. This is what motivated learning-based methods.

Learning-Based Grasping: Learning from Data

PointNetGPD -- Direct Point Cloud

PointNetGPD (Liang et al., 2018) is one of the first works to use PointNet to evaluate grasp quality directly from point clouds.

Core idea: instead of complex geometric analysis, let a neural network learn from 350K grasp samples on the YCB object set. The input is the point cloud inside the gripper's closing volume; the output is a grasp quality score.

# PointNetGPD inference pipeline (simplified; the sampling and
# cropping helpers are assumed to exist elsewhere)
import torch
from pointnet_gpd import PointNetGPD

model = PointNetGPD(num_classes=2)
model.load_state_dict(torch.load("pointnetgpd_weights.pth"))
model.eval()

# 1. Sample antipodal grasp candidates from the point cloud
grasp_candidates = sample_antipodal_grasps(point_cloud, num_samples=200)

with torch.no_grad():
    for grasp in grasp_candidates:
        # 2. Crop the points that fall inside the gripper's closing volume
        local_points = crop_points_in_gripper(point_cloud, grasp)
        # 3. PointNet predicts a quality score for the candidate
        grasp.quality = model(local_points)

# 4. Select the grasp with the highest predicted quality
best_grasp = max(grasp_candidates, key=lambda g: g.quality)

Advantages: real-time, no object model needed, generalizes well to unseen objects.

Disadvantages: only supports parallel-jaw grippers and doesn't consider scene context (occlusion, clutter).

Contact-GraspNet -- 6-DoF in Clutter

Contact-GraspNet (Sundermeyer et al., 2021) addresses PointNetGPD's limitations by generating 6-DoF grasps directly from the scene point cloud while accounting for clutter.

Deep learning for robot grasping -- from point cloud to grasp pose

Key ideas: grasps are anchored to observed contact points on the object surface, which shrinks the 6-DoF search space, and the network maps the raw scene point cloud to grasp poses for the whole scene in a single forward pass:

# Contact-GraspNet inference (simplified)
from contact_graspnet import ContactGraspNet

model = ContactGraspNet.load_pretrained()

# Input: single-view depth image -> point cloud
point_cloud = depth_to_pointcloud(depth_image, camera_intrinsics)

# Output: set of 6-DoF grasps with confidence scores
grasps, scores, contact_points = model.predict(
    point_cloud,
    forward_passes=5  # multiple passes for uncertainty estimation
)

# Filter by confidence, then pick the best surviving grasp
mask = scores > 0.5
valid_grasps, valid_scores = grasps[mask], scores[mask]
best_grasp = valid_grasps[valid_scores.argmax()]

Method Comparison

Criterion           | Analytical (GraspIt!)   | PointNetGPD         | Contact-GraspNet
--------------------|-------------------------|---------------------|----------------------
Input               | 3D mesh + friction      | Point cloud (local) | Point cloud (scene)
Output              | Grasp + quality metric  | Grasp score         | 6-DoF grasps + scores
Unknown objects     | No (needs mesh)         | Yes                 | Yes
Clutter handling    | No                      | Limited             | Good
Speed               | Slow (optimization)     | Real-time           | ~0.5 s/scene
Gripper type        | Multi-finger            | Parallel-jaw        | Parallel-jaw
Training data       | Not needed              | 350K grasps         | 17M grasps
Success rate (real) | ~70-80% (known objects) | ~85%                | >90%
Best use case       | Research, multi-finger  | Quick prototype     | Production clutter

Grasp Quality Metrics

Regardless of method, you need to measure grasp quality. Here are the most common metrics:

Epsilon Metric (Force Closure Quality)

The epsilon metric is the radius of the largest ball, centered at the origin, that fits inside the convex hull of the primitive wrenches. Epsilon > 0 means force closure holds; the larger the epsilon, the more robust the grasp is against external disturbances.

# Compute epsilon metric
from scipy.spatial import ConvexHull
import numpy as np

def epsilon_metric(wrenches):
    """
    wrenches: (N, 6) array of primitive wrenches
    Returns: epsilon value (>0 = force closure)
    """
    hull = ConvexHull(wrenches)
    # Each facet satisfies normal . x + offset = 0 with an outward unit
    # normal, so the origin is inside the hull iff every offset is negative.
    offsets = hull.equations[:, -1]
    if np.any(offsets >= 0):
        return 0.0  # origin outside or on the hull: no force closure
    # With unit normals, the distance from the origin to each facet is
    # -offset; epsilon is the smallest such distance.
    return float(np.min(-offsets))

Grasp Success Rate (Empirical)

The most practical metric: run N grasp attempts and count successes. A grasp "succeeds" when the robot picks up the object, lifts it 10 cm, and holds it for 3 seconds without dropping it.
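A small helper for reporting this, with a Wilson score interval so that small-N results come with honest error bars (the formula is standard statistics, nothing grasp-specific):

```python
import math

def success_rate(successes, attempts, z=1.96):
    """Empirical success rate plus a Wilson score interval (95% for
    z=1.96) -- more informative than the raw ratio when attempts is small."""
    p = successes / attempts
    denom = 1.0 + z**2 / attempts
    center = (p + z**2 / (2 * attempts)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / attempts
                                   + z**2 / (4 * attempts**2))
    return p, (center - half, center + half)

rate, (low, high) = success_rate(43, 50)  # e.g. 43 successes in 50 attempts
```

For 43/50 the point estimate is 86%, but the interval still spans roughly 74-93% -- worth remembering when comparing against success rates like those in the table above.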

Diversity and Coverage

Beyond quality, diversity also matters: a good grasp planner generates many candidates from different approach directions, giving the robot fallback options when the preferred grasp is blocked by obstacles.
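A planner can enforce this with greedy farthest-point selection over approach directions -- a sketch assuming `approach_dirs` holds unit vectors; all names are illustrative:

```python
import numpy as np

def select_diverse_grasps(approach_dirs, scores, k=5):
    """Pick k grasps that are both good and diverse: start from the
    best-scoring one, then repeatedly add the candidate whose approach
    direction is least similar to anything already chosen.

    approach_dirs: (N, 3) unit approach vectors; scores: (N,) quality."""
    chosen = [int(np.argmax(scores))]
    while len(chosen) < min(k, len(scores)):
        # Cosine similarity of every candidate to the chosen set
        sims = approach_dirs @ approach_dirs[chosen].T
        nearest = sims.max(axis=1)   # similarity to the closest chosen grasp
        nearest[chosen] = np.inf     # never re-pick a chosen grasp
        chosen.append(int(np.argmin(nearest)))
    return chosen
```

This deliberately trades a little score for coverage; weighting the similarity term by score is a natural refinement.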

Analytical vs. Learning: When to Use What?

Choose Analytical when:

  - You have accurate 3D models (mesh + friction coefficients) of every object
  - You use a multi-finger gripper, which the learning methods above don't support
  - You want a provable quality guarantee rather than a learned confidence score

Choose Learning-Based when:

  - Objects are unknown or only visible as partial point clouds
  - The scene is cluttered and grasps must be generated in real time
  - A parallel-jaw gripper is sufficient

Hybrid Approach

The 2025-2026 trend is to combine both: use a learning model to quickly generate grasp candidates, then use analytical metrics to verify and rank them. This is the approach being taken by Google DeepMind and UC Berkeley in their latest systems.
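In code, the hybrid pattern is model-agnostic -- here `analytic_quality` stands in for an epsilon-style verifier, and every name is illustrative rather than taken from a real API:

```python
def hybrid_select(candidates, learned_scores, analytic_quality, top_k=20):
    """Learning proposes, analysis verifies.

    candidates: any grasp objects; learned_scores: parallel network
    confidences; analytic_quality: callable returning an epsilon-style
    metric (> 0 means force closure)."""
    # 1. The learned model ranks everything cheaply; keep only top_k
    ranked = sorted(range(len(candidates)),
                    key=lambda i: learned_scores[i], reverse=True)[:top_k]
    # 2. Run the (slower) analytical check on just those k survivors
    verified = [(candidates[i], analytic_quality(candidates[i])) for i in ranked]
    # 3. Discard non-force-closure grasps, rank the rest by robustness
    verified = [(g, q) for g, q in verified if q > 0]
    return sorted(verified, key=lambda t: t[1], reverse=True)
```

The expensive analytical check runs on only k candidates instead of the full set, which is what makes the combination practical.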

Robot arm performing grasping in cluttered environment

Hands-on: Running Contact-GraspNet

If you want to try it right away, here's the fastest setup:

# Clone repo
git clone https://github.com/NVlabs/contact_graspnet.git
cd contact_graspnet

# Install dependencies (Python 3.8+, CUDA 11.x)
pip install -r requirements.txt

# Download pre-trained weights
bash download_weights.sh

# Run inference on sample depth image
python contact_graspnet/inference.py \
    --np_path=test_data/scene_0.npy \
    --forward_passes=5 \
    --z_range=[0.2,1.2]

Output is a set of 6-DoF grasps visualized on the point cloud. From there you can integrate with a robot arm via ROS 2 or directly through inverse kinematics.


Next in Series

This is Part 1 of the Robot Manipulation Masterclass series. Coming up:

