
Abstract: Interactive grasping from clutter, akin to human dexterity, is one of the longest-standing problems in robot learning. Challenges stem from the intricacies of visual perception, the demand for precise motor skills, and the complex interplay between the two. In this work, we present Teacher-Augmented Policy Gradient (TAPG), a novel two-stage learning framework that combines reinforcement learning (RL) and policy distillation. After training a privileged teacher policy to master motor control, TAPG facilitates guided, yet adaptive, learning of a sensorimotor policy, enabling it to handle the intricacies of a new observation space. We demonstrate this ability by integrating TAPG with a promptable segmentation model. Our trained policies adeptly grasp a wide variety of objects from cluttered scenes in simulation and in the real world, based on human-understandable prompts. Furthermore, we show robust zero-shot transfer to novel objects.
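The abstract does not give the exact TAPG objective, so the following is only an illustrative sketch of how a policy-gradient term and a teacher-distillation term could be combined for a discrete action distribution. The function names, the KL-based distillation term, and the weight `beta` are assumptions made for illustration; they are not taken from the paper.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def teacher_guided_loss(student_logits, teacher_logits, action, advantage, beta=0.5):
    """Hypothetical combined loss (an assumption, not the paper's formula).

    pg_term:      standard policy-gradient surrogate, -A * log pi_student(a)
    distill_term: KL(teacher || student), pulling the student toward the teacher
    beta:         assumed weight trading off guidance against RL adaptation
    """
    student = softmax(student_logits)
    teacher = softmax(teacher_logits)
    pg_term = -advantage * math.log(student[action])
    distill_term = kl_divergence(teacher, student)
    return pg_term + beta * distill_term
```

In this sketch, a student that already matches the teacher pays no distillation penalty, so only the RL term drives further learning; a larger `beta` keeps the student closer to the teacher's behavior.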

Three side-by-side examples, each showing: Original | Tracked Segmentation


Teacher policy on unseen objects

Real-world deployment of TAPG policy on unseen objects

Grasping from clutter