Robotic pick-and-place of novel objects in clutter with multi-affordance grasping and cross-domain image matching

Bibliographic Details
Main Authors: Zeng, Andy; Song, Shuran; Yu, Kuan-Ting; Donlon, Elliott S.; Hogan, Francois R.; Bauza Villalonga, Maria; Ma, Daolin; Taylor, Orion Thomas; Liu, Melody; Romo, Eudald; Fazeli, Nima; Alet, Ferran; Chavan Dafle, Nikhil Narsingh; Holladay, Rachel; Morona, Isabella; Nair, Prem Qu; Green, Druck; Taylor, Ian; Liu, Weber; Funkhouser, Thomas; Rodriguez, Alberto
Other Authors: Massachusetts Institute of Technology. Department of Mechanical Engineering (Contributor)
Format: Article
Language: English
Published: SAGE Publications, 2021.
Description
Summary: This article presents a robotic pick-and-place system that is capable of grasping and recognizing both known and novel objects in cluttered environments. The key new feature of the system is that it handles a wide range of object categories without needing any task-specific training data for novel objects. To achieve this, it first uses an object-agnostic grasping framework to map from visual observations to actions: inferring dense pixel-wise probability maps of the affordances for four different grasping primitive actions. It then executes the action with the highest affordance and recognizes picked objects with a cross-domain image classification framework that matches observed images to product images. Since product images are readily available for a wide range of objects (e.g., from the web), the system works out-of-the-box for novel objects without requiring any additional data collection or re-training. Exhaustive experimental results demonstrate that our multi-affordance grasping achieves high success rates for a wide variety of objects in clutter, and our recognition algorithm achieves high accuracy for both known and novel grasped objects. The approach was part of the MIT-Princeton Team system that took first place in the stowing task at the 2017 Amazon Robotics Challenge. All code, datasets, and pre-trained models are available online at http://arc.cs.princeton.edu/
Funding: NSF (Grants IIS-1251217, VEC 1539014/1539099)
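
Code sketches

The summary describes two technical components. The first is multi-affordance grasping: a perception network infers one dense pixel-wise affordance map per grasping primitive, and the system executes the primitive at the pixel with the highest score. Below is a minimal Python sketch of that selection step only, assuming the four primitive names given in the paper; predict_affordances is a hypothetical placeholder for the learned fully convolutional model, not the released network.

    import numpy as np

    PRIMITIVES = ["suction-down", "suction-side", "grasp-down", "flush-grasp"]

    def predict_affordances(rgbd_heightmap):
        """Placeholder for the learned model: returns a (4, H, W) stack of
        affordance probabilities, one dense map per grasping primitive."""
        h, w = rgbd_heightmap.shape[:2]
        return np.random.default_rng(0).random((len(PRIMITIVES), h, w))

    def select_action(rgbd_heightmap):
        """Pick the primitive and pixel with the highest predicted affordance."""
        affordances = predict_affordances(rgbd_heightmap)
        idx = np.unravel_index(np.argmax(affordances), affordances.shape)
        return PRIMITIVES[idx[0]], (idx[1], idx[2]), float(affordances[idx])

    primitive, pixel, score = select_action(np.zeros((240, 320, 4)))
    print(f"execute {primitive} at pixel {pixel} (affordance {score:.2f})")

The second component is cross-domain image matching: a grasped object is recognized by comparing a descriptor of its observed image against descriptors of product images and returning the nearest match. The sketch below shows only that generic embed-and-match pattern; the random-projection embed function is a stand-in for the paper's two-stream network, and the catalog labels are made up for illustration.

    import numpy as np

    rng = np.random.default_rng(1)
    _PROJ = rng.standard_normal((2048, 128))  # stand-in for a learned embedding

    def embed(image):
        """Placeholder descriptor: flatten, project, and L2-normalize."""
        v = image.astype(np.float64).ravel()[:2048]
        v = np.pad(v, (0, 2048 - v.size)) @ _PROJ
        return v / (np.linalg.norm(v) + 1e-8)

    def recognize(observed, product_images):
        """Return the product label whose image descriptor has the highest
        cosine similarity with the observed image descriptor."""
        query = embed(observed)
        return max(product_images, key=lambda k: float(embed(product_images[k]) @ query))

    catalog = {"duct_tape": rng.random((32, 32, 3)), "marbles": rng.random((32, 32, 3))}
    print(recognize(rng.random((32, 32, 3)), catalog))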