Robotic Pick-and-Place of Novel Objects in Clutter with Multi-Affordance Grasping and Cross-Domain Image Matching

Bibliographic Details
Main Authors: Zeng, Andy (Author), Song, Shuran (Author), Yu, Kuan-Ting (Author), Donlon, Elliott S (Author), Hogan, Francois R. (Author), Bauza Villalonga, Maria (Author), Ma, Daolin (Author), Taylor, Orion Thomas (Author), Liu, Melody (Author), Romo, Eudald (Author), Fazeli, Nima (Author), Alet, Ferran (Author), Chavan Dafle, Nikhil Narsingh (Author), Holladay, Rachel (Author), Morena, Isabella (Author), Qu Nair, Prem (Author), Green, Druck (Author), Taylor, Ian (Author), Liu, Weber (Author), Funkhouser, Thomas (Author), Rodriguez, Alberto (Author)
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory (Contributor), Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (Contributor), Massachusetts Institute of Technology. Department of Mechanical Engineering (Contributor)
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers (IEEE), 2020-09-01T16:02:35Z.
Description
Summary: This paper presents a robotic pick-and-place system that is capable of grasping and recognizing both known and novel objects in cluttered environments. The key new feature of the system is that it handles a wide range of object categories without needing any task-specific training data for novel objects. To achieve this, it first uses a category-agnostic affordance prediction algorithm to select and execute among four different grasping primitive behaviors. It then recognizes picked objects with a cross-domain image classification framework that matches observed images to product images. Since product images are readily available for a wide range of objects (e.g., from the web), the system works out-of-the-box for novel objects without requiring any additional training data. Exhaustive experimental results demonstrate that our multi-affordance grasping achieves high success rates for a wide variety of objects in clutter, and our recognition algorithm achieves high accuracy for both known and novel grasped objects. The approach was part of the MIT-Princeton Team system that took 1st place in the stowing task at the 2017 Amazon Robotics Challenge. All code, datasets, and pre-trained models are available online at http://arc.cs.princeton.edu.
NSF (Grants IIS-1251217 and VEC 1539014/1539099)
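
The two stages named in the summary (choosing among four grasping primitives from predicted affordances, then recognizing the picked object by matching an observed image to product images) can be illustrated with a minimal sketch. This is not the authors' released code. It assumes that each primitive comes with a per-pixel affordance score map, that selection means taking the highest score across all maps, and that matching is a nearest-neighbor lookup over precomputed image embeddings. The function names, the primitive labels primitive_0 to primitive_3, and the embedding vectors are all hypothetical.

```python
import numpy as np


def select_grasp(affordance_maps):
    """Return (primitive, (row, col), score) for the highest affordance score.

    affordance_maps: dict mapping a primitive name to a 2-D array of
    per-pixel scores (an assumed representation, for illustration only).
    """
    best = None
    for name, amap in affordance_maps.items():
        # Location of the highest-scoring pixel for this primitive.
        idx = np.unravel_index(np.argmax(amap), amap.shape)
        score = float(amap[idx])
        if best is None or score > best[2]:
            best = (name, (int(idx[0]), int(idx[1])), score)
    return best


def match_to_product(observed, product_embeddings):
    """Return the label whose product-image embedding is closest (L2 distance).

    observed: 1-D embedding of the observed image (hypothetical feature vector).
    product_embeddings: dict mapping an object label to a 1-D embedding.
    """
    labels = list(product_embeddings)
    refs = np.stack([product_embeddings[k] for k in labels])
    dists = np.linalg.norm(refs - observed, axis=1)
    return labels[int(np.argmin(dists))]
```

Under these assumptions, supporting a novel object only requires adding its product-image embedding to the dictionary, which mirrors the summary's point that no additional training data is needed.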