Planning for decentralized control of multiple robots under uncertainty

Bibliographic Details
Main Authors: Amato, Christopher (Contributor), Cruz, Gabriel (Contributor), Maynor, Christopher A. (Contributor), How, Jonathan P. (Contributor), Kaelbling, Leslie P. (Contributor), Konidaris, George D. (Contributor)
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory (Contributor), Massachusetts Institute of Technology. Department of Aeronautics and Astronautics (Contributor), Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (Contributor), Massachusetts Institute of Technology. Laboratory for Information and Decision Systems (Contributor)
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers (IEEE), 2015.
Description
Summary: This paper presents a probabilistic framework for synthesizing control policies for general multi-robot systems that is based on decentralized partially observable Markov decision processes (Dec-POMDPs). Dec-POMDPs are a general model of decision-making in which a team of agents must cooperate to optimize a shared objective in the presence of uncertainty. Dec-POMDPs also consider communication limitations, so execution is decentralized. While Dec-POMDPs are typically intractable to solve for real-world problems, recent research on the use of macro-actions in Dec-POMDPs has significantly increased the size of problems that can be practically solved. We show that, in contrast to most existing methods that are specialized to a particular problem class, our approach can synthesize control policies that exploit any opportunities for coordination that are present in the problem, while balancing uncertainty, sensor information, and information about other agents. We use three variants of a warehouse task to show that a single planner of this type can generate cooperative behavior using task allocation, direct communication, and signaling, as appropriate. This demonstrates that our algorithmic framework can automatically optimize control and communication policies for complex multi-robot systems.
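
The Dec-POMDP model referred to in the summary can be sketched in code, purely as an illustration. The API, the `DecPOMDP` class, and the toy two-agent "door" problem below are all assumptions for exposition, not taken from the paper: a Dec-POMDP is a tuple of agents, states, per-agent action sets, a joint transition model, a single shared reward, and per-agent observations, with each agent acting only on its own observation history.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Illustrative sketch only; this structure is an assumption, not the paper's API.
# A Dec-POMDP is a tuple <I, S, {A_i}, T, R, {Omega_i}, O>: a set of agents,
# states, per-agent actions, a joint transition model, a shared team reward,
# and per-agent observations drawn from a joint observation model.

@dataclass
class DecPOMDP:
    states: List[str]
    actions: List[List[str]]  # actions[i] is agent i's individual action set
    transition: Callable[[str, Tuple[str, ...]], Dict[str, float]]  # T(s' | s, a)
    reward: Callable[[str, Tuple[str, ...]], float]                 # shared R(s, a)
    observation: Callable[[str, Tuple[str, ...]], Tuple[str, ...]]  # per-agent obs

    def step(self, state, joint_action, rng):
        """Sample the next state, the shared reward, and each agent's observation."""
        dist = self.transition(state, joint_action)
        next_state = rng.choices(list(dist), weights=list(dist.values()))[0]
        return (next_state,
                self.reward(state, joint_action),
                self.observation(next_state, joint_action))

# Hypothetical toy problem: two agents must jointly "push" to open a door;
# the team reward is earned only through this coordinated joint action.
def T(s, a):
    return {"door-open": 1.0} if s == "door-closed" and a == ("push", "push") else {s: 1.0}

def R(s, a):
    return 1.0 if s == "door-closed" and a == ("push", "push") else 0.0

def O(s, a):
    # Each agent receives only its own local observation (here, noiseless).
    return (s, s)

problem = DecPOMDP(["door-closed", "door-open"], [["push", "wait"]] * 2, T, R, O)
s, r, obs = problem.step("door-closed", ("push", "push"), random.Random(0))
# s == "door-open", r == 1.0, obs == ("door-open", "door-open")
```

A macro-action variant, as studied in the paper, would replace the primitive per-agent actions with temporally extended options (e.g. "navigate to shelf"), which is what makes planning at this level practical for larger problems.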