Game-theoretic learning algorithm for a spatial coverage problem

In this paper we consider a class of dynamic vehicle routing problems, in which a number of mobile agents in the plane must visit target points generated over time by a stochastic process. It is desired to design motion coordination strategies in order to minimize the expected time between the appearance of a target point and the time it is visited by one of the agents. We cast the problem as a spatial game in which each agent's objective is to maximize the expected value of the "time spent alone" at the next target location, and show that the Nash equilibria of the game correspond to the desired agent configurations. We propose learning-based control strategies that, while making minimal or no assumptions on communications between agents as well as on the underlying distribution, provide the same level of steady-state performance achieved by the best known decentralized strategies.
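To make the payoff in the abstract concrete, the sketch below illustrates one plausible reading of it: for each sampled target, the closest agent earns the gap between the second-closest and closest arrival times (its "time spent alone" at the target), and agents adjust positions by a generic payoff-based trial-and-error rule. This is a hedged illustration only, not the authors' algorithm; the uniform target distribution, the unit-square workspace, and all function names (`utilities`, `learning_step`) are assumptions made for the example, whereas the paper treats a general stochastic target process and specific learning dynamics.

```python
import numpy as np

rng = np.random.default_rng(0)

def utilities(agents, targets, speed=1.0):
    """Monte Carlo estimate of each agent's expected 'time spent alone':
    for every sampled target, the closest agent earns the gap between the
    second-closest and closest arrival times; all other agents earn zero."""
    utils = np.zeros(len(agents))
    for t in targets:
        arrival = np.linalg.norm(agents - t, axis=1) / speed
        order = np.argsort(arrival)
        utils[order[0]] += arrival[order[1]] - arrival[order[0]]
    return utils / len(targets)

def learning_step(agents, i, targets, step=0.05):
    """Generic payoff-based trial-and-error move for agent i: propose a
    random nearby position and keep it only if agent i's own estimated
    utility improves (evaluated on the same target sample for a fair
    comparison). A stand-in for the paper's learning-based strategies."""
    trial = agents.copy()
    trial[i] = np.clip(agents[i] + rng.normal(0.0, step, size=2), 0.0, 1.0)
    if utilities(trial, targets)[i] > utilities(agents, targets)[i]:
        return trial
    return agents

# Three agents in the unit square; targets drawn uniformly (an assumption --
# the paper allows a general stochastic target process).
agents = rng.uniform(0.0, 1.0, size=(3, 2))
targets = rng.uniform(0.0, 1.0, size=(2000, 2))
for k in range(60):
    agents = learning_step(agents, k % len(agents), targets)
print(agents)  # positions spread out toward a coverage-like configuration
```

Note that each agent only ever evaluates its own payoff, which mirrors the abstract's point that the strategies make minimal or no assumptions on inter-agent communication: no agent needs the others' utilities, only their positions as reflected in the sampled arrival times.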

Bibliographic Details
Main Authors: Savla, Ketan (Contributor), Frazzoli, Emilio (Contributor)
Other Authors: Massachusetts Institute of Technology. Department of Aeronautics and Astronautics (Contributor), Massachusetts Institute of Technology. Laboratory for Information and Decision Systems (Contributor)
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers, 2010-12-17T16:38:31Z.
Online Access: Get fulltext
LEADER 01819 am a22002173u 4500
001 60305
042 |a dc 
100 1 0 |a Savla, Ketan  |e author 
100 1 0 |a Massachusetts Institute of Technology. Department of Aeronautics and Astronautics  |e contributor 
100 1 0 |a Massachusetts Institute of Technology. Laboratory for Information and Decision Systems  |e contributor 
100 1 0 |a Frazzoli, Emilio  |e contributor 
100 1 0 |a Savla, Ketan  |e contributor 
700 1 0 |a Frazzoli, Emilio  |e author 
245 0 0 |a Game-theoretic learning algorithm for a spatial coverage problem 
260 |b Institute of Electrical and Electronics Engineers,   |c 2010-12-17T16:38:31Z. 
856 |z Get fulltext  |u http://hdl.handle.net/1721.1/60305 
520 |a In this paper we consider a class of dynamic vehicle routing problems, in which a number of mobile agents in the plane must visit target points generated over time by a stochastic process. It is desired to design motion coordination strategies in order to minimize the expected time between the appearance of a target point and the time it is visited by one of the agents. We cast the problem as a spatial game in which each agent's objective is to maximize the expected value of the "time spent alone" at the next target location, and show that the Nash equilibria of the game correspond to the desired agent configurations. We propose learning-based control strategies that, while making minimal or no assumptions on communications between agents as well as on the underlying distribution, provide the same level of steady-state performance achieved by the best known decentralized strategies. 
546 |a en_US 
655 7 |a Article 
773 |t 47th Annual Allerton Conference on Communication, Control, and Computing, 2009. Allerton 2009