Learning to Plan via Deep Optimistic Value Exploration

Deep exploration requires coordinated long-term planning. We present a model-based reinforcement learning algorithm that guides policy learning through a value function that exhibits optimism in the face of uncertainty. We capture uncertainty over values by combining predictions from an ensemble of...

Bibliographic Details
Main Authors:	Seyde, Tim (Author), Schwarting, Wilko (Author), Karaman, Sertac (Author), Rus, Daniela L (Author)
Other Authors:	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory (Contributor), Massachusetts Institute of Technology. Laboratory for Information and Decision Systems (Contributor)
Format:	Article
Language:	English
Published:	2020-05-11T19:59:29Z
Subjects:	Article