Learning to Plan via Deep Optimistic Value Exploration
Deep exploration requires coordinated long-term planning. We present a model-based reinforcement learning algorithm that guides policy learning through a value function that exhibits optimism in the face of uncertainty. We capture uncertainty over values by combining predictions from an ensemble of...
Main Authors: | Seyde, Tim (Author), Schwarting, Wilko (Author), Karaman, Sertac (Author), Rus, Daniela L (Author) |
---|---|
Other Authors: | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory (Contributor), Massachusetts Institute of Technology. Laboratory for Information and Decision Systems (Contributor) |
Format: | Article |
Language: | English |
Published: | 2020-05-11T19:59:29Z |
Online Access: | Get fulltext |
Similar Items
- Stochastic Dynamic Games in Belief Space
  by: Schwarting, Wilko, et al.
  Published: (2022)
- Semi-Cooperative Control for Autonomous Emergency Vehicles
  by: Buckman, Noam, et al.
  Published: (2022)
- Sharing is Caring: Socially-Compliant Autonomous Intersection Negotiation
  by: Buckman, Noam, et al.
  Published: (2020)
- Parallel Autonomy in Automated Vehicles: Safe Motion Generation with Minimal Intervention
  by: Schwarting, Wilko, et al.
  Published: (2017)
- Variational Autoencoder for End-to-End Control of Autonomous Driving with Novelty Detection and Training De-biasing
  by: Amini, Alexander, et al.
  Published: (2018)