id ndltd-OhioLink-oai-etd.ohiolink.edu-case160768140424212
record_format oai_dc
spelling ndltd-OhioLink-oai-etd.ohiolink.edu-case1607681404242122021-08-03T07:16:40Z Biased Exploration in Offline Hierarchical Reinforcement Learning Miller, Eric D. Computer Science Artificial Intelligence machine learning reinforcement learning offline learning biased sampling sampling hierarchy task hierarchy hierarchical reinforcement learning rl hrl exploration optimism offline reinforcement learning One way of giving prior knowledge to a reinforcement learning agent is through a task hierarchy. When collecting data for offline learning with a task hierarchy, the structure of the hierarchy determines the distribution of the data. In some cases, the hierarchy's structure skews the data distribution so that learning an effective policy from the collected data requires many samples. In this thesis, we address this problem. First, we determine the conditions under which the hierarchy structure causes some actions to be sampled with low probability, and we describe when this sampling distribution delays convergence. Second, we present three biased sampling algorithms that address the problem. These algorithms employ the novel strategy of exploring a different hierarchical MDP from the one in which the policy is to be learned. Exploring in these new MDPs improves the sampling distribution and the rate at which the learned policy converges to the optimal policy in the original MDP. Finally, we evaluate all of our methods, along with several baselines, on a range of reinforcement learning problems. Our experiments show that our methods outperform the baselines, often significantly, when the hierarchy has a problematic structure. Furthermore, the experiments identify trade-offs between the proposed methods and suggest scenarios in which each method should be used.
2021-01-26 English text Case Western Reserve University School of Graduate Studies / OhioLINK http://rave.ohiolink.edu/etdc/view?acc_num=case160768140424212 http://rave.ohiolink.edu/etdc/view?acc_num=case160768140424212 restricted--full text unavailable until 2022-01-15 This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws.
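The abstract's core observation, that a task hierarchy can skew the distribution of primitive actions in offline data even when the root policy is uniform, can be illustrated with a toy sketch. This is a minimal hypothetical example, not the thesis's actual algorithms or environments: a two-level hierarchy where one subtask exposes three primitive actions and another exposes one, compared against a simple count-based bias (in the spirit of optimism toward undersampled actions) that flattens the per-action counts.

```python
# Toy illustration (hypothetical, not from the thesis): a task hierarchy
# skews the sampling distribution of primitive actions, and a count-based
# bias toward undersampled actions flattens it.
import random
from collections import Counter

random.seed(0)

# Subtask "a" exposes 3 primitive actions; subtask "b" exposes only 1,
# so under uniform choices "b1" is sampled ~3x as often as each a-action.
HIERARCHY = {"a": ["a1", "a2", "a3"], "b": ["b1"]}


def collect(episodes, biased=False):
    """Collect per-action sample counts under the hierarchy."""
    counts = Counter()
    for _ in range(episodes):
        if biased:
            # Root favors the subtask containing the least-sampled action,
            # then picks that action: a greedy bias toward sparse data.
            subtask = min(
                HIERARCHY, key=lambda s: min(counts[a] for a in HIERARCHY[s])
            )
            action = min(HIERARCHY[subtask], key=lambda a: counts[a])
        else:
            subtask = random.choice(list(HIERARCHY))  # root picks uniformly
            action = random.choice(HIERARCHY[subtask])
        counts[action] += 1
    return counts


uniform = collect(10_000)
biased = collect(10_000, biased=True)
print("uniform:", dict(uniform))  # "b1" dominates each a-action
print("biased: ", dict(biased))  # counts are balanced across all actions
```

The uniform run shows the structural skew the thesis describes, arising purely from the shape of the hierarchy; the biased run shows that changing only the exploration policy, not the hierarchy, equalizes the per-action data.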
collection NDLTD
language English
sources NDLTD
topic Computer Science
Artificial Intelligence
machine learning
reinforcement learning
offline learning
biased sampling
sampling
hierarchy
task hierarchy
hierarchical reinforcement learning
rl
hrl
exploration
optimism
offline reinforcement learning
spellingShingle Computer Science
Artificial Intelligence
machine learning
reinforcement learning
offline learning
biased sampling
sampling
hierarchy
task hierarchy
hierarchical reinforcement learning
rl
hrl
exploration
optimism
offline reinforcement learning
Miller, Eric D.
Biased Exploration in Offline Hierarchical Reinforcement Learning
author Miller, Eric D.
author_facet Miller, Eric D.
author_sort Miller, Eric D.
title Biased Exploration in Offline Hierarchical Reinforcement Learning
title_short Biased Exploration in Offline Hierarchical Reinforcement Learning
title_full Biased Exploration in Offline Hierarchical Reinforcement Learning
title_fullStr Biased Exploration in Offline Hierarchical Reinforcement Learning
title_full_unstemmed Biased Exploration in Offline Hierarchical Reinforcement Learning
title_sort biased exploration in offline hierarchical reinforcement learning
publisher Case Western Reserve University School of Graduate Studies / OhioLINK
publishDate 2021
url http://rave.ohiolink.edu/etdc/view?acc_num=case160768140424212
work_keys_str_mv AT millerericd biasedexplorationinofflinehierarchicalreinforcementlearning
_version_ 1719457906392301568