id ndltd-OhioLink-oai-etd.ohiolink.edu-case160768140424212
record_format oai_dc
spelling ndltd-OhioLink-oai-etd.ohiolink.edu-case1607681404242122021-08-03T07:16:40Z Biased Exploration in Offline Hierarchical Reinforcement Learning Miller, Eric D. Computer Science Artificial Intelligence machine learning reinforcement learning offline learning biased sampling sampling hierarchy task hierarchy hierarchical reinforcement learning rl hrl exploration optimism offline reinforcement learning One way of giving prior knowledge to a reinforcement learning agent is through a task hierarchy. When collecting data for offline learning with a task hierarchy, the structure of the hierarchy determines the distribution of the data. In some cases, the hierarchy's structure skews the data distribution so that learning an effective policy from the collected data requires many samples. In this thesis, we address this problem. First, we determine the conditions under which the hierarchy structure causes some actions to be sampled with low probability, and we describe when this sampling distribution delays convergence. Second, we present three biased sampling algorithms that address the problem. These algorithms employ the novel strategy of exploring a different hierarchical MDP from the one in which the policy is to be learned. Exploring in these new MDPs improves the sampling distribution and the rate at which the learned policy converges to the optimal policy in the original MDP. Finally, we evaluate all of our methods, along with several baselines, on a range of reinforcement learning problems. Our experiments show that our methods outperform the baselines, often significantly, when the hierarchy has a problematic structure. Furthermore, the experiments identify trade-offs between the proposed methods and suggest scenarios in which each method should be used.
2021-01-26 English text Case Western Reserve University School of Graduate Studies / OhioLINK http://rave.ohiolink.edu/etdc/view?acc_num=case160768140424212 http://rave.ohiolink.edu/etdc/view?acc_num=case160768140424212 restricted--full text unavailable until 2022-01-15 This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws.
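The abstract's core observation, that a task hierarchy can skew the distribution of primitive actions in offline data even when the root policy is uniform, can be illustrated with a toy sketch. This is a minimal hypothetical example, not the thesis's actual algorithms or environments: a two-level hierarchy where one subtask exposes three primitive actions and another exposes one, compared against a simple count-based bias (in the spirit of optimism toward undersampled actions) that flattens the per-action counts.

```python
# Toy illustration (hypothetical, not from the thesis): a task hierarchy
# skews the sampling distribution of primitive actions, and a count-based
# bias toward undersampled actions flattens it.
import random
from collections import Counter

random.seed(0)

# Subtask "a" exposes 3 primitive actions; subtask "b" exposes only 1,
# so under uniform choices "b1" is sampled ~3x as often as each a-action.
HIERARCHY = {"a": ["a1", "a2", "a3"], "b": ["b1"]}


def collect(episodes, biased=False):
    """Collect per-action sample counts under the hierarchy."""
    counts = Counter()
    for _ in range(episodes):
        if biased:
            # Root favors the subtask containing the least-sampled action,
            # then picks that action: a greedy bias toward sparse data.
            subtask = min(
                HIERARCHY, key=lambda s: min(counts[a] for a in HIERARCHY[s])
            )
            action = min(HIERARCHY[subtask], key=lambda a: counts[a])
        else:
            subtask = random.choice(list(HIERARCHY))  # root picks uniformly
            action = random.choice(HIERARCHY[subtask])
        counts[action] += 1
    return counts


uniform = collect(10_000)
biased = collect(10_000, biased=True)
print("uniform:", dict(uniform))  # "b1" dominates each a-action
print("biased: ", dict(biased))  # counts are balanced across all actions
```

The uniform run shows the structural skew the thesis describes, arising purely from the shape of the hierarchy; the biased run shows that changing only the exploration policy, not the hierarchy, equalizes the per-action data.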
collection NDLTD
language English
sources NDLTD
topic Computer Science
Artificial Intelligence
machine learning
reinforcement learning
offline learning
biased sampling
sampling
hierarchy
task hierarchy
hierarchical reinforcement learning
rl
hrl
exploration
optimism
offline reinforcement learning
spellingShingle Computer Science
Artificial Intelligence
machine learning
reinforcement learning
offline learning
biased sampling
sampling
hierarchy
task hierarchy
hierarchical reinforcement learning
rl
hrl
exploration
optimism
offline reinforcement learning
Miller, Eric D.
Biased Exploration in Offline Hierarchical Reinforcement Learning
author Miller, Eric D.
author_facet Miller, Eric D.
author_sort Miller, Eric D.
title Biased Exploration in Offline Hierarchical Reinforcement Learning
title_short Biased Exploration in Offline Hierarchical Reinforcement Learning
title_full Biased Exploration in Offline Hierarchical Reinforcement Learning
title_fullStr Biased Exploration in Offline Hierarchical Reinforcement Learning
title_full_unstemmed Biased Exploration in Offline Hierarchical Reinforcement Learning
title_sort biased exploration in offline hierarchical reinforcement learning
publisher Case Western Reserve University School of Graduate Studies / OhioLINK
publishDate 2021
url http://rave.ohiolink.edu/etdc/view?acc_num=case160768140424212
work_keys_str_mv AT millerericd biasedexplorationinofflinehierarchicalreinforcementlearning
_version_ 1719457906392301568