Applying Deep Reinforcement Learning in Playing Simplified Scenarios of StarCraft II

Master's thesis === National Taipei University === Graduate Institute of Information Management === Academic year 107 === One of the objectives of deep reinforcement learning (RL) is to build an intelligent agent capable of making decisions and solving problems. Artificial intelligence (AI) has recently mastered Atari games and defeated professional Go players, so AI researchers have turned to more complex tasks: real-time strategy games. This study used the StarCraft II artificial intelligence learning environment co-developed by DeepMind and Blizzard Entertainment to train and develop intelligent agents capable of playing StarCraft II. We used the Asynchronous Advantage Actor-Critic (A3C) algorithm to train the RL model and compared two action-selection methods: the probability-based exploration of the original A3C and the ε-greedy method. The results show that the ε-greedy method learns a good game strategy faster; however, given more training time, the probability-based exploration method eventually performs better. We also propose a way to expedite training: we compare the growth rate of the cumulative rewards of the previous batch with that of the current batch, and decrease the learning rate when the difference exceeds a predefined threshold. The results show that with a batch size of 20 and a decrement threshold of 30% to 40%, the agent can be trained more efficiently. Because learning the whole game of StarCraft II takes tremendous computing resources, we instead used mini-games to train the RL models. The research environment established by this study can be used to train agents on whole games in future studies.
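The two action-selection strategies compared in the abstract can be sketched as follows. This is a minimal illustration, not code from the thesis; the function names and the ε value are assumptions:

```python
import random

random.seed(0)

def sample_from_policy(action_probs):
    """Probability exploration (original A3C): sample an action index
    in proportion to the policy network's output distribution."""
    return random.choices(range(len(action_probs)), weights=action_probs, k=1)[0]

def epsilon_greedy(action_values, epsilon=0.1):
    """Epsilon-greedy: exploit the best-valued action most of the time,
    but pick a uniformly random action with probability epsilon."""
    if random.random() < epsilon:
        return random.randrange(len(action_values))
    return max(range(len(action_values)), key=lambda i: action_values[i])
```

Early in training, ε-greedy exploits whatever looks best so far (which matches the abstract's finding that it learns a good strategy faster), while policy sampling keeps exploring in proportion to the learned distribution and can overtake it given more time.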
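The proposed learning-rate decrement heuristic can likewise be sketched. The abstract specifies only the batch size (20) and the threshold range (30% to 40%); the decay factor of 0.5 and the exact definition of the growth rate below are assumptions for illustration:

```python
def growth_rate(prev_cum_reward, curr_cum_reward):
    """Relative growth of the cumulative reward from one batch of
    episodes to the next (guarded against division by zero)."""
    return (curr_cum_reward - prev_cum_reward) / max(abs(prev_cum_reward), 1e-8)

def adjust_learning_rate(lr, growth_prev, growth_curr,
                         threshold=0.35, decay=0.5):
    """Decrease the learning rate when the change in cumulative-reward
    growth rate between consecutive batches exceeds the threshold
    (30%-40% in the thesis); the decay factor is an assumed value."""
    if abs(growth_curr - growth_prev) > threshold:
        return lr * decay
    return lr
```

The idea is that a large swing in the reward growth rate signals unstable updates, so shrinking the step size at that point lets the agent train more efficiently.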


Bibliographic Details
Main Authors: SHEN, YI-XING, 沈易星
Other Authors: CHEN, TSUNG-TENG
Format: Others
Language: zh-TW
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/rgkh4f
Original Title: 應用深度強化學習進行星海爭霸II之迷你遊戲
Thesis Type: 學位論文 (academic thesis), 63 pages
Collection: NDLTD