Applying Deep Reinforcement Learning in Playing Simplified Scenarios of StarCraft II

Master's thesis === National Taipei University === Graduate Institute of Information Management === Academic year 107 === One of the objectives of deep reinforcement learning (RL) is to build an intelligent agent capable of making decisions and solving problems. Artificial intelligence (AI) has recently mastered Atari games and defeated professional Go players, so AI researchers have turned to more complex tasks: real-time strategy games. This study used the StarCraft II artificial intelligence learning environment co-developed by DeepMind and Blizzard Entertainment to train and develop intelligent agents capable of playing StarCraft II. We used the Asynchronous Advantage Actor-Critic (A3C) algorithm to train the RL model and compared two action-selection methods: the probability-based exploration of the original A3C and the ε-greedy method. The results show that the ε-greedy method learns a good game strategy faster; however, given more training time, the probability-based exploration method eventually performs better. We also propose a way to expedite training: we compare the growth rate of the cumulative rewards of the previous batch with that of the current batch, and decrease the learning rate when the difference exceeds a predefined threshold. The results show that with a batch size of 20 and a decrement threshold of 30% to 40%, the agent can be trained more efficiently. Because learning the whole game of StarCraft II takes tremendous computing resources, we instead used mini-games to train the RL models. The research environment established by this study can be used to train agents on whole games in future studies.
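The two action-selection strategies compared in the abstract can be sketched as follows. This is a minimal illustration, not code from the thesis; the function names and the ε value are assumptions:

```python
import random

random.seed(0)

def sample_from_policy(action_probs):
    """Probability exploration (original A3C): sample an action index
    in proportion to the policy network's output distribution."""
    return random.choices(range(len(action_probs)), weights=action_probs, k=1)[0]

def epsilon_greedy(action_values, epsilon=0.1):
    """Epsilon-greedy: exploit the best-valued action most of the time,
    but pick a uniformly random action with probability epsilon."""
    if random.random() < epsilon:
        return random.randrange(len(action_values))
    return max(range(len(action_values)), key=lambda i: action_values[i])
```

Early in training, ε-greedy exploits whatever looks best so far (which matches the abstract's finding that it learns a good strategy faster), while policy sampling keeps exploring in proportion to the learned distribution and can overtake it given more time.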
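The proposed learning-rate decrement heuristic can likewise be sketched. The abstract specifies only the batch size (20) and the threshold range (30% to 40%); the decay factor of 0.5 and the exact definition of the growth rate below are assumptions for illustration:

```python
def growth_rate(prev_cum_reward, curr_cum_reward):
    """Relative growth of the cumulative reward from one batch of
    episodes to the next (guarded against division by zero)."""
    return (curr_cum_reward - prev_cum_reward) / max(abs(prev_cum_reward), 1e-8)

def adjust_learning_rate(lr, growth_prev, growth_curr,
                         threshold=0.35, decay=0.5):
    """Decrease the learning rate when the change in cumulative-reward
    growth rate between consecutive batches exceeds the threshold
    (30%-40% in the thesis); the decay factor is an assumed value."""
    if abs(growth_curr - growth_prev) > threshold:
        return lr * decay
    return lr
```

The idea is that a large swing in the reward growth rate signals unstable updates, so shrinking the step size at that point lets the agent train more efficiently.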


Bibliographic Details
Main Authors: SHEN, YI-XING, 沈易星
Other Authors: CHEN, TSUNG-TENG
Format: Others
Language: zh-TW
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/rgkh4f
Original Title: 應用深度強化學習進行星海爭霸II之迷你遊戲
Thesis Type: 學位論文 (academic thesis), 63 pages
Collection: NDLTD