Applying Deep Reinforcement Learning in Playing Simplified Scenarios of StarCraft II
Master's === National Taipei University === Graduate Institute of Information Management === 107 === One of the objectives of deep reinforcement learning (RL) is to build an intelligent agent capable of making decisions and solving problems. Recently, artificial intelligence (AI) has mastered Atari games and defeated professional Go players, so AI researchers have...
Main Authors: | SHEN, YI-XING (沈易星) |
---|---|
Other Authors: | CHEN, TSUNG-TENG (陳宗天) |
Format: | Others |
Language: | zh-TW |
Published: | 2019 |
Online Access: | http://ndltd.ncl.edu.tw/handle/rgkh4f |
id |
ndltd-TW-107NTPU0396011 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-107NTPU03960112019-07-27T03:39:23Z http://ndltd.ncl.edu.tw/handle/rgkh4f Applying Deep Reinforcement Learning in Playing Simplified Scenarios of StarCraft II 應用深度強化學習進行星海爭霸II之迷你遊戲 SHEN, YI-XING 沈易星 Master's National Taipei University Graduate Institute of Information Management 107 One of the objectives of deep reinforcement learning (RL) is to build an intelligent agent capable of making decisions and solving problems. Recently, artificial intelligence (AI) has mastered Atari games and defeated professional Go players, so AI researchers have turned to more complex tasks: real-time strategy games. This study used the StarCraft II AI learning environment co-developed by DeepMind and Blizzard Entertainment to train and develop intelligent agents capable of playing StarCraft II. Our study uses the Asynchronous Advantage Actor-Critic (A3C) algorithm to train the RL model. We compare two action-selection methods: the probability-based exploration of the original A3C and the ε-greedy method. The results show that the ε-greedy method learns a good game strategy faster; however, given more training time, the probability-based exploration method eventually performs better. We also propose a way to expedite training: we compare the growth rates of the cumulative rewards of the previous batch and the current batch, and decrease the learning rate when the difference exceeds a predefined threshold. The results show that with a batch size of 20 and a decrement threshold of 30% to 40%, the agent can be trained more efficiently. Because learning the whole game of StarCraft II requires tremendous computing resources, we instead use mini-games to train RL models. The research environment established by this study can be used to train agents on the whole game in future studies. CHEN, TSUNG-TENG 陳宗天 2019 thesis 63 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taipei University === Graduate Institute of Information Management === 107 === One of the objectives of deep reinforcement learning (RL) is to build an intelligent agent capable of making decisions and solving problems. Recently, artificial intelligence (AI) has mastered Atari games and defeated professional Go players, so AI researchers have turned to more complex tasks: real-time strategy games. This study used the StarCraft II AI learning environment co-developed by DeepMind and Blizzard Entertainment to train and develop intelligent agents capable of playing StarCraft II.
Our study uses the Asynchronous Advantage Actor-Critic (A3C) algorithm to train the RL model. We compare two action-selection methods: the probability-based exploration of the original A3C and the ε-greedy method. The results show that the ε-greedy method learns a good game strategy faster; however, given more training time, the probability-based exploration method eventually performs better. We also propose a way to expedite training: we compare the growth rates of the cumulative rewards of the previous batch and the current batch, and decrease the learning rate when the difference exceeds a predefined threshold. The results show that with a batch size of 20 and a decrement threshold of 30% to 40%, the agent can be trained more efficiently.
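The two action-selection strategies compared above can be sketched as follows. This is a minimal illustration, not code from the thesis: it assumes the policy network emits a vector of action logits, and the function names, ε value, and random seed are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility (illustrative)

def probability_exploration(logits):
    """A3C-style selection: sample an action from the softmax policy."""
    z = np.exp(logits - logits.max())   # subtract max for numerical stability
    probs = z / z.sum()
    return int(rng.choice(len(logits), p=probs))

def epsilon_greedy(logits, epsilon=0.1):
    """With probability epsilon pick a uniformly random action (explore);
    otherwise pick the highest-scoring action (exploit)."""
    if rng.random() < epsilon:
        return int(rng.integers(len(logits)))
    return int(np.argmax(logits))
```

With ε = 0, ε-greedy always exploits; the softmax sampler, by contrast, keeps assigning some probability to every action, which matches the observed trade-off: faster early progress for ε-greedy, broader exploration (and eventually better strategies) for probability-based selection.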
Because learning the whole game of StarCraft II requires tremendous computing resources, we instead use mini-games to train RL models. The research environment established by this study can be used to train agents on the whole game in future studies.
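The learning-rate adjustment described above can be sketched as a simple rule applied once per batch of episodes. This is an assumed reading of the abstract, not the thesis's actual implementation: the helper names, the use of mean batch reward, and the decay factor of 0.5 are all illustrative choices.

```python
def growth_rate(prev_batch_reward, curr_batch_reward):
    """Relative growth of the mean cumulative reward between two
    consecutive batches (e.g. batches of 20 episodes)."""
    return (curr_batch_reward - prev_batch_reward) / abs(prev_batch_reward)

def maybe_decay_lr(lr, prev_growth, curr_growth, threshold=0.35, decay=0.5):
    """Decrease the learning rate when the change in reward growth rate
    between the previous batch and this batch exceeds the threshold
    (the study found 30%-40% effective); otherwise leave it unchanged."""
    if abs(curr_growth - prev_growth) > threshold:
        return lr * decay
    return lr
```

For example, if the growth rate jumps from 10% in one batch to 60% in the next, the 50-point difference exceeds a 35% threshold and the learning rate is reduced, damping oscillation once the agent's returns start changing sharply.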
|
author2 |
CHEN, TSUNG-TENG |
author_facet |
CHEN, TSUNG-TENG SHEN, YI-XING 沈易星 |
author |
SHEN, YI-XING 沈易星 |
spellingShingle |
SHEN, YI-XING 沈易星 Applying Deep Reinforcement Learning in Playing Simplified Scenarios of StarCraft II |
author_sort |
SHEN, YI-XING |
title |
Applying Deep Reinforcement Learning in Playing Simplified Scenarios of StarCraft II |
title_short |
Applying Deep Reinforcement Learning in Playing Simplified Scenarios of StarCraft II |
title_full |
Applying Deep Reinforcement Learning in Playing Simplified Scenarios of StarCraft II |
title_fullStr |
Applying Deep Reinforcement Learning in Playing Simplified Scenarios of StarCraft II |
title_full_unstemmed |
Applying Deep Reinforcement Learning in Playing Simplified Scenarios of StarCraft II |
title_sort |
applying deep reinforcement learning in playing simplified scenarios of starcraft ii |
publishDate |
2019 |
url |
http://ndltd.ncl.edu.tw/handle/rgkh4f |
work_keys_str_mv |
AT shenyixing applyingdeepreinforcementlearninginplayingsimplifiedscenariosofstarcraftii AT chényìxīng applyingdeepreinforcementlearninginplayingsimplifiedscenariosofstarcraftii AT shenyixing yīngyòngshēndùqiánghuàxuéxíjìnxíngxīnghǎizhēngbàiizhīmínǐyóuxì AT chényìxīng yīngyòngshēndùqiánghuàxuéxíjìnxíngxīnghǎizhēngbàiizhīmínǐyóuxì |
_version_ |
1719231565745094656 |