Proximal Policy Optimization in StarCraft


Bibliographic Details
Main Author: Liu, Yuefan
Other Authors: Zhong, Xiangnan
Format: Others
Language: English
Published: University of North Texas 2019
Subjects:
Online Access:https://digital.library.unt.edu/ark:/67531/metadc1505267/
id ndltd-unt.edu-info-ark-67531-metadc1505267
record_format oai_dc
spelling ndltd-unt.edu-info-ark-67531-metadc15052672021-07-22T05:23:57Z Proximal Policy Optimization in StarCraft Liu, Yuefan Proximal Policy Optimization Brood War Application Programming Interface Reinforcement Learning StarCraft Deep reinforcement learning is an area of research that has blossomed tremendously in recent years and has shown remarkable potential in computer games. Real-time strategy games have been an important field of artificial intelligence in games for several years. This thesis introduces an algorithm used to train agents to fight against computer bots. Games are excellent tools for testing deep reinforcement learning algorithms because they offer valuable insight into how well an algorithm can perform in an isolated environment without real-life consequences; moreover, real-time strategy games are a very complex genre that challenges artificial intelligence agents in both short-term and long-term planning. In this thesis, we review some history of deep learning and reinforcement learning, and then apply them to StarCraft. Proximal policy optimization (PPO) retains some of the benefits of trust region policy optimization (TRPO), but it is much simpler to implement, more general across environments, and has better sample complexity. The StarCraft environment, the Brood War Application Programming Interface (BWAPI), is open source and available for testing. The results show that PPO works well in BWAPI and trains units to defeat their opponents. The algorithm presented in the thesis is corroborated by experiments. University of North Texas Zhong, Xiangnan Li, Xinrong Yang, Tao 2019-05 Thesis or Dissertation vii, 70 pages Text local-cont-no: submission_1613 https://digital.library.unt.edu/ark:/67531/metadc1505267/ ark: ark:/67531/metadc1505267 English Use restricted to UNT Community Liu, Yuefan Copyright Copyright is held by the author, unless otherwise noted. All rights reserved.
collection NDLTD
language English
format Others
sources NDLTD
topic Proximal Policy Optimization
Brood War Application Programming Interface
Reinforcement Learning
StarCraft
spellingShingle Proximal Policy Optimization
Brood War Application Programming Interface
Reinforcement Learning
StarCraft
Liu, Yuefan
Proximal Policy Optimization in StarCraft
description Deep reinforcement learning is an area of research that has blossomed tremendously in recent years and has shown remarkable potential in computer games. Real-time strategy games have been an important field of artificial intelligence in games for several years. This thesis introduces an algorithm used to train agents to fight against computer bots. Games are excellent tools for testing deep reinforcement learning algorithms because they offer valuable insight into how well an algorithm can perform in an isolated environment without real-life consequences; moreover, real-time strategy games are a very complex genre that challenges artificial intelligence agents in both short-term and long-term planning. In this thesis, we review some history of deep learning and reinforcement learning, and then apply them to StarCraft. Proximal policy optimization (PPO) retains some of the benefits of trust region policy optimization (TRPO), but it is much simpler to implement, more general across environments, and has better sample complexity. The StarCraft environment, the Brood War Application Programming Interface (BWAPI), is open source and available for testing. The results show that PPO works well in BWAPI and trains units to defeat their opponents. The algorithm presented in the thesis is corroborated by experiments.
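The description above contrasts PPO with TRPO. The heart of PPO's simplicity is its clipped surrogate objective, which replaces TRPO's constrained optimization with simple clipping of the policy probability ratio. As a minimal illustrative sketch (not code from this thesis; names and values are our own, with eps=0.2 as in the original PPO paper):

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate objective (to be maximized).

    ratio:     pi_new(a|s) / pi_old(a|s), per sampled action
    advantage: estimated advantages A_t for those actions
    eps:       clip range; ratios outside [1-eps, 1+eps] earn
               no additional reward, removing the incentive to
               move the policy too far in one update.
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Taking the elementwise minimum gives a pessimistic bound
    # on the policy-improvement estimate.
    return float(np.mean(np.minimum(unclipped, clipped)))

# A ratio of 1.5 with positive advantage is capped at 1 + eps = 1.2:
value = ppo_clip_objective(np.array([1.5]), np.array([1.0]))
```

In practice the negative of this objective is minimized with stochastic gradient descent over several epochs per batch, which is what makes PPO much cheaper to implement than TRPO's second-order update.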
author2 Zhong, Xiangnan
author_facet Zhong, Xiangnan
Liu, Yuefan
author Liu, Yuefan
author_sort Liu, Yuefan
title Proximal Policy Optimization in StarCraft
title_short Proximal Policy Optimization in StarCraft
title_full Proximal Policy Optimization in StarCraft
title_fullStr Proximal Policy Optimization in StarCraft
title_full_unstemmed Proximal Policy Optimization in StarCraft
title_sort proximal policy optimization in starcraft
publisher University of North Texas
publishDate 2019
url https://digital.library.unt.edu/ark:/67531/metadc1505267/
work_keys_str_mv AT liuyuefan proximalpolicyoptimizationinstarcraft
_version_ 1719417574149586944