Proximal Policy Optimization in StarCraft


Bibliographic Details
Main Author: Liu, Yuefan
Other Authors: Zhong, Xiangnan
Format: Others
Language: English
Published: University of North Texas 2019
Subjects:
Online Access:https://digital.library.unt.edu/ark:/67531/metadc1505267/
id ndltd-unt.edu-info-ark-67531-metadc1505267
record_format oai_dc
spelling ndltd-unt.edu-info-ark-67531-metadc15052672021-07-22T05:23:57Z Proximal Policy Optimization in StarCraft Liu, Yuefan Proximal Policy Optimization Brood War Application Programming Interface Reinforcement Learning StarCraft Deep reinforcement learning is an area of research that has blossomed tremendously in recent years and has shown remarkable potential in computer games. Real-time strategy games have been an important field of artificial intelligence in games for several years. This thesis introduces an algorithm used to train agents to fight against computer bots. Games are excellent tools for testing deep reinforcement learning algorithms because they offer valuable insight into how well an algorithm can perform in an isolated environment without real-life consequences; moreover, real-time strategy games are a very complex genre that challenges artificial intelligence agents in both short-term and long-term planning. In this thesis, we review some history of deep learning and reinforcement learning, and then apply them to StarCraft. Proximal policy optimization (PPO) retains some of the benefits of trust region policy optimization (TRPO), but it is much simpler to implement, more general across environments, and has better sample complexity. The StarCraft environment, the Brood War Application Programming Interface (BWAPI), is open source and available for testing. The results show that PPO works well in BWAPI and trains units to defeat their opponents. The algorithm presented in the thesis is corroborated by experiments. University of North Texas Zhong, Xiangnan Li, Xinrong Yang, Tao 2019-05 Thesis or Dissertation vii, 70 pages Text local-cont-no: submission_1613 https://digital.library.unt.edu/ark:/67531/metadc1505267/ ark: ark:/67531/metadc1505267 English Use restricted to UNT Community Liu, Yuefan Copyright Copyright is held by the author, unless otherwise noted. All rights reserved.
collection NDLTD
language English
format Others
sources NDLTD
topic Proximal Policy Optimization
Brood War Application Programming Interface
Reinforcement Learning
StarCraft
spellingShingle Proximal Policy Optimization
Brood War Application Programming Interface
Reinforcement Learning
StarCraft
Liu, Yuefan
Proximal Policy Optimization in StarCraft
description Deep reinforcement learning is an area of research that has blossomed tremendously in recent years and has shown remarkable potential in computer games. Real-time strategy games have been an important field of artificial intelligence in games for several years. This thesis introduces an algorithm used to train agents to fight against computer bots. Games are excellent tools for testing deep reinforcement learning algorithms because they offer valuable insight into how well an algorithm can perform in an isolated environment without real-life consequences; moreover, real-time strategy games are a very complex genre that challenges artificial intelligence agents in both short-term and long-term planning. In this thesis, we review some history of deep learning and reinforcement learning, and then apply them to StarCraft. Proximal policy optimization (PPO) retains some of the benefits of trust region policy optimization (TRPO), but it is much simpler to implement, more general across environments, and has better sample complexity. The StarCraft environment, the Brood War Application Programming Interface (BWAPI), is open source and available for testing. The results show that PPO works well in BWAPI and trains units to defeat their opponents. The algorithm presented in the thesis is corroborated by experiments.
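The description above contrasts PPO with TRPO. The heart of PPO's simplicity is its clipped surrogate objective, which replaces TRPO's constrained optimization with simple clipping of the policy probability ratio. As a minimal illustrative sketch (not code from this thesis; names and values are our own, with eps=0.2 as in the original PPO paper):

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate objective (to be maximized).

    ratio:     pi_new(a|s) / pi_old(a|s), per sampled action
    advantage: estimated advantages A_t for those actions
    eps:       clip range; ratios outside [1-eps, 1+eps] earn
               no additional reward, removing the incentive to
               move the policy too far in one update.
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Taking the elementwise minimum gives a pessimistic bound
    # on the policy-improvement estimate.
    return float(np.mean(np.minimum(unclipped, clipped)))

# A ratio of 1.5 with positive advantage is capped at 1 + eps = 1.2:
value = ppo_clip_objective(np.array([1.5]), np.array([1.0]))
```

In practice the negative of this objective is minimized with stochastic gradient descent over several epochs per batch, which is what makes PPO much cheaper to implement than TRPO's second-order update.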
author2 Zhong, Xiangnan
author_facet Zhong, Xiangnan
Liu, Yuefan
author Liu, Yuefan
author_sort Liu, Yuefan
title Proximal Policy Optimization in StarCraft
title_short Proximal Policy Optimization in StarCraft
title_full Proximal Policy Optimization in StarCraft
title_fullStr Proximal Policy Optimization in StarCraft
title_full_unstemmed Proximal Policy Optimization in StarCraft
title_sort proximal policy optimization in starcraft
publisher University of North Texas
publishDate 2019
url https://digital.library.unt.edu/ark:/67531/metadc1505267/
work_keys_str_mv AT liuyuefan proximalpolicyoptimizationinstarcraft
_version_ 1719417574149586944