AlphaZero to Alpha Hero: A pre-study on Additional Tree Sampling within Self-Play Reinforcement Learning

In self-play reinforcement learning, an agent plays games against itself and, with the help of hindsight and retrospection, improves its policy over time. Building on this premise, AlphaZero famously became the strongest known Go, Shogi, and Chess entity by training a deep neural network from data...

Bibliographic Details
Main Authors: Carlsson, Fredrik; Öhman, Joey
Format: Others
Language: English
Published: KTH, Skolan för elektroteknik och datavetenskap (EECS) 2019
Subjects:
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-259200