Strength Adjustment and Assessment for MCTS-Based Programs

Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 107 === This paper proposes an approach to strength adjustment and assessment for MCTS-based game-playing programs. We modify an existing softmax policy with a strength index to choose moves. The most important modification is a mechanism which filters low quality mo...


Bibliographic Details
Main Authors: Liu, An-Jen, 劉安仁
Other Authors: Wu, I-Chen
Format: Others
Language: en_US
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/br8e7d
id ndltd-TW-107NCTU5394120
record_format oai_dc
spelling ndltd-TW-107NCTU53941202019-11-26T05:16:53Z http://ndltd.ncl.edu.tw/handle/br8e7d Strength Adjustment and Assessment for MCTS-Based Programs 以MCTS為基礎的遊戲程式之強度調整及棋力評估 Liu, An-Jen 劉安仁 Master's National Chiao Tung University Institute of Computer Science and Engineering 107 This paper proposes an approach to strength adjustment and assessment for MCTS-based game-playing programs. We modify an existing softmax policy with a strength index to choose moves. The most important modification is a mechanism that filters out low-quality moves by excluding those whose simulation count is lower than a pre-defined threshold ratio of the maximum simulation count. We show through theoretical analysis that, with a threshold ratio, the adjusted policy is guaranteed to choose moves whose strength exceeds a lower bound. The approach is applied to the Go programs ELF OpenGo and CGI. Experimental results show that the strength index is highly correlated with the empirical strength; for example, with a threshold ratio of 0.1, the strength index is linearly related to the Elo rating, with a regression error of 47.95 Elo. With index values within $\pm 2$, we can cover a strength range of about 800 Elo. We then present three methods to adjust strength and assess opponents' strengths dynamically. The strength adjustment and assessment methods were also tested in real-world scenarios with human players, ranging from professionals (strongest) to kyu-rank amateurs (weakest). In addition to offering a wide range of strengths, our method can consistently play at the same strength given the same strength index, allowing players to choose appropriate opponents with ease. With regard to assessment, our method can automatically adjust its strength to its opponent's and in turn measure the opponent's strength accurately within 15 games. To our knowledge, this result is state-of-the-art in terms of the range of strengths in Elo rating while maintaining a controllable relationship between the strength and a strength index.
Wu, I-Chen 吳毅成 2019 thesis 37 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 107 === This paper proposes an approach to strength adjustment and assessment for MCTS-based game-playing programs. We modify an existing softmax policy with a strength index to choose moves. The most important modification is a mechanism that filters out low-quality moves by excluding those whose simulation count is lower than a pre-defined threshold ratio of the maximum simulation count. We show through theoretical analysis that, with a threshold ratio, the adjusted policy is guaranteed to choose moves whose strength exceeds a lower bound. The approach is applied to the Go programs ELF OpenGo and CGI. Experimental results show that the strength index is highly correlated with the empirical strength; for example, with a threshold ratio of 0.1, the strength index is linearly related to the Elo rating, with a regression error of 47.95 Elo. With index values within $\pm 2$, we can cover a strength range of about 800 Elo. We then present three methods to adjust strength and assess opponents' strengths dynamically. The strength adjustment and assessment methods were also tested in real-world scenarios with human players, ranging from professionals (strongest) to kyu-rank amateurs (weakest). In addition to offering a wide range of strengths, our method can consistently play at the same strength given the same strength index, allowing players to choose appropriate opponents with ease. With regard to assessment, our method can automatically adjust its strength to its opponent's and in turn measure the opponent's strength accurately within 15 games. To our knowledge, this result is state-of-the-art in terms of the range of strengths in Elo rating while maintaining a controllable relationship between the strength and a strength index.
author2 Wu, I-Chen
author_facet Wu, I-Chen
Liu, An-Jen
劉安仁
author Liu, An-Jen
劉安仁
spellingShingle Liu, An-Jen
劉安仁
Strength Adjustment and Assessment for MCTS-Based Programs
author_sort Liu, An-Jen
title Strength Adjustment and Assessment for MCTS-Based Programs
title_short Strength Adjustment and Assessment for MCTS-Based Programs
title_full Strength Adjustment and Assessment for MCTS-Based Programs
title_fullStr Strength Adjustment and Assessment for MCTS-Based Programs
title_full_unstemmed Strength Adjustment and Assessment for MCTS-Based Programs
title_sort strength adjustment and assessment for mcts-based programs
publishDate 2019
url http://ndltd.ncl.edu.tw/handle/br8e7d
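
The move-selection rule summarized in the abstract can be illustrated with a short Python sketch. The abstract states only that a softmax policy is modified with a strength index and that moves whose simulation count falls below a threshold ratio of the maximum simulation count are excluded; this record does not give the exact form of the softmax. The sketch therefore assumes, purely for illustration, that each remaining move is chosen with probability proportional to its simulation count raised to the power of the strength index. The names filtered_policy, choose_move, strength_index, and threshold_ratio are hypothetical and are not taken from ELF OpenGo or CGI.

```python
import random
from typing import Dict, Hashable, Optional


def filtered_policy(
    counts: Dict[Hashable, int],
    strength_index: float,
    threshold_ratio: float,
) -> Dict[Hashable, float]:
    """Map each surviving move to a selection probability.

    counts holds the MCTS simulation count of every candidate move.
    ASSUMPTION: the weight of a move is (count / max_count) ** strength_index;
    the source only says a softmax policy with a strength index is used.
    """
    if not counts:
        raise ValueError("counts must not be empty")
    if not 0.0 <= threshold_ratio <= 1.0:
        raise ValueError("threshold_ratio must lie in [0, 1]")
    max_count = max(counts.values())
    if max_count <= 0:
        raise ValueError("at least one move needs a positive count")

    # Filtering step from the abstract: drop moves whose simulation count is
    # lower than threshold_ratio times the maximum simulation count.
    cutoff = threshold_ratio * max_count
    kept = {m: n for m, n in counts.items() if n >= cutoff and n > 0}

    # Assumed weighting: a larger index favours the most-simulated move,
    # index 0 is uniform over kept moves, a negative index favours weaker ones.
    weights = {m: (n / max_count) ** strength_index for m, n in kept.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}


def choose_move(
    counts: Dict[Hashable, int],
    strength_index: float,
    threshold_ratio: float,
    rng: Optional[random.Random] = None,
) -> Hashable:
    """Sample one move from the filtered policy."""
    rng = rng or random.Random()
    probs = filtered_policy(counts, strength_index, threshold_ratio)
    moves = list(probs)
    return rng.choices(moves, weights=[probs[m] for m in moves], k=1)[0]
```

For example, with counts {"a": 100, "b": 50, "c": 5} and a threshold ratio of 0.1, move "c" is excluded because 5 is below 0.1 × 100 = 10. Under the assumed weighting with a strength index of 1, "a" and "b" are then chosen with probabilities 2/3 and 1/3.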