AlphaZero for a Non-deterministic Game

doi:10.1109/TAAI.2018.00034

Title:	AlphaZero for a Non-deterministic Game
Authors:	Hsueh, Chu-Hsuan; Wu, I-Chen; Chen, Jr-Chang; Hsu, Tsan-sheng; Department of Computer Science
Keywords:	AlphaZero;non-deterministic game;Chinese dark chess;theoretical value
Issue date:	1-Jan-2018
Abstract:	The AlphaZero algorithm, developed by DeepMind, achieved superhuman levels of play in the games of chess, shogi, and Go by learning without domain-specific knowledge except game rules. This paper investigates whether the algorithm can also learn theoretical values and optimal plays for non-deterministic games. Since the theoretical values of such games are expected win rates, not a simple win, loss, or draw, it is worth investigating the ability of the AlphaZero algorithm to approximate the expected win rates of positions. This paper also studies how the algorithm is influenced by a set of hyper-parameters. The tested non-deterministic game is a reduced and solved version of Chinese dark chess (CDC), called 2x4 CDC. The experiments show that the AlphaZero algorithm converges nearly to the theoretical values and the optimal plays in many of the settings of the hyper-parameters. To our knowledge, this is the first research paper that applies the AlphaZero algorithm to non-deterministic games.
URI:	http://dx.doi.org/10.1109/TAAI.2018.00034 http://hdl.handle.net/11536/151040
ISSN:	2376-6816
DOI:	10.1109/TAAI.2018.00034
Journal:	2018 CONFERENCE ON TECHNOLOGIES AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE (TAAI)
Start page:	116
End page:	121
Appears in collections:	Conference Papers
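
Note on "expected win rates": the abstract says that the theoretical value of a position in a non-deterministic game is an expected win rate, not a plain win, loss, or draw. The sketch below illustrates that idea only; it is not the paper's method. The scoring convention (win = 1, draw = 1/2, loss = 0), the function names, and the probabilities are all assumptions made for the illustration.

```python
from fractions import Fraction

# Assumed scoring convention for this sketch: win = 1, draw = 1/2, loss = 0.
WIN, DRAW, LOSS = Fraction(1), Fraction(1, 2), Fraction(0)


def chance_node_value(outcomes):
    """Expected value of a chance event (for example, flipping a face-down piece).

    `outcomes` is a list of (probability, value) pairs, one per possible result.
    """
    if sum(p for p, _ in outcomes) != 1:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * v for p, v in outcomes)


def position_value(move_values):
    """Value of a position for the player to move: the best available move."""
    return max(move_values)


# Hypothetical position with two moves (numbers invented for illustration):
#   - a flip that leads to a win, a draw, or a loss with the given probabilities
#   - a quiet move that leads to a certain draw
flip_value = chance_node_value([
    (Fraction(1, 2), WIN),
    (Fraction(1, 4), DRAW),
    (Fraction(1, 4), LOSS),
])
quiet_value = DRAW

value = position_value([flip_value, quiet_value])
print(flip_value, quiet_value, value)
```

Here the flip is worth 1/2 + 1/8 = 5/8, which is neither a win, a draw, nor a loss, and it is the better move because 5/8 exceeds the 1/2 of the certain draw. A value of this fractional kind is what a learned value estimate would need to approximate in such a game.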