A General Approach to Combining MCTS with DCNN

Full metadata record

DC Field	Value	Language
dc.contributor.author	藍立呈	zh_TW
dc.contributor.author	吳毅成	zh_TW
dc.contributor.author	陳榮傑	zh_TW
dc.contributor.author	Lan, Li-Cheng	en_US
dc.contributor.author	Wu, I-Chen	en_US
dc.contributor.author	Chen, Rong-Jaye	en_US
dc.date.accessioned	2018-01-24T07:42:45Z	-
dc.date.available	2018-01-24T07:42:45Z	-
dc.date.issued	2017	en_US
dc.identifier.uri	http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070456513	en_US
dc.identifier.uri	http://hdl.handle.net/11536/142888	-
dc.description.abstract	DeepMind 為AlphaGo提出的一個搜尋演算法稱作APV-MCTS，它能非同步地結合Monte Carlo Tree Search (MCTS) 和Deep Convolutional Neural Networks (DCNN)。AlphaGo透過此演算法結合他們訓練的DCNN成為第一支成功擊敗圍棋人類職業棋士的圍棋AI程式。本篇主要是透過探討APV-MCTS的特性，並將其改成一個更一般化的演算法稱作GAPV-MCTS，以適用於更多不同的遊戲。我們以NoGo (一個圍棋的變種遊戲) 做為我們主要的實驗對象。在經過調整GAPV-MCTS裡的參數後，GAPV-MCTS在用同一組DCNN的情況下，相較於APV-MCTS可以多進步約220 ELO (勝率77%)。	zh_TW
dc.description.abstract	The Asynchronous Policy and Value MCTS algorithm (APV-MCTS), proposed by DeepMind, is the search algorithm used in AlphaGo; it asynchronously combines Monte Carlo Tree Search (MCTS) with Deep Convolutional Neural Networks (DCNN). With APV-MCTS and its trained DCNNs, AlphaGo became the first Go AI program to defeat professional human Go players. In this thesis, we examine the properties of APV-MCTS and propose General APV-MCTS (GAPV-MCTS), a generalization of APV-MCTS intended to apply to a wider range of games. We use NoGo (a variant of Go) as our main experimental subject. After tuning the parameters of GAPV-MCTS, it outperforms APV-MCTS by about 220 Elo (a 77% win rate) when both use the same DCNNs.	en_US
dc.language.iso	en_US	en_US
dc.subject	蒙地卡羅樹搜尋	zh_TW
dc.subject	類神經網路	zh_TW
dc.subject	深度卷積類神經網路	zh_TW
dc.subject	MCTS	en_US
dc.subject	DCNN	en_US
dc.subject	AlphaGo	en_US
dc.subject	APV-MCTS	en_US
dc.subject	GAPV-MCTS	en_US
dc.title	蒙地卡羅樹搜尋與深度卷積類神經網路之一般化結合	zh_TW
dc.title	A General Approach to Combining MCTS with DCNN	en_US
dc.type	Thesis	en_US
dc.contributor.department	網路工程研究所	zh_TW
Appears in Collections:	Thesis
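
The abstract pairs an Elo gain with a win rate (about 220 Elo, 77%). The record does not say how the thesis converted between the two, so the following is only a minimal sketch that assumes the standard logistic Elo model. The function names `elo_expected_score` and `elo_difference` are hypothetical and chosen here for illustration.

```python
import math


def elo_expected_score(diff: float) -> float:
    """Expected score of the stronger side for a given Elo difference.

    Assumes the standard logistic Elo model (base 10, scale 400);
    the thesis record does not state which conversion was used.
    """
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


def elo_difference(win_rate: float) -> float:
    """Elo difference implied by a win rate strictly between 0 and 1."""
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win_rate must be strictly between 0 and 1")
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


if __name__ == "__main__":
    # Figures quoted in the abstract: about 220 Elo, 77% win rate.
    print(f"220 Elo -> expected score {elo_expected_score(220):.3f}")
    print(f"77% win rate -> {elo_difference(0.77):.1f} Elo")
```

Under this assumed model, a 220-point gap corresponds to an expected score of roughly 0.78, and a 77% win rate corresponds to roughly 210 Elo, which is in line with the "about 220 Elo" wording in the abstract.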