Full metadata record
DC Field | Value | Language
dc.contributor.author | 吳宏君 | zh_TW
dc.contributor.author | 吳毅成 | zh_TW
dc.contributor.author | Wu, Hung-Chun | en_US
dc.contributor.author | Wu, I-Chen | en_US
dc.date.accessioned | 2018-01-24T07:42:42Z | -
dc.date.available | 2018-01-24T07:42:42Z | -
dc.date.issued | 2017 | en_US
dc.identifier.uri | http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070356169 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/142814 | -
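dc.description.abstract | Artificial intelligence has been developed in the field of Go programs for many years, yet it long struggled to surpass the level of top human players. In recent years, with the development of deep learning, machines have made breakthrough improvements in recognizing and classifying patterns in images. This has also allowed Go programs to reach professional strength. AlphaGo combines deep learning models with the concept of reinforcement learning so that the Go AI learns to judge the value of each board position. This overcomes the earlier difficulty of designing a value-estimation model for Go, whose complexity is too high, and greatly increases program strength. In this thesis, we design a deep reinforcement learning framework that uses policy gradient updates to train a policy network oriented toward playing strength. We also increase the depth and width of the model and incorporate residual networks, achieving greater strength. | zh_TW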
dc.description.abstract | Go-playing AI programs had been under development for many years, yet they could hardly compete with professional players. Recently, breakthroughs in deep learning have significantly improved machine classification and pattern recognition, making Go programs competitive with professional players. AlphaGo [Huang et al. 2016] combines deep learning and reinforcement learning to teach the program to evaluate game positions, which dramatically boosts its strength. In this thesis, we propose a deep reinforcement learning (DRL) framework in which the policy network is trained with policy gradient. We also apply residual networks to make the model deeper. The results show that the network beats Pachi, an MCTS-based program, with a 92.20% win rate, the highest win rate ever reported. | en_US
dc.language.iso | en_US | en_US
dc.subject | 深度學習 (deep learning) | zh_TW
dc.subject | 深度強化式學習 (deep reinforcement learning) | zh_TW
dc.subject | 圍棋 (Go) | zh_TW
dc.subject | deep learning | en_US
dc.subject | deep reinforcement learning | en_US
dc.subject | convolutional neural network | en_US
dc.subject | Go | en_US
dc.subject | AI | en_US
dc.title | 運用於圍棋之深度強化式學習設計 | zh_TW
dc.title | Design of Deep Reinforcement Learning for Playing Go | en_US
dc.type | Thesis | en_US
dc.contributor.department | 資訊科學與工程研究所 (Institute of Computer Science and Engineering) | zh_TW
Appears in Collections: Thesis