將蒙地卡羅樹狀搜尋應用於《星海爭霸》中的戰術決策

標題:	將蒙地卡羅樹狀搜尋應用於《星海爭霸》中的戰術決策 Applying Monte Carlo Tree Search for Tactical Decision-making in StarCraft
作者:	蔣承翰孫春在 Chiang, Cheng-Han Sun, Chuen-Tsai 多媒體工程研究所
關鍵字:	星海爭霸;即時戰略;蒙地卡羅樹搜尋;StarCraft;Real-time strategy;Monte Carlo tree search
公開日期:	2016
摘要:	本研究將以Monte Carlo tree search來處理即時戰略遊戲（real-time strategy）中的tactical decision-making，也就是操作一群單位和敵方的單位作戰。選定的遊戲為StarCraft，其在即時戰略遊戲中的地位，有如西洋棋在桌遊中的地位。之所以想嘗試MCTS在這個問題上的效果，是因為MCTS能夠以較少的專家知識，在龐大的狀態中使用類似統計採樣的方法，來找到好的選擇；同時，即時戰略遊戲強調即時性，這也讓能夠隨時停止搜尋的MCTS顯得更為適用。藉由UCT considering duration，這個因為加入了durative action而能夠處理即時遊戲的演算法，我們能夠推展出一套遊戲抽象化的方式，來做出一個供MCTS使用的模擬器。實驗首先找到MCTS適合的exploration constant和最大子節點數量，再分別交叉測試了幾種單位數量和單位種類，分別為4/8/16/32/50個單位，以及純粹遠距單位/純粹近戰單位/混合單位。實驗結果顯示，在evaluation function中，表現最好的是HPD，一個定義為雙方總生命值差值的函式，而非其他有加入傷害資訊的函式。加入更多單位資訊我認為是對的，但大概是加入的方式不對，效果才不盡理想。另外，MCTS在遠距單位上的表現比較好，對抗內建AI的勝率平均大於80%，而在近戰單位上的表現則是極差，勝率只有10-20%，很可能是因為，模擬器為了效能的顧慮，沒有考慮單位碰撞，而單位碰撞對於近戰策略的擬定至關重要，無法用這種方式估計。MCTS在單位過多的時候，效果也會大打折扣。這個結果對其他研究者來說，已經算是揭露了MCTS在即時戰略上的可能性，如果他們有更強大的運算資源，我想肯定會有更好的表現。對遊戲開發者來說，這個結果代表，他們可以藉著使用MCTS在自家開發的即時戰略遊戲上，來做出強度更高、更有趣的電腦玩家。 In this thesis, we apply Monte Carlo tree search for tactical decision-making in StarCraft, which is about controlling units to combat with opponent’s units in real-time strategy games. MCTS can use less expert knowledge to achieve high performance and adaptive plays. And, for real-time games, MCTS can stop searching at any time to return the current best move. Applying UCT considering duration, which considers durative moves in UCT, we can implement a simulator for MCTS with reasonable game abstraction. In experiments, we found the value for exploration constant and the amount of max children for nodes in tree first. Then we tested different amount of units and different kinds of units, including 4/8/16/32/50 units and pure ranged/pure melee/mixed units’ compositions. First, the results showed that the best evaluation function is HPD, which was defined as the difference of total hitpoints of both sides, rather than other functions considering units’ damage. It’s not wrong to consider more information of units, but perhaps it’s suitable to use them that way. Second, MCTS did more well on ranged units than melee units. Because we needed better performance of simulator, ignoring unit collisions seemed to prevent it from finding good strategy for melee units. And third, MCTS couldn’t handle too much units. This research prevails the applicability of MCTS to real-time strategy. If other researchers can afford more computational resources, it is certain that this method would show better performance and results. On the other hand, for game developers, they can use this method to implement a more interesting and better built-in AI of their RTS games.
URI:	http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070356627 http://hdl.handle.net/11536/138921
顯示於類別：	畢業論文