Best Arm Identification for Heuristic-Based Multi-Armed Bandits

Full metadata record

DC Field	Value	Language
dc.contributor.author	葉騉豪	zh_TW
dc.contributor.author	吳毅成	zh_TW
dc.contributor.author	Yeh, Kun-Hao	en_US
dc.date.accessioned	2018-01-24T07:37:59Z	-
dc.date.available	2018-01-24T07:37:59Z	-
dc.date.issued	2016	en_US
dc.identifier.uri	http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070356038	en_US
dc.identifier.uri	http://hdl.handle.net/11536/139410	-
dc.description.abstract	這篇論文提出多臂老虎機的一種變形問題，名為啟發式多臂老虎機。本篇論文假設啟發之存在且能用於幫助最佳解之辨識。我們針對此問題提出了一個推薦最佳解之模型。根據此模型，我們更提出動態的資源投資演算法並延伸成採取最小最大值更新之樹狀搜尋演算法。此二演算法皆無需任何參數且能在任意時間停止並推薦最佳解。我們將此二演算法與現有演算法在本篇提出的問題以及隨機生成之遊戲樹上做實驗。結果顯示本篇提出之方法在有啟發評估的幫助下，辨識出最佳解的表現比現有的演算法更好。	zh_TW
dc.description.abstract	This paper first presents a variant of the Multi-Armed Bandit (MAB) problem, called the heuristic-based MAB (H-MAB) problem, in which heuristics are available to help identify the best arm. We then propose a recommendation model for H-MAB. Based on H-MAB and this model, we propose a dynamic budget allocation algorithm for best arm identification and extend it to a tree search algorithm with minimax updates. Both algorithms are anytime and parameter-free. We validate the proposed model and algorithms with experiments on both H-MABs and random game trees, and show that these algorithms outperform state-of-the-art algorithms in terms of the probability of identifying the best arm.	en_US
dc.language.iso	en_US	en_US
dc.subject	多臂老虎機	zh_TW
dc.subject	啟發	zh_TW
dc.subject	heuristic	en_US
dc.subject	multi-armed bandits	en_US
dc.title	啟發式多臂老虎機之最佳解辨識	zh_TW
dc.title	Best Arm Identification for Heuristic-Based Multi-Armed Bandits	en_US
dc.type	Thesis	en_US
dc.contributor.department	資訊科學與工程研究所	zh_TW
Appears in Collections:	Theses