An Online Learning Method Based on the Chinese Restaurant Process

Full metadata record

DC Field	Value	Language
dc.contributor.author	蔡宗勳	en_US
dc.contributor.author	Tsai, Tsung-Hsun	en_US
dc.contributor.author	李嘉晃	en_US
dc.contributor.author	Lee, Chia-Hoang	en_US
dc.date.accessioned	2015-11-26T01:04:28Z	-
dc.date.available	2015-11-26T01:04:28Z	-
dc.date.issued	2013	en_US
dc.identifier.uri	http://140.113.39.130/cdrfb3/record/nctu/#GT070056109	en_US
dc.identifier.uri	http://hdl.handle.net/11536/73258	-
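dc.description.abstract	Data in many fields have gradually grown into big data, and many traditional machine learning methods can no longer handle them. Online learning methods update the model dynamically and need to load only one data instance into memory at a time, so they can process large amounts of data in real time and are one way of addressing big data. Moreover, when handling big data it is difficult to fix parameters before training the model; model parameters can often be obtained only through expert experience or experimental testing. Bayesian nonparametric models provide a way for the number of clusters to be determined by the characteristics of the data itself, which makes them suitable for big data. The Chinese Restaurant Process was originally a stochastic process in probability theory describing a distribution over partitions of a space; when mapped to the process of sampling from a Dirichlet Process, it can draw multiple sets of parameters from one distribution, with each set of parameters in turn representing a distribution. The method proposed in this thesis extends the concept of online learning to the Chinese Restaurant Process and uses each training instance seen during online learning to influence the estimation of the parameters of the probabilistic model, thereby building the whole model. In the experiments, when the amount of data is large, the proposed Online CRP not only reaches the level of supervised learning methods in classification performance but also runs faster than many methods, which verifies that the method can handle big data problems accurately and efficiently.	zh_TW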
dc.description.abstract	The rise of big data gives enterprises an opportunity to use data analytics to gain competitive advantage, but it also makes large data sets challenging to process, manage, and analyze. One typical challenge is processing large volumes of streaming data in real time. Online machine learning lets a model learn from one instance at a time, updating the model according to the prediction result and the true label of that instance. Compared with batch machine learning algorithms, online machine learning is better suited to streaming data, and it can adjust the learned model as new, previously unseen data arrive. Besides online processing, parameter selection is an important part of model selection in machine learning, but it is generally done with heuristic rules or cross-validation on a validation set. In big data processing, parameters should adapt to the data rather than stay fixed. Nonparametric Bayesian models provide a means for the model to adapt its parameters to the data. This study proposes an online Chinese Restaurant Process algorithm, which is extended from the Chinese Restaurant Process (CRP). The proposed algorithm is both online and nonparametric, so it can process streaming data efficiently while its parameters adapt to the data. In contrast to the CRP, the proposed algorithm is online: we use regret theory to design a new prior and likelihood function based on the consistency between the true label and the prediction result. In the experiments, the proposed algorithm works well on large data sets and generally outperforms the other online machine learning algorithms.	en_US
dc.language.iso	zh_TW	en_US
dc.subject	在線學習	zh_TW
dc.subject	中國餐廳過程	zh_TW
dc.subject	分類	zh_TW
dc.subject	無母數模型	zh_TW
dc.subject	Online Learning	en_US
dc.subject	Chinese Restaurant Process	en_US
dc.subject	Classification	en_US
dc.subject	Non-parametric	en_US
dc.title	基於中國餐廳過程之在線學習方法	zh_TW
dc.title	Online Chinese Restaurant Process	en_US
dc.type	Thesis	en_US
dc.contributor.department	Institute of Computer Science and Engineering	zh_TW
Appears in Collections:	Thesis
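
The abstract refers to the Chinese Restaurant Process as the starting point of the proposed method. As a rough illustration only, the following Python sketch shows the textbook CRP seating rule processed one customer at a time: a new customer joins an existing table with probability proportional to its size, or opens a new table with probability proportional to a concentration parameter. This is not the thesis's Online CRP, whose regret-based prior and likelihood are not specified in this record; the names `crp_assign`, `crp_stream`, and `alpha` are hypothetical and chosen for this example.

```python
import random


def crp_assign(counts, alpha, rng):
    """Choose a table for the next customer under the standard CRP.

    counts: list of current table sizes (assumed illustrative state).
    alpha:  concentration parameter (> 0); larger values favor new tables.
    Returns an index into counts, or len(counts) to signal a new table.
    """
    total = sum(counts) + alpha
    r = rng.random() * total
    acc = 0.0
    # Existing table k is chosen with probability counts[k] / (n + alpha).
    for k, size in enumerate(counts):
        acc += size
        if r < acc:
            return k
    # Remaining probability mass alpha / (n + alpha) opens a new table.
    return len(counts)


def crp_stream(num_customers, alpha=1.0, seed=0):
    """Seat customers one at a time, as a streaming pass over the data."""
    rng = random.Random(seed)
    counts = []
    for _ in range(num_customers):
        k = crp_assign(counts, alpha, rng)
        if k == len(counts):
            counts.append(1)   # open a new table (new cluster)
        else:
            counts[k] += 1     # join an existing table
    return counts


if __name__ == "__main__":
    sizes = crp_stream(100, alpha=1.0, seed=0)
    print(len(sizes), "tables; sizes:", sizes)
```

The number of tables is not fixed in advance and grows with the data, which is the nonparametric property the abstract relies on; a label-aware variant along the lines of the thesis would need to replace the seating weights with its own prior and likelihood.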

Files in This Item:

610901.pdf

If the file is a zip archive, download and unzip it, then open index.html in the extracted folder with a browser to view the full text.