Title: Evaluating the Effectiveness of Random Forest Model
Author: 陳時仲
Chen, Shi-zhong
洪慧念
Hong, Hui-Nian
Institute of Statistics
Keywords: Random Forest
Date Issued: 2015
Abstract: Random forest is a popular machine learning algorithm whose model is composed of multiple decision trees. We first grow a specified number of decision trees (e.g., 100), then obtain the final prediction from the outputs of all trees, by majority vote for a categorical response or by averaging for a continuous response. The randomForest package in R makes random forests very convenient to use: once the number of trees (ntree) and the number of variables randomly sampled as candidates at each node split (mtry) are chosen, the data can be analyzed. In our analyses of real data (Chapter 3), random forest performed better than several statistical models. It can also identify important variables, which makes it a complete and convenient model.
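The aggregation step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's code and not the randomForest implementation. It only shows majority voting, averaging, and the random choice of mtry candidate variables at a node. The function names and the tie-breaking rule (first label seen wins) are assumptions made for illustration.

```python
import random
from collections import Counter
from statistics import mean


def aggregate_classification(tree_votes):
    """Majority vote over per-tree class predictions (categorical response)."""
    counts = Counter(tree_votes)
    # most_common(1) returns [(label, count)]; on a tie the label seen
    # first wins, which is an arbitrary choice made for this sketch.
    return counts.most_common(1)[0][0]


def aggregate_regression(tree_outputs):
    """Average of per-tree numeric predictions (continuous response)."""
    return mean(tree_outputs)


def candidate_features(n_features, mtry, rng):
    """Pick mtry distinct variable indices to consider at one node split."""
    return rng.sample(range(n_features), mtry)


if __name__ == "__main__":
    # Hypothetical predictions from five trees for a single observation.
    print(aggregate_classification(["yes", "no", "yes", "yes", "no"]))
    print(aggregate_regression([2.0, 3.0, 4.0]))
    print(candidate_features(n_features=10, mtry=3, rng=random.Random(0)))
```

In a real forest each tree is fitted to its own bootstrap sample, and the candidate variables are redrawn at every node. The sketch leaves tree fitting out entirely and covers only how the per-tree results are combined.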
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT070252616
http://hdl.handle.net/11536/126248
Appears in Collections: Thesis