Title: | SVM在基因選擇之研究 (A Study on Support Vector Machines in Gene Selection) |
Authors: | 薛瑛萱 (Ying-Hsuan Hsueh); 洪志真 (Jyh-Jen Horng Shiau); 洪慧念 (Hui-Hien Hung); Institute of Statistics |
Keywords: | SVM; support vector machines; gene selection |
Issue Date: | 2002 |
Abstract: | Microarray datasets typically contain expression data on 5,000 to 10,000 genes for fewer than 100 tumor samples. This is a "large p, small n" problem: a statistical problem with a very large number of variables (genes) must be solved using a small number of observations (tumor samples).
We review several approaches that statisticians have taken to this kind of problem.
Our focus is on support vector machines (SVM), a popular method in microarray data analysis. Using simulation experiments, we study three issues:
(1) how the weights of the genes in the decision function affect gene selection;
(2) the importance of data normalization; and
(3) the relationship between the decision function and the kernel function used in SVM. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#NT910337006 http://hdl.handle.net/11536/70036 |
Appears in Collections: | Thesis |
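
The first two issues named in the abstract (gene weights in the decision function, and normalization) can be illustrated with a minimal sketch. It is not the thesis's procedure: the data are synthetic, the linear SVM is a plain subgradient-descent solver written here for illustration, and every name and parameter (`standardize`, `train_linear_svm`, `INFORMATIVE`, `lam`, `lr`, `epochs`, the sample sizes) is an assumption. For a linear kernel the decision function is f(x) = w·x + b, so one common heuristic is to rank gene j by |w_j| after scaling each gene to zero mean and unit variance.

```python
import random

random.seed(0)

N, P = 40, 50            # hypothetical sizes: 40 samples, 50 genes
INFORMATIVE = [0, 1, 2]  # hypothetical genes that truly differ between classes

# Synthetic expression data: two balanced classes labelled +1 / -1.
y = [1 if i < N // 2 else -1 for i in range(N)]
X = [[random.gauss(0.0, 1.0) + (3.0 * y[i] if j in INFORMATIVE else 0.0)
      for j in range(P)] for i in range(N)]


def standardize(X):
    """Scale each gene (column) to zero mean and unit variance."""
    n, p = len(X), len(X[0])
    Z = [[0.0] * p for _ in range(n)]
    for j in range(p):
        col = [row[j] for row in X]
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5 or 1.0
        for i in range(n):
            Z[i][j] = (X[i][j] - mean) / sd
    return Z


def train_linear_svm(X, y, lam=0.01, lr=0.05, epochs=200):
    """Batch subgradient descent on the L2-regularized hinge loss."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:               # sample violates the margin
                for j in range(p):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


Z = standardize(X)
w, b = train_linear_svm(Z, y)

# Rank genes by the magnitude of their weight in the decision function.
ranking = sorted(range(P), key=lambda j: abs(w[j]), reverse=True)
print("Top-ranked genes:", ranking[:5])
```

Standardizing first matters for this heuristic because, without it, a gene measured on a larger scale can receive a small weight for reasons unrelated to how well it separates the classes. With a nonlinear kernel the decision function is no longer a single weight vector over genes, so this direct ranking does not carry over unchanged.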