A Study on Support Vector Machines in Gene Selection

Title:	A Study on Support Vector Machines in Gene Selection (SVM在基因選擇之研究)
Authors:	Ying-Hsuan Hsueh (薛瑛萱); Jyh-Jen Horng Shiau (洪志真); Hui-Hien Hung (洪慧念); Institute of Statistics
Keywords:	SVM; support vector machines; gene selection
Issue Date:	2002
Abstract:	Microarray datasets typically contain expression data on 5,000 to 10,000 genes for fewer than 100 tumor samples. This is a "large p, small n" problem: a statistical problem with a very large number of variables (genes) must be solved using a small number of observations (tumor samples). We review some of the approaches statisticians have taken to problems of this kind. Our focus is on support vector machines (SVM), a popular method in microarray data analysis. Through simulation experiments, this thesis studies three issues: (1) how the weights of the genes in the decision function affect gene selection; (2) the importance of data normalization; and (3) the relationship between the decision function and the kernel function used in SVM.
URI:	http://140.113.39.130/cdrfb3/record/nctu/#NT910337006 http://hdl.handle.net/11536/70036
Appears in Collections:	Thesis