Title: A Study on Support Vector Machines in Gene Selection (SVM在基因選擇之研究)
Author: 薛瑛萱 (Ying-Hsuan Hsueh)
洪志真 (Jyh-Jen Horng Shiau)
洪慧念 (Hui-Hien Hung)
Institute of Statistics
Keywords: SVM; support vector machines; gene selection
Date of issue: 2002
Abstract: Microarray datasets typically contain expression data on 5,000 to 10,000 genes for fewer than 100 samples. This is a "large p, small n" problem: a statistical problem with a very large number of variables (genes) must be solved using a small number of observations (tumor samples). We review some of the papers that deal with this problem. We focus on support vector machines (SVM), a popular method in microarray data analysis, and use simulation experiments to study three issues: (1) how the weights of the genes in the decision function affect gene selection; (2) the importance of data normalization; and (3) the relationship between the decision function and the kernel function used in SVM.
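To make the three issues in the abstract concrete, here is a minimal sketch of weight-based gene ranking with a linear SVM on simulated data. It is not the thesis's own procedure. The simulation design (which genes are informative, the size of the class shift), the omission of an intercept, the batch subgradient solver, and all function names and parameter values are assumptions chosen for illustration. With a linear kernel the decision function is f(x) = w·x, so each gene has an explicit weight w_j; with a nonlinear kernel there is in general no such per-gene weight vector in the original input space.

```python
import random

def simulate(n=60, p=50, informative=3, shift=2.0, seed=0):
    """Simulated expression data (illustrative assumption): only the first
    `informative` genes differ in mean between the two classes."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1      # balanced classes
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        for j in range(informative):
            row[j] += shift * label
        X.append(row)
        y.append(label)
    return X, y

def standardize(X):
    """Normalize each gene (column) to mean 0 and standard deviation 1."""
    n, p = len(X), len(X[0])
    Z = [row[:] for row in X]
    for j in range(p):
        col = [row[j] for row in X]
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
        sd = sd if sd > 0 else 1.0
        for i in range(n):
            Z[i][j] = (X[i][j] - mean) / sd
    return Z

def decision(w, x):
    """Linear decision function f(x) = w . x (no intercept: the data are
    centered and the classes are balanced in this sketch)."""
    return sum(wj * xj for wj, xj in zip(w, x))

def train_linear_svm(X, y, lam=1.0, lr=0.1, epochs=100):
    """Batch subgradient descent on (lam/2)*||w||^2 + mean hinge loss."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(epochs):
        grad = [lam * wj for wj in w]
        for xi, yi in zip(X, y):
            if yi * decision(w, xi) < 1:     # sample violates the margin
                for j in range(p):
                    grad[j] -= yi * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def rank_genes(w):
    """Order gene indices by decreasing absolute weight |w_j|."""
    return sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)

X, y = simulate()
Z = standardize(X)
w = train_linear_svm(Z, y)
ranking = rank_genes(w)
print("top-ranked gene indices:", ranking[:5])
```

Skipping `standardize` and rescaling a single gene before training is a quick way to see why normalization matters: the weight on that gene, and therefore its rank, depends on the gene's measurement scale.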
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT910337006
http://hdl.handle.net/11536/70036
Appears in collections: Theses