Full metadata record
DC Field | Value | Language
dc.contributor.author | 張寶文 | en_US
dc.contributor.author | Bao-wen Chang | en_US
dc.contributor.author | 洪志真 | en_US
dc.contributor.author | 洪慧念 | en_US
dc.contributor.author | Jyh-Jen Horng Shiau | en_US
dc.contributor.author | Hui-Nien Hung | en_US
dc.date.accessioned | 2014-12-12T02:08:36Z | -
dc.date.available | 2014-12-12T02:08:36Z | -
dc.date.issued | 2003 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#GT009126506 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/55423 | -
dc.description.abstract | 微陣列資料集通常包含數千個基因,但只有數十個樣本。這種所謂“大p (基因),小n (樣本)”的特性會為統計分析帶來一些困難。基因選取是處理這類問題的一種典型方法。其中,Filters和wrappers是兩種常用的基因選取方法。Filters利用一個排序準則來判斷一個基因是否被選取;因此,這種方法在計算上非常快速,但可能選到高度相關的基因而造成冗贅。另一方面,wrappers通常能夠選取一個不冗贅的基因子集但卻需要龐大的運算量。這篇研究中採用上述二種方法的組合。我們先根據一個排序準則過濾掉對分類無益的基因,再利用K-means分群演算法對其餘基因分群以避免冗贅。然後,應用Guyon et al. (2002) 所提出的SVM-RFE基因選取方法於自每群選出的候選基因。最後,我們利用所提出的方法來分析三個常見的癌症資料集。其結果顯示,當選出的基因數目少時,我們的方法表現得比所討論的三種filters好。 | zh_TW
dc.description.abstract | A microarray dataset typically contains thousands of genes but only tens of subjects. This so-called “large p (genes), small n (subjects)” feature poses difficulties for statistical analysis. Gene selection is a typical approach to this problem. There are two conventional families of gene selection methods, filters and wrappers. Filters judge whether a gene should be selected based on a ranking criterion; they are therefore very fast to compute but may select highly correlated genes, which gives rise to redundancy. Wrappers, on the other hand, usually select a small set of non-redundant genes but require extensive computation. This study adopts a combination of the two. We first filter out irrelevant genes according to a ranking criterion and then group the remaining genes via the K-means clustering algorithm to avoid redundancy. The SVM-RFE gene selection method proposed by Guyon et al. (2002) is then applied to a list of candidate genes selected from each cluster. Three popular cancer data sets are analyzed with the proposed method. The results show that the proposed method performs better than the three filter methods under study when the number of selected genes is small. | en_US
dc.language.iso | en_US | en_US
dc.subject | 基因微陣列 | zh_TW
dc.subject | 基因選取 | zh_TW
dc.subject | 群集分析 | zh_TW
dc.subject | microarray | en_US
dc.subject | gene selection | en_US
dc.subject | clustering | en_US
dc.title | 在微陣列資料上利用基因分群以減少冗贅之基因選取方法 | zh_TW
dc.title | Redundancy-Reducing Feature Selection from Microarray Data Based on Gene-Grouping | en_US
dc.type | Thesis | en_US
dc.contributor.department | 統計學研究所 | zh_TW
Appears in Collections: Thesis


Files in This Item:

  1. 650601.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.
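
The first two stages described in the abstract (filtering by a ranking criterion, then grouping the surviving genes with K-means and taking candidates from each cluster) can be illustrated with a minimal sketch. This is not the thesis's implementation. The record does not say which ranking criterion was used, so the absolute signal-to-noise ratio below is an assumption, as are the function names, the toy data, the choice of one candidate per cluster, and clustering on raw (unstandardized) expression profiles. The final SVM-RFE stage is omitted; the candidate list returned here is what would be passed on to it.

```python
import random
from statistics import mean, stdev


def snr_score(values, labels):
    """Assumed ranking criterion: |mean0 - mean1| / (sd0 + sd1) for two classes.

    Needs at least two samples per class, because stdev() requires two points.
    """
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    denom = stdev(a) + stdev(b)
    return abs(mean(a) - mean(b)) / denom if denom > 0 else 0.0


def sq_dist(p, q):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    assign = None
    for _ in range(n_iter):
        new_assign = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        if new_assign == assign:  # assignments stable: converged
            break
        assign = new_assign
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [mean(col) for col in zip(*members)]
    return assign


def select_candidates(expr, labels, n_keep, k, per_cluster=1):
    """expr maps a gene name to its expression values, one value per sample."""
    # Stage 1 (filter): keep the n_keep genes with the highest ranking score.
    scores = {g: snr_score(v, labels) for g, v in expr.items()}
    kept = sorted(scores, key=scores.get, reverse=True)[:n_keep]
    # Stage 2 (grouping): cluster the kept genes by expression profile.
    assign = kmeans([expr[g] for g in kept], k)
    # Take the top-ranked gene(s) of each cluster as candidates.
    candidates = []
    for c in range(k):
        members = [g for g, a in zip(kept, assign) if a == c]  # still score-ordered
        candidates.extend(members[:per_cluster])
    return candidates


# Toy data (hypothetical): g1 and g2 are near-duplicates, g4 is uninformative.
labels = [0, 0, 0, 1, 1, 1]
expr = {
    "g1": [1.0, 1.1, 0.9, 5.0, 5.1, 4.9],
    "g2": [1.05, 1.1, 0.95, 5.0, 5.05, 4.95],
    "g3": [5.0, 5.2, 4.8, 1.0, 1.1, 0.9],
    "g4": [3.0, 2.9, 3.1, 3.0, 3.1, 2.9],
}
candidates = select_candidates(expr, labels, n_keep=3, k=2)
print(sorted(candidates))
```

In this toy run the filter drops the uninformative gene, and the grouping step keeps only one of the two near-duplicate genes, which is the redundancy a pure ranking filter would not remove.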