Title: Using EMT-related genes to predict lung cancer metastasis
EMT-related genes for prediction of lung cancer
Authors: 黃韻如
Huang, Yun-Ju
何信瑩
Ho, Shinn-Ying
Institute of Molecular Medicine and Bioengineering
Keywords: lung cancer (肺癌)
Issue Date: 2011
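Abstract: Lung cancer is one of the leading causes of cancer death worldwide; it ranks first among cancer deaths, accounting for as much as 29%. In addition, metastasis and drug resistance are among the main causes of treatment failure. Understanding the mechanisms of metastasis and drug resistance is therefore regarded as an important research direction for treating lung cancer. DNA and RNA microarray analysis can be used to understand or identify important pathogenic mechanisms, and it helps in discovering new tumor markers. However, identifying metastasis-related factors within complex biological regulatory mechanisms is not easy. Epithelial-mesenchymal transition (EMT) provides a new basis for understanding carcinogenesis. Our study targets EMT-related factors in lung cancer in order to predict metastasis and prognosis in lung cancer patients. We used genome-wide expression microarray data from lung cancer patients as the dataset for feature selection. We used the functional annotations of the IPA database to identify EMT-related genes, and then applied an inheritable bi-objective genetic algorithm combined with patients' disease-free survival information to find genes related to metastasis in lung cancer patients. We built experimental models for fitness functions with different weights and used different feature selection methods to find an optimal gene set. In the end we chose the fitness function weighted as 0.8*accuracy+0.2*disease-free area and, using sequential backward selection, obtained a feature set of 11 genes at the highest disease-free area and accuracy. This set of EMT-related lung cancer metastasis factors was obtained by combining patients' disease-free survival area with feature gene selection via IBCGA. In addition, the relevant literature confirms that the genes in this set are all associated with cancer research. Our study is the first to combine microarray information with disease-free survival area, together with an optimization algorithm, to identify EMT-related factors. The results provide a set of candidate genes for follow-up experiments and may help cancer treatment in the future.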
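The weighted fitness function and the sequential backward selection step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the thesis implementation: it assumes that accuracy and disease-free area are both normalized to [0, 1], and the function names `weighted_fitness` and `sequential_backward_selection` and the `score` callback are hypothetical.

```python
def weighted_fitness(accuracy, disease_free_area, w_acc=0.8, w_dfa=0.2):
    """Weighted sum of two scores; both are ASSUMED to lie in [0, 1]."""
    for name, value in (("accuracy", accuracy),
                        ("disease_free_area", disease_free_area)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return w_acc * accuracy + w_dfa * disease_free_area


def sequential_backward_selection(features, score, min_size=1):
    """Greedy backward elimination.

    Starting from all features, repeatedly drop the single feature whose
    removal yields the highest score, and remember the best subset seen.
    `score` is a caller-supplied function mapping a feature list to a number
    (hypothetical here; in practice it would train and evaluate a classifier).
    """
    current = list(features)
    best_subset, best_score = list(current), score(current)
    while len(current) > min_size:
        # Score every subset obtained by removing exactly one feature.
        candidates = [[f for f in current if f != drop] for drop in current]
        top_score, top_subset = max(
            ((score(c), c) for c in candidates), key=lambda pair: pair[0]
        )
        current = top_subset
        # Prefer the smaller subset when the score ties.
        if top_score >= best_score:
            best_score, best_subset = top_score, list(top_subset)
    return best_subset, best_score
```

In a real pipeline, `score` would wrap classifier training plus a call to `weighted_fitness`; changing `w_acc` and `w_dfa` corresponds to the different weight settings compared in the abstract.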
The central difficulties in microarray classification are the very small number of samples relative to the number of genes and the experimental variation in measured gene expression levels. In addition, growing evidence suggests that gene-based prediction is not stable, and little is known about the predictive power of gene expression profiles compared with well-known clinical and pathologic predictors. To reduce a pool of thousands of genes to a minimally redundant feature set, gene selection methods are essential for improving predictive accuracy and identifying potential marker genes for a disease. We propose a novel method for identifying a set of EMT-related genes associated with distant metastasis. In this study, we adopt the t-statistic for the initial feature selection task and use a Support Vector Machine (SVM) classifier. We combine gene expression and clinical outcome in the fitness function: in addition to gene expression values, the proposed method uses disease-free survival, which provides reliable and useful information about genes. Furthermore, the fitness function assigns different weights to accuracy and disease-free area, and criteria with larger weights have more influence on the fitness. The proposed method has been applied to a lung cancer dataset. The results show that our method improves classification accuracy on an independent test set. The use of clinical outcome can compensate, in part, for the limitations of microarrays, such as the small number of samples. The proposed method addresses the weakness of conventional methods by utilizing disease-free survival information, and it predicts marker genes for distant metastasis of lung cancer with high accuracy. Out of 474 possible genes related to EMT, we identified 11 candidate genes, of which about half have been experimentally verified in the literature.
The predictions made in this study can serve as a list of candidates for wet-lab verification and might help in the search for a cure for cancers.
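The initial t-statistic filter mentioned in the abstract can be illustrated with a short sketch. The abstract does not say which variant of the t-statistic was used, so this example assumes Welch's unequal-variance form; the function names, the dictionary layout, and the gene labels `g1` to `g3` are hypothetical, and the SVM step is not shown.

```python
from math import copysign, inf, sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic; each group needs at least two values."""
    diff = mean(group_a) - mean(group_b)
    denom = sqrt(variance(group_a) / len(group_a)
                 + variance(group_b) / len(group_b))
    if denom == 0.0:
        # No spread in either group: identical means carry no signal,
        # different means separate the classes perfectly.
        return 0.0 if diff == 0.0 else copysign(inf, diff)
    return diff / denom


def rank_genes_by_t(expression, labels, top_k):
    """Return the top_k genes by absolute t-statistic.

    expression: {gene_name: [value per sample]}
    labels:     [0 or 1 per sample], e.g. 1 = metastasis (assumed encoding)
    """
    scores = {}
    for gene, values in expression.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        scores[gene] = abs(welch_t(pos, neg))
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy data: g1 differs strongly between classes, g2 hardly at all.
toy_expression = {
    "g1": [1.0, 1.1, 0.9, 5.0, 5.1, 4.9],
    "g2": [2.0, 3.0, 2.5, 2.1, 2.9, 2.6],
    "g3": [1.0, 2.0, 3.0, 2.0, 3.0, 4.0],
}
toy_labels = [0, 0, 0, 1, 1, 1]
top_genes = rank_genes_by_t(toy_expression, toy_labels, top_k=2)
```

A filter of this kind only narrows the gene pool; the selected genes would then be passed to a classifier and a wrapper-style search for the final feature set.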
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079929518
http://hdl.handle.net/11536/49984
Appears in Collections: Thesis