Title: Identifying and analyzing microRNA signature sets for estimating the survival time of cancer patients
Identifying and analyzing the miRNA signatures of estimating survival time in patients with various cancers
Authors: 葉書霖
何信瑩
Yerukala, Sathipati Srinivasulu
Ho, Shinn-Ying
Institute of Bioinformatics and Systems Biology
Keywords: MiRNA expression; Support vector regression; Survival estimation; Ovarian cancer; Lung adenocarcinoma; Glioblastoma multiforme
Issue date: 2017
Abstract:
In recent years, advances in genome sequencing have produced massive amounts of nucleotide sequence data. These data have motivated the development of more effective and precise computational analysis methods that can support biological discovery. Bioinformatics is the science that standardizes and analyzes biological data using computational techniques. For example, microarray data can contain thousands of dimensions, and analyzing them helps biologists better understand the pathways and processes at work in the living cell. Machine learning is one of the important computational techniques for handling such data. The support vector machine (SVM) is a class of machine learning methods with two branches: support vector classification (SVC) and support vector regression (SVR). SVMs are now used extensively to solve biological problems. This dissertation develops novel methods for identifying miRNA signatures and predicting the survival time of cancer patients from microRNA (miRNA) expression profiles. The novel methods OSVR-GBM, OSVR-LUAD and OSVR-OVC are based on SVR combined with the optimal feature selection algorithm IBCGA, and they estimate the survival time of patients with glioblastoma multiforme (GBM), lung adenocarcinoma (LUAD) and ovarian cancer (OVC), respectively. Each method identifies a miRNA signature from a large number of miRNA expression profiles. For patients with GBM, OSVR-GBM achieved a correlation coefficient of 0.76 and a mean absolute error of 0.63 year between real and estimated survival time. We ranked the top 10 miRNAs by main effect difference (MED) analysis, and the identified miRNAs were shown to be significant in GBM survival. OSVR-LUAD identified 18 of 332 miRNAs that were associated with survival time in patients with lung adenocarcinoma.
OSVR-LUAD achieved a correlation coefficient of 0.88 ± 0.01 and a mean absolute error of 0.56 ± 0.03 year between real and estimated survival time, and it performed well compared with some well-recognized regression methods. Another survival time estimator, OSVR-OVC, identified 26 miRNAs (from 234 expression profiles) that were significantly associated with the survival of ovarian cancer patients. Under 10-fold cross-validation, OSVR-OVC achieved a correlation coefficient of 0.71 ± 0.02 and a mean absolute error of 1.04 ± 0.09 years (mean ± standard deviation). Furthermore, we identified the 10 most relevant of the 26 miRNAs according to their contribution to the survival prediction, and we analyzed the biological significance of the identified miRNAs. Finally, this dissertation explores the application of support vector regression to large parameter optimization problems, as embodied in the optimized SVR methods OSVR-GBM, OSVR-LUAD and OSVR-OVC for estimating the survival time of patients with different cancer types. The proposed optimized SVR methods can identify miRNA signatures associated with the survival time of patients with glioblastoma, lung adenocarcinoma and ovarian cancer. The results demonstrate that the proposed methods can improve the efficiency of survival estimation in these cancer types. Our approach to identifying miRNA signatures is expected to facilitate early-stage detection of glioblastoma, lung adenocarcinoma and ovarian cancer, and to improve treatment conditions.
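The abstract reports two evaluation measures, the correlation coefficient and the mean absolute error between real and estimated survival time, obtained under 10-fold cross-validation. The following is a minimal sketch of how such an evaluation could be computed; it is not the dissertation's implementation. The function names, the `fit_predict` interface, the toy data, and the one-feature least-squares regressor (a simple stand-in used here in place of SVR with IBCGA feature selection) are all assumptions introduced for illustration.

```python
import math
import random


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def mean_absolute_error(actual, predicted):
    """Average absolute difference, in the same unit as the inputs (e.g. years)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(features, survival_years, fit_predict, k=10, seed=0):
    """Return (correlation, MAE) over the pooled out-of-fold predictions.

    fit_predict(train_x, train_y, test_x) is a hypothetical interface that
    trains on one split and returns predictions for the held-out samples.
    """
    n = len(survival_years)
    predictions = [0.0] * n
    for fold in k_fold_indices(n, k, seed):
        held_out = set(fold)
        train = [i for i in range(n) if i not in held_out]
        fold_predictions = fit_predict(
            [features[i] for i in train],
            [survival_years[i] for i in train],
            [features[i] for i in fold],
        )
        for i, y_hat in zip(fold, fold_predictions):
            predictions[i] = y_hat
    return (pearson_r(survival_years, predictions),
            mean_absolute_error(survival_years, predictions))


def least_squares_line(train_x, train_y, test_x):
    """Stand-in regressor: one-feature ordinary least squares (NOT SVR)."""
    n = len(train_x)
    mean_x = sum(train_x) / n
    mean_y = sum(train_y) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
    sxx = sum((x - mean_x) ** 2 for x in train_x)
    slope = sxy / sxx
    return [mean_y + slope * (x - mean_x) for x in test_x]


# Toy data (fabricated for illustration only, not from the dissertation):
# one expression value per patient and a noisy linear survival time in years.
rng = random.Random(1)
expression = [rng.uniform(0.0, 10.0) for _ in range(50)]
survival = [1.0 + 0.3 * x + rng.gauss(0.0, 0.2) for x in expression]

r, mae = cross_validate(expression, survival, least_squares_line, k=10)
print(f"correlation coefficient: {r:.2f}, mean absolute error: {mae:.2f} years")
```

This sketch pools the out-of-fold predictions into a single pair of scores. The "mean ± standard deviation" figures in the abstract would instead require repeating the evaluation (for example with different `seed` values) and summarizing the resulting scores, which is an assumption about the protocol rather than something the abstract specifies.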
URI: http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070187204
http://hdl.handle.net/11536/142191
Appears in Collections: Thesis