Title: Iterative Partial Least Squares for Discovering Biomarkers of Diabetic Nephropathy (以迭代式偏最小平方法尋找糖腎病代謝物生物標記)
Authors: Chiu, Shih-Hsiang (邱仕翔)
Ho, Shinn-Ying (何信瑩)
Institute of Bioinformatics and Systems Biology
Keywords: Metabolomics; Diabetic Nephropathy; Biomarker; Partial Least Squares
Issue date: 2016
Abstract: Background: Diabetic nephropathy (DN) is a kidney disease caused by long-term diabetes mellitus (DM). In developed countries the number of patients with DN has increased year by year, placing a heavy financial burden on both patients and national health systems. Treatment at an early stage can slow progression and improve the condition, but early-stage DN has no obvious clinical symptoms, which makes early diagnosis difficult. Metabolomics analyzes differences in metabolite profiles to identify abnormal metabolites and help uncover the causes of disease. This study aims to identify potential biomarkers of DN by analyzing the metabolite profiles of patients at different stages of the disease.
Methods: We collected urine samples from 54 patients at different stages of DN (DM, DN1 and DN2; 18 patients per group). After metabolite profiles were obtained by LC-ESI-TOF-MS, a newly proposed iterative partial least squares (PLS) method was used to analyze metabolite differences between the groups, using pairwise classification models and a three-group regression model. The metabolites selected by the individual models were then intersected to find those that differed significantly in every comparison.
Results: The DM vs. DN1 classification model selected 44 metabolites and achieved an AUC of 0.8858 under 10-fold cross-validation. The DM vs. DN2 model selected 124 metabolites with an AUC of 0.9985, and the DN1 vs. DN2 model selected 30 metabolites with an AUC of 0.9552. The regression model built on all three stages selected 140 metabolites to predict patients' urinary albumin-to-creatinine ratio (UACR), with a root mean square error (RMSE) of 1.4836. Intersecting the selections yielded 15 metabolites that were chosen by every model, 10 of which showed an increasing or decreasing trend as DN severity increased.
Conclusions: The proposed method selects a smaller set of metabolites with better predictive performance than the conventional VIP > 1 selection method. The metabolites identified are related to the severity of DN, but the nature of that relationship is still unclear. Both the method and the identified metabolites should be validated in further research with a larger cohort.
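The final step described in the abstract (intersecting the metabolites selected by each model, then checking which of the shared metabolites rise or fall across stages) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the metabolite IDs, the intensity values, the helper names `shared_metabolites` and `trend`, and the rule that a "trend" means strictly ordered group means from DM to DN1 to DN2. It is not the thesis's implementation and uses none of its data.

```python
from statistics import mean

# Hypothetical metabolite IDs selected by each model (illustrative only).
selected = {
    "DM_vs_DN1": {"m01", "m02", "m03", "m07"},
    "DM_vs_DN2": {"m01", "m02", "m03", "m05", "m07"},
    "DN1_vs_DN2": {"m01", "m02", "m03", "m09"},
    "regression": {"m01", "m02", "m03", "m05", "m09"},
}

# Hypothetical intensities per stage for the metabolites of interest.
intensity = {
    "m01": {"DM": [1.0, 1.2], "DN1": [1.8, 2.0], "DN2": [2.9, 3.1]},
    "m02": {"DM": [5.0, 4.8], "DN1": [3.9, 4.1], "DN2": [2.0, 2.2]},
    "m03": {"DM": [1.0, 1.1], "DN1": [2.5, 2.4], "DN2": [1.5, 1.6]},
}

STAGES = ("DM", "DN1", "DN2")  # ordered by increasing severity


def shared_metabolites(selections):
    """Return the metabolites chosen by every model."""
    return set.intersection(*selections.values())


def trend(levels_by_stage):
    """Classify the stage-wise group means as increasing, decreasing, or none.

    Assumption: a trend requires strictly ordered means across all stages.
    """
    means = [mean(levels_by_stage[stage]) for stage in STAGES]
    if means[0] < means[1] < means[2]:
        return "increasing"
    if means[0] > means[1] > means[2]:
        return "decreasing"
    return "none"


shared = sorted(shared_metabolites(selected))
trends = {m: trend(intensity[m]) for m in shared}
print(shared)
print(trends)
```

With these made-up inputs, three metabolites are shared by all four models and two of them follow a stage-wise trend. Comparing group means is only one possible criterion; a rank-based test across stages would be a reasonable alternative.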
URI: http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070357218
http://hdl.handle.net/11536/140027
Appears in Collections: Thesis