Title: Validity and Reliability of Combinations of Preprocessing and Differential Expression Methods for Affymetrix GeneChip Microarrays
Authors: Ya-Li Wang (王雅莉)
Guan-Hua Huang
Keywords: Microarray; Affymetrix GeneChip; ROC curve
Publication date: 2006
Abstract: Microarray technology has been in wide use for several years, and many computational analysis tools have been developed for it. We focus on a widely used platform, the Affymetrix GeneChip array. To evaluate how combinations of preprocessing and differential expression methods perform, we consider four popular preprocessing methods (MAS 5.0, RMA, dChip, and PDNN) and five popular differential expression methods (fold-change, two-sample t-test, SAM, EBarrays, and limma). To assess validity, we use three spike-in datasets and evaluate each combination with receiver operating characteristic (ROC) curves. To assess reliability, we use a dataset from the MicroArray Quality Control (MAQC) project, generated by distributing samples to two different test sites that both use the Affymetrix platform; the overlap rate between the differentially expressed genes selected from the two sites' data serves as the reliability criterion. Taking both validity and reliability into account, we recommend six combinations when the number of differentially expressed genes is small: RMA+fold-change, RMA+SAM, RMA+limma, PDNN+fold-change, PDNN+SAM, and PDNN+limma. When the number of differentially expressed genes is large, we recommend three combinations: dChip(PM-only)+fold-change, dChip(PM-only)+SAM, and dChip(PM-only)+limma.
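The two evaluation criteria named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the dictionary layout of the log2 expression values, and the top-k definition of the overlap rate are all assumptions made here for illustration. Fold-change ranking stands in for any of the five differential expression methods, and the set of spiked-in genes plays the role of the known truth.

```python
from typing import Dict, List, Sequence, Set, Tuple


def rank_by_fold_change(log2_a: Dict[str, Sequence[float]],
                        log2_b: Dict[str, Sequence[float]]) -> List[str]:
    """Rank genes by absolute difference in mean log2 expression (largest first)."""
    scores = {}
    for gene in log2_a:
        mean_a = sum(log2_a[gene]) / len(log2_a[gene])
        mean_b = sum(log2_b[gene]) / len(log2_b[gene])
        scores[gene] = abs(mean_a - mean_b)
    return sorted(scores, key=scores.get, reverse=True)


def roc_points(ranked: Sequence[str], spiked: Set[str]) -> List[Tuple[float, float]]:
    """Validity: (false positive rate, true positive rate) as the cutoff moves down the ranking."""
    n_pos = sum(1 for gene in ranked if gene in spiked)
    n_neg = len(ranked) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one spiked-in and one non-spiked gene")
    tp = fp = 0
    points = [(0.0, 0.0)]
    for gene in ranked:
        if gene in spiked:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def overlap_rate(ranked_site1: Sequence[str],
                 ranked_site2: Sequence[str], k: int) -> float:
    """Reliability: share of the top-k genes selected at both test sites (assumed definition)."""
    top1 = set(ranked_site1[:k])
    top2 = set(ranked_site2[:k])
    return len(top1 & top2) / k
```

Under these assumptions, a combination with high validity yields a ROC curve close to the top-left corner on spike-in data, and a combination with high reliability yields an overlap rate close to 1 across the two test sites.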


