A Study on Simulating Realistic Gene Expression Microarray Data

Full metadata record

DC Field	Value	Language
dc.contributor.author	黃冠華	en_US
dc.contributor.author	Huang Guan-Hua	en_US
dc.date.accessioned	2014-12-13T10:41:43Z	-
dc.date.available	2014-12-13T10:41:43Z	-
dc.date.issued	2012	en_US
dc.identifier.govdoc	NSC101-2118-M009-004-MY2	zh_TW
dc.identifier.uri	http://hdl.handle.net/11536/98666	-
dc.identifier.uri	https://www.grb.gov.tw/search/planDetail?id=2590004&docId=391241	en_US
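dc.description.abstract	Microarrays have become a widely used genomic technology, and many analysis methods have emerged in response. We attempt to build empirical models that simulate the expression level of each gene; the simulated expression values can then be used to evaluate various analysis methods. To capture the diversity of tissues, we collect raw gene expression data stored in two databases, Gene Expression Omnibus and ArrayExpress, focusing on the HG-U133A GeneChip platform manufactured by Affymetrix. After preprocessing these data, we obtain empirical distribution models for the expression levels of 22,283 genes, and we use these 22,283 distributions to simulate gene expression values. In this project we will provide the steps of the simulation method, and we have simulated several sets of spike-in data with different numbers of arrays to observe the differences between the simulated and original expression values. The project will also use OpenMP and MPI parallel computing to shorten the time needed to preprocess a large number of gene chips, running the programs in three computing environments (a high-performance personal workstation, the National Center for High-performance Computing, and the Amazon EC2 cloud) to observe their parallel efficiency. The results and experience gained can be applied to future high-dimensional genomic data analysis. The project was originally planned to run for three years starting in the previous year, but only a one-year project was approved. We have now completed the use of high-performance parallel computing to preprocess a large number of gene chips, along with the evaluation of its efficiency. This year we will continue with the remaining unfinished parts.	zh_TW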
dc.description.abstract	Microarray gene expression analysis has become one of the most widely used functional genomics tools, and many analytical methods have been proposed as a result. It is desirable to develop realistic models that can simulate the expression values of each gene and can then be used to assess analysis methods and testing approaches. In this project, we plan to download publicly available raw data from the Affymetrix HG-U133A platform for various tissues from two public repositories: Gene Expression Omnibus and ArrayExpress. We then develop an empirical approach to determine the distribution of expression intensity for each gene, which can be used to simulate realistic gene expression data. The proposed method has several unique features that address the shortcomings of previous research. To evaluate the proposed simulation approach, we will examine the distributions of housekeeping genes, compare the simulated and real gene expression data, and simulate gene expression intensities that mimic the expression patterns shown in the HG-U133A tag spike-in dataset, in order to determine the sensitivity and specificity of various methods for detecting differential expression. This project also attempts to use OpenMP and MPI parallel computing to reduce computing time when reprocessing the large amount of downloaded microarray raw data. We will compare the parallel efficiency of OpenMP and MPI on a high-performance personal workstation, at the National Center for High-performance Computing, and in the Amazon EC2 cloud computing environment. The results and experience gained from this experiment can be applied to future high-dimensional genomic data computation. This study was proposed as a three-year project in 2011, but we obtained funding for only one year. Our team has finished implementing high-performance parallel computing for reprocessing the large amount of downloaded microarray raw data. We now plan to continue the remaining parts of the study in the coming two years.	en_US
dc.description.sponsorship	行政院國家科學委員會	zh_TW
dc.language.iso	zh_TW	en_US
dc.subject	艾菲爾基因晶片	zh_TW
dc.subject	雲端運算	zh_TW
dc.subject	基因表現量微陣列晶片	zh_TW
dc.subject	微陣列資料庫	zh_TW
dc.subject	平行運算	zh_TW
dc.subject	模擬	zh_TW
dc.subject	Affymetrix GeneChip	en_US
dc.subject	cloud computing	en_US
dc.subject	gene expression microarray	en_US
dc.subject	microarray data archive	en_US
dc.subject	parallel computing	en_US
dc.subject	simulation	en_US
dc.title	生成基因表現量晶片資料方法之研究	zh_TW
dc.title	A Study on Simulating Realistic Gene Expression Microarray Data	en_US
dc.type	Plan	en_US
dc.contributor.department	國立交通大學統計學研究所	zh_TW
Appears in Collections:	Research Plans