Title: 應用主成分分析於老鼠胚胎幹細胞免疫沈澱定序資料揭示轉錄因子與組蛋白修飾標記在特定區域的顯著關係
A Domain Constrained Principal Component Analysis of ChIP-Seq Data Reveals Significant Association between Transcription Factors and Histone Marks in Mouse Embryonic Stem Cells
Authors: 曾晴雯
Tseng, Chin-Wen
洪瑞鴻
Hung, Jui-Hung
Institute of Bioinformatics and Systems Biology
Keywords: enhancer; genome-wide; epigenetics; transcription factor; Principal Component Analysis
Issue Date: 2015
Abstract: Meta-analysis of genome-wide profiles of transcription factor binding, nucleosome occupancy, and epigenetic marks is a powerful approach to understanding complex regulatory circuitry on a large scale. Sophisticated regulation at the epigenetic level reveals selective binding and chromatin state switching on enhancers, transposons, pseudogenes, lncRNAs, and other elements, which in part explains the machinery of the underlying biological processes. We developed a computational framework that efficiently reveals profiles with similar characteristics in a high-dimensional domain under biological knowledge constraints. We applied this framework to ChIP-seq data from ENCODE and successfully clustered enhancer-related factors together, which validates the framework. By examining those clusters, we found that Ash2l and Oct4 co-localize on some enhancer regions. Through further in-depth analysis and wet-lab experiments, we verified that Ash2l and Oct4 co-localize on an enhancer region distant from the stemness gene Nanog and that their binding indeed enhances Nanog expression. Because the framework is simple and requires as input only the condition of interest and raw ChIP-seq data from ENCODE to cluster transcriptional elements by their characteristics under that condition, we recommend it for identifying cooperative relationships among transcriptional elements.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT070157222
http://hdl.handle.net/11536/125692
Appears in Collections: Thesis