Title: Identifying the Functions of Transcription Factors Based on DNA and Histone Modification Features
Systematic Identification of novel functional roles of transcription factors based on DNA features and histone codes
Authors: 戴璟淳
洪瑞鴻
Tai, Ching-Chun
Hung, Jui-Hung
Institute of Bioinformatics and Systems Biology
Keywords: Transcription factor; Enhancer; Histone modification; DNA
Issue Date: 2016
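Abstract: The application of high-throughput sequencing has moved beyond DNA and RNA sequencing and is gradually entering the field of epigenetics. DNA methylation, DNA-free regions, and transcription factor binding sites can all now be revealed by current genome-wide analysis techniques. DNase-seq, ChIP-seq, and bisulfite sequencing (Bisulfite-seq) are now routine experiments for studying epigenetic traits, and they give us the opportunity to work out the regulatory interplay between DNA and protein modifications. These epigenetic traits and marks describe how regulatory modules are built within the genome's encoding. The Encyclopedia of DNA Elements (ENCODE) is an international collaborative project that curates and deciphers these marks, known as the histone code, from hundreds of ChIP-seq datasets. However, how transcription factors interact with the histone code, and the functional mechanisms involved, remain largely unexplored, and an analytical method for extracting useful information from the genome-wide experiments in ENCODE or other databases to understand the epigenome is still lacking. In this master's thesis, we propose a systematic framework that extracts features of interest and clusters them to discover interactions between transcription factors and histone marks: tens of thousands of histone marks are quantified and transformed into lower-dimensional information in order to extract the most salient characteristics of each mark type; transcription factors that bind preferentially to certain types of histone marks are then analyzed together with the expression of the genes closest to those marks to infer their functions. We collected ChIP-seq data and applied the framework, finding an unexpected result for a Trithorax group protein, which demonstrates the usefulness and importance of the proposed framework.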
The application of high-throughput sequencing has moved beyond DNA and RNA sequencing and into epigenetics. DNA methylation, DNA-free regions, and transcription factor (TF) binding sites can now be revealed on a genome-wide scale and at base resolution. DNase-seq, ChIP-seq, and BS-seq are now routine experiments that make these epigenetic traits accessible, giving us an unprecedented opportunity to tease out the regulatory interplay among DNA, proteins, and modifications. These epigenetic traits, or marks, point to the building blocks of the regulatory modules encoded in the genome. The ENCODE (Encyclopedia of DNA Elements) project is an ambitious, ongoing international collaboration to curate and decipher these marks, also known as the histone code, from hundreds of TF and histone modification ChIP-seq datasets. However, how TFs interact with histone codes in ways that reflect their functional roles remains largely unexplored, and an analytical approach for extracting useful information from ENCODE's data warehouse or other genome-wide experimental collections to facilitate understanding of the epigenome is still missing. In this work, we propose a feature extraction and clustering framework to automate the discovery of putative interactions between TFs and histone marks. Tens of thousands of histone marks are quantified and transformed into a lower-dimensional space to extract the most significant characteristics of each mark type. TFs that bind preferentially to certain types of histone marks are clustered and analyzed together with the expression levels of the genes closest to those marks to further deduce their functions. We applied this framework to real ENCODE datasets, and the results suggest an unexpected role for a well-known Trithorax Group (TrxG) protein, which exemplifies the usefulness and importance of the proposed framework.
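The notion of TFs binding preferentially to certain mark types can be illustrated with a minimal sketch. This is not the thesis's method: the record does not specify how marks are quantified, reduced, or clustered, so the sketch substitutes a much simpler stand-in (the fraction of a TF's peaks that overlap each mark type, followed by grouping TFs by their strongest mark). All function names, the TF labels `TF_A` and `TF_B`, the choice of mark types, and every coordinate are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) share a base."""
    return a[0] < b[1] and b[0] < a[1]


def preference_profile(tf_peaks, marks_by_type):
    """Fraction of a TF's peaks that overlap at least one region of each mark type."""
    profile = {}
    for mark_type, regions in marks_by_type.items():
        hits = sum(
            1 for peak in tf_peaks if any(overlaps(peak, r) for r in regions)
        )
        profile[mark_type] = hits / len(tf_peaks) if tf_peaks else 0.0
    return profile


def group_by_preferred_mark(peaks_by_tf, marks_by_type):
    """Group TFs by the mark type with the highest overlap fraction.

    Ties go to the alphabetically first mark type.
    """
    groups = defaultdict(list)
    for tf, peaks in sorted(peaks_by_tf.items()):
        profile = preference_profile(peaks, marks_by_type)
        best = max(sorted(profile), key=lambda m: profile[m])
        groups[best].append(tf)
    return dict(groups)


# Made-up coordinates on a single chromosome (assumption, not real data).
marks_by_type = {
    "H3K4me3": [(100, 200), (1000, 1100)],
    "H3K27ac": [(500, 600), (2000, 2100)],
}
peaks_by_tf = {
    "TF_A": [(150, 160), (1050, 1060), (3000, 3010)],
    "TF_B": [(510, 520), (2050, 2060)],
}

groups = group_by_preferred_mark(peaks_by_tf, marks_by_type)
print(groups)
```

With real data, the per-TF profiles would more plausibly feed a dimensionality reduction and clustering step, as the abstract describes, rather than the single-best-mark grouping used here.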
URI: http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070357204
http://hdl.handle.net/11536/139828
Appears in Collections: Thesis