Full metadata record
DC Field | Value | Language
dc.contributor.author | 蕭瓊柏 | zh_TW
dc.contributor.author | 洪瑞鴻 | zh_TW
dc.contributor.author | Hsiao, Chiung-Po | en_US
dc.contributor.author | Hung, Jui-Hung | en_US
dc.date.accessioned | 2018-01-24T07:38:03Z | -
dc.date.available | 2018-01-24T07:38:03Z | -
dc.date.issued | 2016 | en_US
dc.identifier.uri | http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070357201 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/139477 | -
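dc.description.abstract | RNA-binding proteins (RBPs) play important roles in living organisms: post-transcriptional regulation of RNAs depends on their assistance. In recent years, CLIP-Seq (cross-linking immunoprecipitation high-throughput sequencing) has been developed to study the relationship between RBPs and RNAs. CLIP-Seq irradiates cells with ultraviolet light to strengthen the cross-linking between RNAs and RBPs, captures the RBPs by immunoprecipitation (IP), and finally extracts the RNAs bound to the RBPs for high-throughput sequencing. When the captured RBP is Argonaute (AGO), AGO forms the RNA-induced silencing complex (RISC) with microRNAs (miRNAs), so the experiment yields not only the RNA sequences but also many miRNAs. Several kinds of CLIP-Seq experiments now exist: HIT-CLIP, PAR-CLIP, and iCLIP. What is currently lacking is a general analysis framework that finds the binding sites between RBPs and RNAs, supports prediction of miRNA-RNA binding relationships, and supports all existing kinds of CLIP-Seq techniques. In this thesis, we propose a CLIP analysis pipeline built around K-means-pHMM; it is highly general and can analyze next-generation sequencing data from HIT-CLIP, PAR-CLIP, and iCLIP. Simulation tests show that our unsupervised machine learning algorithm converges quickly. We also collected multiple CLIP-Seq datasets from NCBI, reanalyzed them, and observed molecular biological phenomena consistent with previous studies. | zh_TW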
dc.description.abstract | RNAs are regulated by RNA-binding proteins (RBPs) that bind to single- or double-stranded RNAs in cells. RBPs bind RNAs, function as ribonucleoprotein complexes, and are involved in splicing (e.g., U1 snRNP), RNA editing (e.g., ADAR), polyadenylation (e.g., CPSF), mRNA localization (e.g., ZBP1), post-transcriptional regulation (e.g., miRNA-RISC), etc. To understand the relationship between RBPs and RNAs, cross-linking immunoprecipitation followed by next-generation sequencing (CLIP-Seq) was developed. There are currently three major variants of CLIP-Seq: HIT-CLIP, PAR-CLIP, and iCLIP. Many algorithms have been proposed to define the binding sites; nevertheless, each can be applied to just one or a few CLIP-Seq variants, and the results are hard to integrate and compare. In this work, we propose a universal algorithm, GLIP, that can be applied to all three CLIP-Seq variants with strong performance and efficiency. | en_US
dc.language.iso | zh_TW | en_US
dc.subject | 高通量定序 | zh_TW
dc.subject | CLIP-Seq | zh_TW
dc.subject | 核糖核酸結合蛋白 | zh_TW
dc.subject | 小分子核糖核酸 | zh_TW
dc.subject | 非監督式機器學習 | zh_TW
dc.subject | Profile 隱藏馬可夫模型 | zh_TW
dc.subject | NGS | en_US
dc.subject | CLIP-Seq | en_US
dc.subject | HIT-CLIP | en_US
dc.subject | PAR-CLIP | en_US
dc.subject | iCLIP | en_US
dc.subject | RNA-binding protein | en_US
dc.subject | Profile HMM | en_US
dc.subject | Machine learning | en_US
dc.title | 開發基於K-means-pHMM 機器學習演算法之交聯免疫沈澱法高通量定序的泛用分析框架 | zh_TW
dc.title | A General CLIP-Seq data analysis framework based on a K-means-pHMM learning and clustering algorithm | en_US
dc.type | Thesis | en_US
dc.contributor.department | 生物資訊及系統生物研究所 | zh_TW
Appears in Collections: Thesis