Title: Alignment and Database Search of RNA Tertiary Structures (RNA三級結構的比對與資料庫的搜尋)
Authors: Yang, Chung-Hang (楊忠翰)
Lu, Chin-Lung (盧錦隆)
Lin, Tiao-Yin (林苕吟)
Institute of Bioinformatics and Systems Biology
Keywords: RNA; 3D structure; Structural alignment; Structural alphabet
Issue Date: 2016
Abstract: In recent years, interest in RNAs has grown rapidly, because they not only transfer genetic information from DNA to proteins but also play essential roles in many cellular processes, such as gene regulation, RNA modification and chromosome replication. The functions of most available RNAs, however, are still unknown. As with proteins, a more reliable way to determine the functions of RNAs is to analyze their tertiary structures, because the structures of molecules are typically more evolutionarily conserved than their primary sequences. However, detecting structural similarities between two RNA molecules at the tertiary structure level is difficult, because the problem has been shown to be NP-hard. Previously, our laboratory used a heuristic approach to develop a tool called iPARTS, which allows biologists to quickly and accurately compare two RNA tertiary structures. iPARTS takes a structural alphabet (SA)-based approach: it uses an SA of 23 letters to reduce RNA 3D structures to 1D sequences of SA letters and then applies traditional sequence alignment to these SA-encoded sequences to determine their global or local similarity.
In this study, we first developed a BLAST-like search tool, called R3D-BLAST, based on the same SA-based approach. R3D-BLAST allows the user to quickly and accurately search the PDB for RNA structures that share similar substructures with a specified query RNA structure. The basic idea is that all RNA 3D structures deposited in the PDB are first encoded as 1D structural sequences using a structural alphabet of 23 distinct nucleotide conformations, and BLAST is then applied to these 1D structural sequences to find RNA substructures whose 1D structural sequences are similar to that of the query. Our experimental results show that R3D-BLAST can quickly and accurately search the PDB for RNAs that share similar 3D substructures with a query RNA. Second, we re-implemented iPARTS as a new web server, iPARTS2, by constructing an entirely new SA of 92 elements, each carrying both base and backbone geometry information for a representative nucleotide. This SA differs significantly from the one used in iPARTS, which consists of only 23 elements, each carrying only the backbone geometry information of a representative nucleotide. Our experimental results show that iPARTS2 outperforms its predecessor iPARTS and also achieves better accuracy than other popular tools, such as SARA, SETTER and RASS, in RNA alignment quality and function prediction. R3D-BLAST and iPARTS2 are available online at http://genome.cs.nthu.edu.tw/R3D-BLAST/ and http://genome.cs.nthu.edu.tw/iPARTS2/, respectively.
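The structural alphabet idea described in the abstract (reduce each nucleotide to a letter, then align the letter sequences) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the iPARTS, iPARTS2 or R3D-BLAST implementation: the four-letter alphabet, its representative angle pairs, the use of two angles per nucleotide, and the match/mismatch/gap scores are all hypothetical, and the alignment is a plain Needleman-Wunsch global score rather than the scoring scheme used by those tools.

```python
import math

# Hypothetical 4-letter structural alphabet. Each letter stands for a
# representative pair of backbone angles in degrees. The values are made up;
# the abstract only states that the real alphabets have 23 or 92 elements.
ALPHABET = {
    "A": (170.0, 210.0),
    "B": (60.0, 180.0),
    "C": (300.0, 90.0),
    "D": (180.0, 330.0),
}


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def encode(angles):
    """Map each nucleotide's angle pair to the nearest alphabet letter."""
    letters = []
    for first, second in angles:
        nearest = min(
            ALPHABET,
            key=lambda k: math.hypot(
                angle_diff(first, ALPHABET[k][0]),
                angle_diff(second, ALPHABET[k][1]),
            ),
        )
        letters.append(nearest)
    return "".join(letters)


def global_alignment_score(s, t, match=2, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        curr = [i * gap]
        for j in range(1, len(t) + 1):
            sub = match if s[i - 1] == t[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Two toy "structures", given as per-nucleotide angle pairs.
    query = encode([(172, 205), (58, 185), (295, 95)])
    target = encode([(168, 212), (62, 178), (182, 325)])
    print(query, target, global_alignment_score(query, target))
```

Once structures are reduced to letter strings in this way, any standard sequence tool can in principle be applied to them, which is the property the abstract relies on when it applies BLAST to the encoded PDB structures.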
URI: http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT079851807
http://hdl.handle.net/11536/139042
Appears in Collections: Thesis