Title: Function assignment of RNA structures
Authors: 陳昆澤
盧錦隆
Institute of Bioinformatics and Systems Biology
Keywords: Algorithm; Bioinformatics; RNA tertiary structure; Structural alignment; RNA function assignment
Issue date: 2009
Abstract: In recent years, there has been fast-growing interest in noncoding RNAs (ncRNAs), whose transcripts are not translated into proteins, because they play essential roles in many cellular processes, such as gene regulation, RNA modification and chromosome replication. Typically, the structures of RNA molecules are more evolutionarily conserved than their sequences, so analyzing RNAs at the structure level can help biologists understand their functions. However, detecting structural similarities between two RNA molecules at the tertiary structure level is a difficult task, because it has been shown to be an NP-hard problem. Previously, our laboratory used a heuristic approach to develop a tool called iPARTS, which allows biologists to compare two RNA tertiary structures quickly and accurately. The basic idea of this heuristic approach is as follows. First, we derived a Ramachandran-like diagram of RNAs by plotting the pseudo-torsion angles η and θ of their nucleotides on a two-dimensional (2D) plane. Next, we applied the affinity propagation clustering algorithm to this η-θ plot to obtain a structural alphabet (SA) of 23 nucleotide conformations. Finally, we used this SA to encode RNA three-dimensional (3D) structures into one-dimensional (1D) sequences of SA letters and then applied traditional sequence alignment algorithms to these 1D SA-encoded sequences to determine the structural similarity between two given RNA 3D structures. In this study, we have further equipped iPARTS with a new function that helps biologists accurately determine the function of a given RNA 3D structure. For this purpose, we first use the above SA to encode the query and all the RNA 3D structures with known functions in a pre-prepared database into 1D SA-encoded sequences, then use iPARTS to compute the global structural similarity between the query RNA and each of the RNAs with known functions in the database, and finally assign the annotated function of the most structurally similar RNA to the query RNA 3D structure. Our experimental results show that iPARTS is superior to SARA, a similar tool that uses the so-called unit-vector approach to align two RNA 3D structures, when assigning functions to RNA 3D structures.
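The pipeline described in the abstract (encode by nearest cluster, align globally, assign the function of the best match) can be illustrated with a minimal Python sketch. This is not the iPARTS implementation. The three-letter alphabet and its cluster centres, the match/mismatch/gap scores, and the function labels are all hypothetical placeholders; the real alphabet has 23 letters, and the scoring scheme used by iPARTS is not given in this record.

```python
from math import hypot

# Hypothetical toy alphabet: letter -> (eta, theta) cluster centre in degrees.
# The alphabet in the abstract has 23 letters; these three centres are made up.
TOY_CENTRES = {"A": (170.0, 210.0), "B": (60.0, 180.0), "C": (300.0, 90.0)}


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def encode(angles, centres=TOY_CENTRES):
    """Map each (eta, theta) pair to the letter of its nearest cluster centre."""
    def nearest(eta, theta):
        return min(
            centres,
            key=lambda k: hypot(angle_diff(eta, centres[k][0]),
                                angle_diff(theta, centres[k][1])),
        )
    return "".join(nearest(eta, theta) for eta, theta in angles)


def global_score(s, t, match=2, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The scores are assumed values for illustration only.
    """
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        cur = [i * gap]
        for j in range(1, len(t) + 1):
            sub = match if s[i - 1] == t[j - 1] else mismatch
            cur.append(max(prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def assign_function(query, database):
    """Return (function, score) of the database entry most similar to query.

    database is a list of (sa_sequence, function_label) pairs.
    """
    scored = [(global_score(query, seq), label) for seq, label in database]
    best_score, best_label = max(scored, key=lambda pair: pair[0])
    return best_label, best_score
```

For example, `assign_function(encode(query_angles), database)` returns the label of the highest-scoring database entry, where `query_angles` is a list of (η, θ) pairs in degrees. A linear gap penalty and identity scoring were chosen only to keep the sketch short.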
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079751502
http://hdl.handle.net/11536/45810
Appears in collections: Thesis


Files in this item:

  1. 150201.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.