標題: 利用剪下-圓形化-線性化-插入的運算排序基因體
Sorting Genomes Using Cut-Circularize-Linearize-and-Paste Operations
作者: 黃亙亘
邱顯泰
盧錦隆
Chiu, Hsien-Tai
Lu, Chin Lung
生物資訊及系統生物研究所
關鍵字: 生物資訊;基因體重組;演算法;Bioinformatics;Genome Rearrangement;Algorithm
公開日期: 2010
摘要: 在演化的過程中,基因體可能會經歷一些被稱為基因體重組的大規模突變。隨著愈來愈多的基因體被完整地定序,基於基因次序分析的基因體重組研究在演化樹的重建上扮演著重要的角色。在過去,有許多傳統的重組運算已被提出以評估兩個相關基因體基因次序的演化距離,例如反轉、移位、區塊互換、易位、分裂與融合。通常基因體重組的計算研究被定義成一個利用重組運算來排序一個排列的問題。在這個論文中,我們介紹一個利用剪下-圓形化-線性化-插入 (簡稱CCLP)的運算來進行排序的問題,目的是要找出最少次數的CCLP運算來排序一個表示一條染色體的有號整數排列。CCLP運算是一種基因體重組的運算,它會把染色體的某一個片段剪下,然後再把剪下的片段圓形化成一個暫時性的環狀染色體,接著再把這個環狀染色體線性化為一條線性染色體,最後再把線性化的染色體貼回到原來的染色體中,但在線性染色體被貼回去之前允許它被反轉。CCLP運算可以模擬一些上述為人所熟知的重組,例如反轉、移位與區塊互換,以及其他未在生物文獻中被報導的重組。為了與反轉作區別,我們把其它的CCLP運算稱為非反轉的CCLP運算。最後,當反轉與非反轉CCLP運算的權重比為1:2時,我們提出一個時間複雜度為 O(δn) 的演算法來解決有加權重的CCLP運算排序問題,其中 n 為一條染色體內的基因個數,而 δ 為排序過程中所需的CCLP運算次數。
During evolution, genomes may undergo large-scale mutations called as genome rearrangements. With more and more genomes have been sequenced completely, genome rearrangement studies based on genome-wide analysis of gene orders play an important role in reconstructing phylogenetic trees. In the past, a variety of traditional rearrangement operations, such as reversals, transpositions, block-interchanges, translocations, fissions and fusions, have been proposed to evaluate the evolution distance between two related genomes in gene order. Usually, the computational studies of genome rearrangements are formulated as a problem of sorting a permutation by rearrangement operations. In this thesis, we introduce a problem, called as a sorting problem by cut-circularize-linearize-and-paste (CCLP) operations, which aims to find a minimum number of CCLP operations to sort a signed permutation representing a chromosome. The CCLP operation is a genome rearrangement operation that cuts a segment out of a chromosome, circularizes the segment into a temporary circle, linearizes the temporary circle as a linear segment, and possibly inverts the linearized segment and pastes it into the remaining chromosome. The CCLP operation can model many well-known rearrangements mentioned above, such as reversals, transpositions and block-interchanges, and others not reported in the biological literature. To distinguish those CCLP operations from the reversal, we call them as non-reversal CCLP operations. Finally, we propose an O(δn) time algorithm for solving the weighted sorting problem by CCLP operations when the weight ratio between reversals and non-reversal CCLP operations is 1:2, where n is the number of genes and δ is the number of needed CCLP operations.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079851515
http://hdl.handle.net/11536/48207
顯示於類別:畢業論文


文件中的檔案:

  1. 151501.pdf

若為 zip 檔案,請下載檔案解壓縮後,用瀏覽器開啟資料夾中的 index.html 瀏覽全文。