Title: Reconstructing Phylogenetic Trees of Prokaryotes Based on Weighted Breakpoint Distance (利用加權斷點距離建構原核生物的演化樹)
Authors: Yang, Chung-Han (楊忠翰)
Lu, Chin-Lung (盧錦隆)
Institute of Bioinformatics and Systems Biology
Keywords: Bioinformatics; Genome tree; Prokaryotes; Weighted breakpoint
Issue Date: 2008
Abstract: As complete genome sequences become available for more and more prokaryotes, it becomes possible to infer genome-scale phylogenetic trees (genome trees) by comparing gene orders between prokaryotic genomes. Previous studies have shown that gene-order measures such as the breakpoint distance can be used to reconstruct evolutionary relationships among species. A breakpoint occurs when the gene order of an adjacent gene pair in one genome differs from that of its orthologous gene pair in another genome; the total number of breakpoints between two genomes is their breakpoint distance. The original breakpoint distance assumes that all breakpoints in a genome are equally likely to occur. However, it has been reported in the literature that adjacent gene pairs fall into two classes: fast-rearranging pairs and slow-rearranging pairs. For example, a gene pair within a single operon is more conserved than a pair whose genes belong to different operons. The genes in a slow-rearranging pair are usually close together, whereas the genes in a fast-rearranging pair are usually farther apart. Based on this property, in this study we consider only adjacent gene pairs located on the same strand and divide their breakpoints into two types: short-distance breakpoints, which occur in slow-rearranging pairs, and long-distance breakpoints, which occur in fast-rearranging pairs. Because the two types occur with different probabilities, we define a weighted breakpoint distance that assigns different weights to short- and long-distance breakpoints, and we use it to measure the evolutionary distance between two prokaryotic genomes. We have also implemented a web-based tool, wBPtree, that constructs genome trees of prokaryotes from the weighted breakpoint distances between their complete genomes.
We tested wBPtree on several complete Proteobacteria genomes to assess the quality of its genome tree reconstruction. Compared with the phylogenetic trees produced by the original breakpoint distance, the genome trees constructed by wBPtree are quite consistent with the reference trees, which were reconstructed by Eugeni Belda et al. from a concatenation of multiple protein sequences. These results suggest that wBPtree can serve as a useful tool for constructing more accurate and robust genome trees for prokaryotes.
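The weighted breakpoint distance described in the abstract can be illustrated with a minimal Python sketch. It is not the wBPtree implementation: the `Gene` record, the function names, the intergenic-gap threshold, and the weight values are all hypothetical. The sketch also assumes linear genomes listed in chromosomal order that share the same single-copy ortholog families, and it counts breakpoints in one direction only (adjacent pairs of the first genome checked against the second).

```python
from typing import List, NamedTuple, Set, Tuple


class Gene(NamedTuple):
    family: str  # ortholog family identifier (hypothetical naming)
    strand: int  # +1 for the forward strand, -1 for the reverse strand
    start: int   # start coordinate on the chromosome
    end: int     # end coordinate on the chromosome


def signed(gene: Gene) -> str:
    """Encode a gene as a signed label such as '+a' or '-a'."""
    return ("+" if gene.strand > 0 else "-") + gene.family


def flipped(label: str) -> str:
    """Return the same gene label read from the opposite strand."""
    return ("-" if label[0] == "+" else "+") + label[1:]


def adjacencies(genome: List[Gene]) -> Set[Tuple[str, str]]:
    """Collect every adjacent pair, in both reading directions."""
    pairs: Set[Tuple[str, str]] = set()
    for left, right in zip(genome, genome[1:]):
        x, y = signed(left), signed(right)
        pairs.add((x, y))
        pairs.add((flipped(y), flipped(x)))  # same adjacency, reverse strand
    return pairs


def weighted_breakpoint_distance(
    genome_a: List[Gene],
    genome_b: List[Gene],
    gap_threshold: int,  # assumed cutoff between "short" and "long" pairs
    w_short: float,      # assumed weight of a short-distance breakpoint
    w_long: float,       # assumed weight of a long-distance breakpoint
) -> float:
    """Weighted count of adjacent same-strand pairs of A that are broken in B."""
    conserved = adjacencies(genome_b)
    n_short = n_long = 0
    for left, right in zip(genome_a, genome_a[1:]):
        if left.strand != right.strand:
            continue  # only same-strand adjacent pairs are considered
        if (signed(left), signed(right)) in conserved:
            continue  # adjacency preserved in B, so no breakpoint
        gap = right.start - left.end  # intergenic distance in genome A
        if gap <= gap_threshold:
            n_short += 1
        else:
            n_long += 1
    return w_short * n_short + w_long * n_long


# Toy genomes: genes c and d swap positions in genome_b.
genome_a = [Gene("a", 1, 0, 100), Gene("b", 1, 110, 200),
            Gene("c", 1, 500, 600), Gene("d", 1, 650, 700)]
genome_b = [Gene("a", 1, 0, 100), Gene("b", 1, 110, 200),
            Gene("d", 1, 500, 550), Gene("c", 1, 600, 700)]

print(weighted_breakpoint_distance(genome_a, genome_b, 100, 2.0, 1.0))
```

In this toy case the pair (b, c) is a long-distance breakpoint and the pair (c, d) is a short-distance breakpoint, so the illustrative weights 2.0 and 1.0 give a distance of 3.0. Setting both weights to 1.0 reduces the function to an unweighted breakpoint count over same-strand pairs. A matrix of such pairwise distances could then be passed to a distance-based tree-building method; the abstract does not state which method wBPtree uses.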
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079651510
http://hdl.handle.net/11536/43269
Appears in Collections: Thesis


Files in This Item:

  1. 151001.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.