Title: On the Study of Constructing Genome Trees of Prokaryotes Based on Overlapping Genes
Authors: 姜禮瑋 (Li-Wei Jiang)
盧錦隆 (Chin Lung Lu)
Institute of Bioinformatics and Systems Biology
Keywords: bioinformatics; algorithm; genome tree; ortholog; overlapping gene
Issue Date: 2007
Abstract: As more and more complete genomes become available, inferring phylogenetic trees by comparing whole genomes can help reconstruct the evolutionary relationships among species. In addition to sequence-based phylogenomic approaches, whole-genome methods, such as those based on gene content and gene order, can be used to construct more precise and robust phylogenetic trees. However, it has been reported in the literature that genome trees constructed from gene content or gene order alone may not be suitable for microbial genomes. To address this problem, Luo et al. [6, 7] recently proposed an alternative way to reconstruct genome trees of bacteria, using a measure based on the presence and absence of overlapping genes. Overlapping genes (OGs) are defined as adjacent genes whose coding sequences overlap partially or entirely. OGs are ubiquitous in microbial genomes and more conserved between species than non-overlapping genes, which implies that OGs can serve as better phylogenetic characters than non-overlapping genes for reconstructing the evolutionary relationships among microbial genomes. During evolution, however, genomes are subject to rearrangements that alter the order and orientation of their genes, so the order of orthologous genes, and hence the order of orthologous OG pairs, may not be conserved even between two closely related species. This suggests that not only OG content but also orthologous OG order should be considered when reconstructing the genome trees of prokaryotic species. Therefore, in this thesis, we define a new distance measure between two genomes, called the overlapping-gene (OG) distance, based on a combination of the OG content and OG order of their whole genomes. We then use UPGMA, NJ (neighbor-joining), and FM (Fitch-Margoliash) to build the genome tree of a set of prokaryotic genomes from their pairwise OG distances.
Based on the method described above, we have implemented a web-based tool, called OGtree, that constructs genome trees of prokaryotes from the OG distances between their complete genomes. In addition, we have tested OGtree on several complete Proteobacteria genomes to assess the quality of its genome tree reconstruction. Compared with the phylogenetic trees produced by Luo et al. [6, 7], the genome trees constructed by OGtree are quite consistent with the reference trees reconstructed from 16S rRNAs and from concatenations of multiple proteins. These results suggest that OGtree can serve as a useful tool for constructing more precise and robust genome trees of prokaryotic genomes.
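The tree-building step mentioned in the abstract (a pairwise distance matrix fed to UPGMA) can be illustrated with a minimal sketch. It is not the OGtree implementation: the abstract does not give the formula for the OG distance, so the function below simply takes precomputed pairwise distances, and the genome labels and distance values in the usage note are made up for illustration. NJ and FM are not shown.

```python
from itertools import combinations


def upgma(names, dist):
    """Build a UPGMA tree and return it as a Newick string.

    names: list of genome labels (hypothetical in this sketch).
    dist:  dict mapping frozenset({a, b}) -> distance for every pair of labels.
           In the thesis this would be the OG distance; here it is just a number.
    """
    # Each active cluster id maps to (newick_subtree, cluster_size, node_height).
    clusters = {i: (name, 1, 0.0) for i, name in enumerate(names)}
    # Working distances keyed by pairs of cluster ids.
    d = {
        frozenset((i, j)): dist[frozenset((names[i], names[j]))]
        for i, j in combinations(range(len(names)), 2)
    }
    next_id = len(names)

    while len(clusters) > 1:
        # Merge the closest pair of active clusters.
        pair = min(d, key=d.get)
        a, b = sorted(pair)
        (sub_a, size_a, h_a), (sub_b, size_b, h_b) = clusters[a], clusters[b]
        height = d[pair] / 2.0  # height of the new internal node

        # Distance from the merged cluster to every other cluster is the
        # size-weighted average of the two old distances.
        for c in clusters:
            if c in (a, b):
                continue
            d_ac = d.pop(frozenset((a, c)))
            d_bc = d.pop(frozenset((b, c)))
            d[frozenset((next_id, c))] = (
                size_a * d_ac + size_b * d_bc
            ) / (size_a + size_b)
        del d[pair]

        # Branch length = parent height minus child height.
        newick = f"({sub_a}:{height - h_a:.3f},{sub_b}:{height - h_b:.3f})"
        del clusters[a], clusters[b]
        clusters[next_id] = (newick, size_a + size_b, height)
        next_id += 1

    tree, _, _ = next(iter(clusters.values()))
    return tree + ";"


# Hypothetical pairwise distances between four genomes A, B, C, D.
example_names = ["A", "B", "C", "D"]
example_dist = {
    frozenset(("A", "B")): 0.2,
    frozenset(("A", "C")): 0.6,
    frozenset(("A", "D")): 0.8,
    frozenset(("B", "C")): 0.6,
    frozenset(("B", "D")): 0.8,
    frozenset(("C", "D")): 0.4,
}
print(upgma(example_names, example_dist))
```

With these made-up distances, A and B are joined first, then C and D, and the two pairs are joined last. UPGMA assumes a roughly constant evolutionary rate across lineages; NJ and FM relax that assumption, which is presumably why the thesis uses all three.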
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009551509
http://hdl.handle.net/11536/39433
Appears in Collections: Thesis


Files in This Item:

  1. 150901.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.