Title: On the Relationship among Protein Structure, Dynamics and Evolution
Author: 黃存操
Huang, Tsun-Tsao
黃鎮剛
Echave, Julián
Hwang, Jenn-Kang
Institute of Bioinformatics and Systems Biology
Keywords: Protein evolution; Site-specific evolutionary rate; Local packing density; Mean square fluctuation; Weighted contact number; Mean local mutational stress; Elastic network model; Flexibility
Issue Date: 2013
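Abstract: The function and structure of a protein constrain its rate of evolution. Within a protein sequence, each residue is subject to a different evolutionary pressure, so the site-specific evolutionary rates also differ. Previous studies have shown that both protein structure and dynamics are important factors affecting evolutionary rate. Earlier work from our laboratory indicated that the most important structural factor affecting the evolutionary rate of each residue is its local packing density, which is negatively correlated with evolutionary rate: residues with higher packing density evolve more slowly, and residues with lower packing density evolve faster. On the other hand, studies have shown that the dynamical factor affecting each residue's evolutionary rate is the mean square fluctuation of its Cα atom: more rigid residues evolve more slowly, and more flexible residues evolve faster. To determine whether packing density or mean square fluctuation has the greater influence on evolutionary rate, the first part of this thesis compares how strongly each is associated with evolutionary rate. The results show that packing density has a better linear correlation with evolutionary rate than mean square fluctuation does. Because previous studies have established that mean square fluctuation is roughly inversely proportional to packing density, we use partial correlation analysis to determine which of the two is directly related to evolutionary rate. When packing density is controlled, mean square fluctuation and evolutionary rate become almost uncorrelated; conversely, when mean square fluctuation is controlled, packing density remains negatively correlated with evolutionary rate. This result shows that what actually affects evolutionary rate is packing density rather than mean square fluctuation. We therefore infer that structure, rather than protein dynamics, is the decisive factor governing the evolutionary rate of each residue in the sequence.
Past associations between evolutionary rate and structural or dynamical factors were all obtained inductively by analyzing experimental data, and the interpretations proposed so far provide no explicit mechanism explaining why the structure or dynamics of each residue should be related to its evolutionary rate. The second part of this thesis proposes a model that describes these associations directly by deduction. The model rests on the premise that a functional protein must have a specific structure. It assumes that mutations are random perturbations of the protein's potential energy landscape and uses basic concepts of statistical physics to derive the expected probability that a protein mutated at a given site is accepted by natural selection. From this model we derive that a residue's evolutionary rate is linearly and negatively correlated with packing density, and positively but nonlinearly correlated with mean square fluctuation. The model provides a framework for studying protein evolution in the future through the analysis of structural, dynamical and sequence information.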
The evolution of proteins is subject to functional and structural constraints. Within a protein sequence, the evolutionary rates of residues (site-specific evolutionary rates) vary because the selection pressures on sites differ. Previous studies have shown that both protein structure and dynamics are important factors influencing site-specific evolutionary rates. Among structural factors, our colleagues showed that the major structural determinant of a residue's evolutionary rate is its local packing density (LPD), and that the inverse of LPD is generally positively correlated with the site-specific evolutionary rate. Because residues with high LPD tend to be evolutionarily conserved, a structure-based interpretation holds that protein structure is the major determinant of site-specific evolutionary rate. On the other hand, several studies have shown that mean square fluctuation (MSF), a dynamical property of residues, is positively correlated with a residue's evolutionary rate. Because rigid residues tend to be evolutionarily conserved while flexible residues tend to be variable, a flexibility-based interpretation holds that protein dynamics is the major determinant of evolutionary rate. To elucidate which factor is the major determinant, we compare the strengths of these associations in the first part of this dissertation. Our results show that LPD has a better linear correlation with site-specific evolutionary rate than MSF does. However, because MSF is generally inversely proportional to LPD, a relevant question is which of the two, LPD or MSF, is directly related to evolutionary rate. Our partial correlation analyses show that when LPD is controlled, very little correlation remains between evolutionary rate and MSF; in contrast, when MSF is controlled, LPD is still negatively correlated with evolutionary rate. These results imply that the factor actually causing the variation in evolutionary rate is LPD. We therefore infer that protein structure, not protein dynamics, is the actual determinant of the site-specific evolutionary rate.
The main drawback of both the flexibility-based and the structure-based interpretations is that neither proposes a mechanism. To address this, a mechanistic model is proposed in the second part of this dissertation. We model mutations as random perturbations of the parameters of the protein's potential energy landscape, and model natural selection as a function of the probability that a mutant adopts a specific active conformation. Using basic statistical physics and certain simplifying assumptions, we derive the expected probability that a mutant samples a specific functional structure. According to this model, a site's evolutionary rate depends linearly and negatively on LPD, and positively but nonlinearly on MSF. This model provides a possible unifying framework for studying evolutionary protein divergence at the levels of sequence, structure, and dynamics.
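The partial correlation comparison described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic (not the dissertation's dataset), the helper names `pearson` and `partial_corr` are hypothetical, and the standard first-order partial correlation formula stands in for whatever procedure the author actually used. The synthetic rate depends only on LPD, while MSF is roughly inversely proportional to LPD, which mimics the situation the abstract describes.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z.

    Uses r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)).
    """
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return float((r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2)))


# Synthetic, hypothetical per-residue data (NOT taken from the dissertation).
rng = np.random.default_rng(0)
n_sites = 2000
lpd = rng.uniform(1.0, 3.0, n_sites)                # local packing density
msf = 1.0 / lpd + rng.normal(0.0, 0.05, n_sites)    # MSF roughly ~ 1 / LPD
rate = -lpd + rng.normal(0.0, 0.3, n_sites)         # rate driven by LPD only

# Correlation of rate with MSF once LPD is controlled, and vice versa.
r_msf_given_lpd = partial_corr(rate, msf, lpd)
r_lpd_given_msf = partial_corr(rate, lpd, msf)

print("rate vs MSF, controlling LPD:", r_msf_given_lpd)
print("rate vs LPD, controlling MSF:", r_lpd_given_msf)
```

Under these assumptions the first value should be close to zero and the second clearly negative, which is the qualitative pattern the abstract reports. The sketch says nothing about the magnitudes found in real proteins.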
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079551806
http://hdl.handle.net/11536/74766
Appears in Collections: Thesis