Title: A Novel Template-Based Scoring Function for Predicting Protein-Protein Interaction
Authors: Yu-Shu Lo (羅宇書)
Jinn-Moon Yang (楊進木)
Institute of Bioinformatics and Systems Biology
Keywords: protein-protein interaction
Issue Date: 2007
Abstract: Protein-protein interactions play an important role in most biological processes. In the post-genomic era, genome-scale identification of protein-protein interactions is one of the main routes to understanding protein interaction networks. To predict protein-protein interactions on a large scale, Lu et al. (2003) presented “interologs mapping”, which uses comparative genomics to transfer known interactions from an organism with many characterized interactions to one with few. In most cases, however, only a part of a protein (i.e., a domain) makes direct physical contact with its interaction partners. With the growing number of 3D structures of protein complexes, it is now feasible to test putative domain-domain interactions against known 3D complexes.
In this study, we propose a new concept, “3D-domain interologs mapping”, to infer domain-domain interactions, and we develop a new template-based scoring function to apply it. The scoring function rests on two core ideas. First, because template-based scoring functions compute their results from sequence alignments, we provide reliable criteria for deciding whether a sequence alignment is consistent enough with the structure alignment to be scored. Second, the scoring function focuses on the characteristics of the interacting residues on the template (e.g., hydrogen bonds, electrostatic forces, and cross-species conservation). We find that a sequence identity above 0.2 together with an aligned ratio of contact residues above 0.8 are good criteria for achieving reasonable consistency between sequence and structure alignments. Moreover, the scores correlate highly with binding energy: on 275 mutated interface residues selected from ASEdb, the correlation between experimental energy changes and predicted scores is 0.92. Finally, we apply the new concept and scoring function to predict protein-protein interactions in yeast from those known in human. Compared with “generalized interologs mapping”, our method has much higher prediction accuracy.
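The two alignment criteria stated in the abstract (sequence identity above 0.2 and aligned ratio of contact residues above 0.8) can be illustrated with a minimal sketch. This is not the thesis implementation: the `TemplateAlignment` record, its field names, and the definition of identity as identical positions divided by aligned positions are all assumptions made here for illustration. Only the two thresholds come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class TemplateAlignment:
    """Hypothetical summary of one query sequence aligned to a template domain."""
    identical_positions: int       # aligned positions with identical residues
    aligned_positions: int         # total aligned (non-gap) positions
    contact_residues: int          # interface (contact) residues on the template
    aligned_contact_residues: int  # contact residues covered by the alignment


def sequence_identity(a: TemplateAlignment) -> float:
    # Assumed definition: fraction of aligned positions that are identical.
    if a.aligned_positions == 0:
        return 0.0
    return a.identical_positions / a.aligned_positions


def aligned_contact_ratio(a: TemplateAlignment) -> float:
    # Fraction of the template's contact residues that the alignment covers.
    if a.contact_residues == 0:
        return 0.0
    return a.aligned_contact_residues / a.contact_residues


def is_reliable(a: TemplateAlignment,
                min_identity: float = 0.2,
                min_contact_ratio: float = 0.8) -> bool:
    # Thresholds are the values quoted in the abstract; both must be exceeded.
    return (sequence_identity(a) > min_identity
            and aligned_contact_ratio(a) > min_contact_ratio)


if __name__ == "__main__":
    candidate = TemplateAlignment(30, 100, 20, 18)
    print(is_reliable(candidate))
```

In this sketch a candidate alignment would be passed to the scoring function only when `is_reliable` returns true; how the thesis actually computes each quantity may differ.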
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009551506
http://hdl.handle.net/11536/39431
Appears in Collections: Theses