Title: Inferring Domain Annotated Protein-Protein Interactions through 3D-Domain Interologs
Original title: 結構功能區域交互同源性為基之蛋白質功能區域及交互作用預測
Authors: 陳永強
楊進木
Jinn-Moon Yang
Institute of Bioinformatics and Systems Biology
Keywords: protein-protein interactions; domain-domain interactions; gene-expression profiles; pairPSSM; evolutionary profiles; genome-scale predictions
Publication date: 2005
Abstract: Protein-protein interactions are central to most biological processes. In the post-genomic era, the ability to identify protein-protein interactions on a genomic scale is essential for determining protein interaction networks. To predict protein interactions on a large scale, Lu et al. proposed "interologs mapping", which uses computational comparative genomics to transfer protein-protein interaction annotations from one organism to another that lacks experimental annotation. Underlying protein interactions, however, are specific protein domains that bind physically to one another to perform their functions. With the growing number of solved structures of protein complexes, the time is ripe to test putative interactions against complexes of known 3D structure. In this study, we propose a new concept, "3D-domain interologs mapping", to infer domain-annotated protein interactions. It is defined as follows: if domain a (in chain A) interacts with domain b (in chain B) in a known 3D complex, then proteins A' (containing domain a) and B' (containing domain b) from the same species are likely to interact with each other, provided that A' is homologous to A and B' is homologous to B. The key novelties of our method are fast genome-scale prediction across hundreds of organisms and the construction of a pair Position Specific Scoring Matrix (pairPSSM). This matrix uses evolutionary profiles to provide the statistical significance of residue pairs at each contact position, leading to a more sensitive scoring system. Our method distinguishes true protein complexes from biologically unreasonable protein pairs with about 90% accuracy. We also evaluated our method on the yeast proteome, where it improves prediction accuracy by about 10% over previous methods. In S. cerevisiae, the mean correlation of gene-expression profiles for our predicted interactions is significantly higher than that for non-interacting protein pairs. Finally, we applied our method to seven organisms commonly used in molecular research: Homo sapiens, Mus musculus, Rattus norvegicus, Caenorhabditis elegans, Drosophila melanogaster, Saccharomyces cerevisiae and Escherichia coli. In these seven organisms, our method predicts about 450,000 new protein-protein interactions, each annotated with its interacting domains and contacting residue pairs. In conclusion, this study suggests that 3D-domain interologs mapping and pairPSSM are useful for predicting protein-protein interactions and for the detailed analysis of protein interaction networks.
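The definition of 3D-domain interologs mapping given in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function name, the data layout, and all identifiers in the example are hypothetical, homology is assumed to have been decided beforehand, and the pairPSSM scoring step is omitted entirely.

```python
from itertools import product


def map_3d_domain_interologs(template, homologs, species):
    """Infer candidate interacting pairs from one structural template.

    template: (chain_A, domain_a, chain_B, domain_b) taken from a known 3D complex.
    homologs: dict mapping a template chain to a list of
              (protein_id, set_of_domains) already judged homologous to it.
    species:  dict mapping protein_id to a species name.
    All three structures are illustrative assumptions.
    """
    chain_a, dom_a, chain_b, dom_b = template
    # Keep only homologs that actually carry the interacting domain.
    cands_a = [p for p, doms in homologs.get(chain_a, []) if dom_a in doms]
    cands_b = [p for p, doms in homologs.get(chain_b, []) if dom_b in doms]
    pairs = set()
    for pa, pb in product(cands_a, cands_b):
        # The definition requires A' and B' to come from the same species.
        if species[pa] == species[pb]:
            pairs.add((pa, pb, dom_a, dom_b))
    return pairs


# Hypothetical example data.
template = ("A", "a", "B", "b")
homologs = {
    "A": [("yeast_P1", {"a"}), ("human_P1", {"a"}), ("yeast_P3", {"x"})],
    "B": [("yeast_P2", {"b"}), ("human_P2", {"b", "c"})],
}
species = {
    "yeast_P1": "S. cerevisiae", "yeast_P2": "S. cerevisiae",
    "yeast_P3": "S. cerevisiae",
    "human_P1": "H. sapiens", "human_P2": "H. sapiens",
}
predicted = map_3d_domain_interologs(template, homologs, species)
```

In this toy input, yeast_P3 is dropped because it lacks domain a, and cross-species pairs are never emitted, so each predicted pair keeps the domain annotation of its template.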
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009351507
http://hdl.handle.net/11536/79860
Appears in collections: Theses


Files in this item:

  1. 150701.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.