Title: Computational Identification of Protein Acetylation Sites (利用計算方法識別蛋白質之乙醯基化位置)
Authors: Po-Chiang Hsu (許伯瑲)
Hsien-Da Huang (黃憲達)
Institute of Molecular Medicine and Bioengineering
Keywords: Acetylation; Support Vector Machine; SVM
Publication date: 2007
Abstract: Protein acetylation is an important and reversible post-translational modification that affects essential biological processes, including enzymatic activity and stability, protein-protein interaction, DNA binding, DNA repair, transcription regulation, apoptosis, cytokine signaling, and nuclear import. Experimental identification of acetylation sites, however, is time-consuming and labor-intensive. To identify acetylation sites efficiently and support further analysis, we investigate the substrate specificity of acetylated sites and propose a method, N-Ace, for identifying acetylation sites on alanine (Ala), glycine (Gly), lysine (Lys), methionine (Met), serine (Ser), and threonine (Thr). Support vector machines (SVMs) are used to train computational models on protein sequences with known acetylation sites, using the amino acid sequence, structural characteristics, and physicochemical properties surrounding the sites as features. K-fold cross-validation indicates that structural features, such as accessible surface area (ASA), and physicochemical properties, such as absolute entropy, non-bonded energy, size, amino acid composition, steric parameter, hydrophobicity, volume, mean polarity, electric charge, heat capacity, and isoelectric point, are related to substrate site specificity. The predictive accuracies for acetylated alanine, glycine, lysine, methionine, serine, and threonine are 84%, 85%, 76%, 94%, 81%, and 77%, respectively. Finally, the models with the highest accuracy are integrated into a web-based prediction tool.
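The abstract does not state how the residues surrounding a candidate site are turned into SVM inputs, nor the value of K used in cross-validation. As a rough illustration only, the sketch below assumes a fixed-length sequence window centred on each candidate residue, a one-hot (binary) encoding of the 20 standard amino acids, and a simple interleaved K-fold split. The function names, the window half-width `flank`, and the padding symbol are hypothetical and not taken from the thesis; training the SVM itself would be handled by a separate SVM library and is not shown.

```python
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "-"  # hypothetical padding symbol for positions beyond the sequence ends


def extract_windows(sequence: str, target: str, flank: int = 6) -> List[Tuple[int, str]]:
    """Return (1-based position, window) for every occurrence of `target`.

    Each window has length 2 * flank + 1 and is centred on the candidate residue.
    """
    padded = PAD * flank + sequence + PAD * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue == target:
            windows.append((i + 1, padded[i : i + 2 * flank + 1]))
    return windows


def one_hot(window: str) -> List[int]:
    """Encode a window as 20 binary values per position (padding encodes as all zeros)."""
    vector: List[int] = []
    for residue in window:
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split sample indices into k interleaved folds; return (train, held_out) pairs."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n_samples) if i not in held]
        splits.append((train, held_out))
    return splits


if __name__ == "__main__":
    # Toy sequence, not real data: list every lysine (K) with its window.
    for position, window in extract_windows("MKAKS", "K", flank=2):
        print(position, window, sum(one_hot(window)))
```

In a setup like this, each encoded window would be labelled as acetylated or non-acetylated, and the additional structural and physicochemical features named in the abstract would be appended to the same feature vector before training.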
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009529507
http://hdl.handle.net/11536/39050
Appears in Collections: Thesis


Files in This Item:

  1. 950701.pdf
