Title: | Predicting disulfide connectivity patterns |
Authors: | Lu, Chih-Hao; Chen, Yu-Ching; Yu, Chin-Sheng; Hwang, Jenn-Kang; Institute of Bioinformatics and Systems Biology; Department of Computer Science |
Keywords: | disulfide bond; disulfide connectivity pattern; support vector machine; genetic algorithm; feature selection |
Issue Date: | 1-May-2007 |
Abstract: | Disulfide bonds play an important role in stabilizing protein structure and regulating protein function. The ability to infer disulfide connectivity from protein sequences is therefore valuable in structural modeling and functional analysis. However, predicting disulfide connectivity directly from sequence is a challenge for computational biologists because disulfide bonds are nonlocal: the close spatial proximity of the two cysteines that form a disulfide bond does not imply that they are close together in the sequence. Recently, Chen and Hwang (Proteins 2005;61:507-512) treated this problem as a multiclass classification by defining each distinct disulfide pattern as a class. They used multiple support vector machines based on a variety of sequence features to predict the disulfide patterns. Their results compare favorably with those in the literature for a benchmark dataset sharing less than 30% sequence identity. However, because the number of disulfide patterns grows rapidly as the number of disulfide bonds increases, their method performs unsatisfactorily for proteins with many disulfide bonds. In this work, we propose a novel method that represents disulfide connectivity in terms of cysteine pairs instead of disulfide patterns. Because the number of bonding states of a cysteine pair is independent of the number of disulfide bonds, the problem of class explosion is avoided. The bonding states of the cysteine pairs are predicted using support vector machines, with genetic algorithm optimization for feature selection. The complete disulfide patterns are then determined from the connectivity matrices constructed from the predicted bonding states of the cysteine pairs. Our approach outperforms the current approaches in the literature. |
URI: | http://dx.doi.org/10.1002/prot.21309 http://hdl.handle.net/11536/10819 |
ISSN: | 0887-3585 |
DOI: | 10.1002/prot.21309 |
Journal: | PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS |
Volume: | 67 |
Issue: | 2 |
Start Page: | 262 |
End Page: | 270 |
Appears in Collections: | Journal Articles |
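
The abstract makes two points that a small example can illustrate: the number of disulfide patterns grows rapidly with the number of bonds, and a complete pattern can be derived from per-pair bonding predictions. The Python sketch below is a minimal illustration under stated assumptions. It assumes every cysteine is bonded, so 2n cysteines admit (2n-1)!! pairings, and it selects the pairing with the highest total pair score by exhaustive search. The function names, the `scores` dictionary, and the maximum-sum selection rule are hypothetical choices made for this example; they are not the procedure reported in the paper, and the SVM and genetic algorithm steps are not modeled.

```python
def count_patterns(n_bonds):
    """Number of ways to pair 2*n_bonds cysteines: (2n-1)!! = 1*3*5*...*(2n-1)."""
    count = 1
    for k in range(1, 2 * n_bonds, 2):
        count *= k
    return count


def enumerate_patterns(cysteines):
    """Yield every complete pairing of an even-length list of cysteine indices."""
    if not cysteines:
        yield []
        return
    first, rest = cysteines[0], cysteines[1:]
    # Pair the first cysteine with each remaining one, then recurse on the rest.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in enumerate_patterns(remaining):
            yield [(first, partner)] + tail


def best_pattern(scores):
    """Pick the pairing with the largest summed pair score.

    scores: dict mapping (i, j), i < j, to a hypothetical bonding score
    for every cysteine pair (the upper triangle of a connectivity matrix).
    Exhaustive search is an assumption of this sketch, not the paper's method.
    """
    cysteines = sorted({c for pair in scores for c in pair})
    return max(
        enumerate_patterns(cysteines),
        key=lambda pattern: sum(scores[pair] for pair in pattern),
    )


if __name__ == "__main__":
    # Pattern counts for 1 to 5 bonds show the rapid growth in classes.
    print([count_patterns(n) for n in range(1, 6)])

    # Made-up scores for four cysteines, for illustration only.
    example_scores = {
        (0, 1): 0.2, (0, 2): 0.9, (0, 3): 0.1,
        (1, 2): 0.3, (1, 3): 0.8, (2, 3): 0.4,
    }
    print(best_pattern(example_scores))
```

Each cysteine pair has only two states (bonded or not), whatever the number of bonds, whereas the pattern count computed by `count_patterns` is the number of classes a pattern-level classifier would need to distinguish.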