Title: | Prediction of the bonding states of cysteines using the support vector machines based on multiple feature vectors and cysteine state sequences |
Authors: | Chen, YC; Lin, SC; Lin, CJ; Hwang, JK; Department of Biological Science and Technology; Institute of Bioinformatics and Systems Biology |
Keywords: | support vector machines;disulfide bonds;cysteine state sequences;multiple feature vectors |
Issue Date: | 1-Jun-2004 |
Abstract: | The support vector machine (SVM) method is used to predict the bonding states of cysteines. Besides using local descriptors such as the local sequences, we include global information, such as amino acid compositions and the patterns of the states of cysteines (bonded or nonbonded), or cysteine state sequences, of the proteins. We found that SVM based on local sequences or global amino acid compositions yielded similar prediction accuracies for the data set comprising 4136 cysteine-containing segments extracted from 969 nonhomologous proteins. However, the SVM method based on multiple feature vectors (combining local sequences and global amino acid compositions) significantly improves the prediction accuracy, from 80% to 86%. If coupled with cysteine state sequences, SVM based on multiple feature vectors yields 90% in overall prediction accuracy and a 0.77 Matthews correlation coefficient, around 10% and 22% higher than the corresponding values obtained by SVM based on local sequence information. (C) 2004 Wiley-Liss, Inc. |
URI: | http://dx.doi.org/10.1002/prot.20079 http://hdl.handle.net/11536/26696 |
ISSN: | 0887-3585 |
DOI: | 10.1002/prot.20079 |
Journal: | PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS |
Volume: | 55 |
Issue: | 4 |
Start Page: | 1036 |
End Page: | 1042 |
Appears in Collections: | Articles |
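The abstract describes combining a local-sequence descriptor with a global amino acid composition into one feature vector, and reports a Matthews correlation coefficient. The following is only an illustrative sketch of those two ideas, not the authors' implementation: the one-hot window encoding, the `half_width` parameter, and all function names are assumptions introduced here, and the SVM training step is omitted.

```python
import math

# The 20 standard amino acids, used as the encoding alphabet (an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def local_window_features(seq, pos, half_width=3):
    """One-hot encode the residues in a window centred on seq[pos].

    Window positions that fall outside the sequence are encoded as all zeros.
    """
    features = []
    for i in range(pos - half_width, pos + half_width + 1):
        one_hot = [0.0] * len(AMINO_ACIDS)
        if 0 <= i < len(seq) and seq[i] in AMINO_ACIDS:
            one_hot[AMINO_ACIDS.index(seq[i])] = 1.0
        features.extend(one_hot)
    return features


def composition_features(seq):
    """Global descriptor: fraction of each amino acid over the whole sequence."""
    n = len(seq)
    if n == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [seq.count(a) / n for a in AMINO_ACIDS]


def combined_features(seq, pos, half_width=3):
    """Concatenate the local and global descriptors into one feature vector."""
    return local_window_features(seq, pos, half_width) + composition_features(seq)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when any marginal count is zero
    return (tp * tn - fp * fn) / denom
```

A vector from `combined_features` for each cysteine position could then be passed to any SVM library together with a bonded/nonbonded label; the cysteine state sequence step mentioned in the abstract is not covered by this sketch.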