Title: | Fine-grained protein fold assignment by support vector machines using generalized npeptide coding schemes and jury voting from multiple-parameter sets |
Authors: | Yu, CS; Wang, JY; Yang, JM; Lyu, PC; Lin, CJ; Hwang, JK; Department of Biological Science and Technology |
Keywords: | support vector machines; fine-grained fold prediction; global sequence-coding scheme; n-peptide |
Issue Date: | 1-Mar-2003 |
Abstract: | In the coarse-grained fold assignment of major protein classes, such as all-alpha, all-beta, alpha + beta, and alpha/beta proteins, one can easily achieve high prediction accuracy from primary amino acid sequences. However, the fine-grained assignment of folds, such as those defined in the Structural Classification of Proteins (SCOP) database, presents a challenge due to the larger number of folds available. A recent study yielded a reasonable prediction accuracy of 56.0% on an independent set of the 27 most populated folds. In this communication, we apply the support vector machine (SVM) method, using a combination of protein descriptors based on properties derived from the composition of n-peptides together with jury voting, to fine-grained fold prediction, and are able to achieve an overall prediction accuracy of 69.6% on the same independent set, significantly higher than the previous results. On 10-fold cross-validation, we obtained a prediction accuracy of 65.3%. Our results show that SVM coupled with suitable global sequence-coding schemes can significantly improve fine-grained fold prediction. Our approach should be useful in structure prediction and modeling. (C) 2003 Wiley-Liss, Inc. |
URI: | http://dx.doi.org/10.1002/prot.10313 http://hdl.handle.net/11536/28045 |
ISSN: | 0887-3585 |
DOI: | 10.1002/prot.10313 |
Journal: | PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS |
Volume: | 50 |
Issue: | 4 |
Start Page: | 531 |
End Page: | 536 |
Appears in Collections: | Journal Articles |
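
The abstract names two ingredients, descriptors derived from n-peptide composition and jury voting over several predictors, without giving their details. The following is only a minimal sketch of what such steps might look like, not the authors' coding scheme: the function names `npeptide_composition` and `jury_vote` are hypothetical, the descriptor is assumed to be a plain normalized frequency vector over the 20 standard amino acids, and the jury is assumed to be a simple majority vote over labels produced by classifiers trained with different parameter sets.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def npeptide_composition(sequence, n):
    """Return normalized frequencies of all 20**n n-peptides in `sequence`.

    Assumption: a plain sliding-window frequency vector; the paper's
    generalized coding schemes may differ.
    """
    keys = ["".join(p) for p in product(AMINO_ACIDS, repeat=n)]
    windows = [sequence[i:i + n] for i in range(len(sequence) - n + 1)]
    # Skip windows containing non-standard residue codes.
    counts = Counter(w for w in windows if all(c in AMINO_ACIDS for c in w))
    total = sum(counts.values())
    return [counts[k] / total if total else 0.0 for k in keys]


def jury_vote(predictions):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]
```

In this sketch, a sequence would be encoded once per value of n (for example n = 1 and n = 2), each encoding would feed its own classifier, and `jury_vote` would combine the resulting fold labels into one assignment.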