Title: | Predicting protein subnuclear localization using GO-amino-acid composition features |
Authors: | Huang, Wen-Lin; Tung, Chun-Wei; Huang, Hui-Ling; Ho, Shinn-Ying; Department of Biological Science and Technology; Institute of Bioinformatics and Systems Biology |
Keywords: | Gene Ontology; Subnuclear localization; Amino acid composition |
Issue Date: | 1-Nov-2009 |
Abstract: | The nucleus guides life processes of cells. Many of the nuclear proteins participating in the life processes tend to concentrate on subnuclear compartments. The subnuclear localization of nuclear proteins is hence important for deeply understanding the construction and functions of the nucleus. Recently, Gene Ontology (GO) annotation has been used for prediction of subnuclear localization. However, the effective use of GO terms in solving sequence-based prediction problems remains challenging, especially when query protein sequences have no accession number or annotated GO term. This study obtains homologies of query proteins with known accession numbers using BLAST to retrieve GO terms for sequence-based subnuclear localization prediction. A prediction method PGAC, which involves mining informative GO terms associated with amino acid composition features, is proposed to design a support vector machine-based classifier. PGAC yields 55 informative GO terms with training and test accuracies of 85.7% and 76.3%, respectively, using a data set SNL-35 (561 proteins in 9 localizations) with 35% sequence identity. Upon comparison with Nuc-PLoc, which combines amphiphilic pseudo amino acid composition of a protein with its position-specific scoring matrix, PGAC using the data set SNL_80 yields a leave-one-out cross-validation accuracy of 81.1%, which is better than that of Nuc-PLoc, 67.4%. Experimental results show that the informative GO terms are effective features for protein subnuclear localization. The prediction server based on PGAC has been implemented at http://iclab.life.nctu.edu.tw/prolocgac. (C) 2009 Elsevier Ireland Ltd. All rights reserved. |
URI: | http://dx.doi.org/10.1016/j.biosystems.2009.06.007 http://hdl.handle.net/11536/6448 |
ISSN: | 0303-2647 |
DOI: | 10.1016/j.biosystems.2009.06.007 |
Journal: | BIOSYSTEMS |
Volume: | 98 |
Issue: | 2 |
Start Page: | 73 |
End Page: | 79 |
Appears in Collections: | Articles |
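
The abstract describes features that pair GO terms with amino acid composition. The following is a minimal sketch of one way such a feature vector could be assembled; it is not the authors' PGAC implementation. The function names, the binary GO encoding, and the GO identifiers in the usage note are assumptions made for illustration, and the BLAST retrieval step, the selection of informative GO terms, and the support vector machine are all omitted.

```python
from collections import Counter

# The 20 standard amino acid one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the 20 residue frequencies of a sequence, in AMINO_ACIDS order.

    Non-standard characters (e.g. 'X') are ignored. An empty or fully
    non-standard sequence yields a vector of zeros.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def go_indicator(protein_go_terms, informative_terms):
    """Return a 0/1 vector marking which informative GO terms annotate the protein.

    Assumption: a simple binary encoding; the paper may encode GO terms differently.
    """
    annotated = set(protein_go_terms)
    return [1.0 if term in annotated else 0.0 for term in informative_terms]


def feature_vector(sequence, protein_go_terms, informative_terms):
    """Concatenate the GO indicator vector with the amino acid composition."""
    return (go_indicator(protein_go_terms, informative_terms)
            + amino_acid_composition(sequence))
```

For example, `feature_vector("ACCA", ["GO:0005730"], ["GO:0005730", "GO:0016607"])` returns a vector of length 22: two GO indicator entries followed by 20 composition entries. The GO identifiers here are placeholders chosen for the example. A vector of this shape could then be passed to any SVM library as one training row.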