Accurate prediction of enzyme subfamily class using an adaptive fuzzy k-nearest neighbor method

doi:10.1016/j.biosystems.2006.10.004

Full metadata record

DC Field	Value	Language
dc.contributor.author	Huang, Wen-Lin	en_US
dc.contributor.author	Chen, Hung-Ming	en_US
dc.contributor.author	Hwang, Shiow-Fen	en_US
dc.contributor.author	Ho, Shinn-Ying	en_US
dc.date.accessioned	2014-12-08T15:13:22Z	-
dc.date.available	2014-12-08T15:13:22Z	-
dc.date.issued	2007-09-01	en_US
dc.identifier.issn	0303-2647	en_US
dc.identifier.uri	http://dx.doi.org/10.1016/j.biosystems.2006.10.004	en_US
dc.identifier.uri	http://hdl.handle.net/11536/10347	-
dc.description.abstract	Amphiphilic pseudo-amino acid composition (Am-Pse-AAC) with extra sequence-order information is a useful feature for representing enzymes. This study first utilizes the k-nearest neighbor (k-NN) rule to analyze the distribution of enzymes in the Am-Pse-AAC feature space. This analysis indicates that the distributions of multiple classes of enzymes are highly overlapped. To cope with the overlap problem, this study proposes an efficient non-parametric classifier for predicting enzyme subfamily class using an adaptive fuzzy k-nearest neighbor (AFK-NN) method, where k and a fuzzy strength parameter m are adaptively specified. The fuzzy membership values of a query sample Q are dynamically determined according to the position of Q and its weighted distances to the k nearest neighbors. Using the same enzymes of the oxidoreductases family for comparisons, the prediction accuracy of AFK-NN is 76.6%, which is better than those of Support Vector Machine (73.6%), the decision tree method C5.0 (75.4%) and the existing covariant-discriminant algorithm (70.6%) using a jackknife test. To evaluate the generalization ability of AFK-NN, the datasets for all six families of entirely sequenced enzymes are established from the newly updated SWISS-PROT and ENZYME databases. The accuracy of AFK-NN on the new large-scale dataset of the oxidoreductases family is 83.3%, and the mean accuracy of the six families is 92.1%. (c) 2006 Elsevier Ireland Ltd. All rights reserved.	en_US
dc.language.iso	en_US	en_US
dc.subject	amino acid composition	en_US
dc.subject	enzyme subfamily class prediction	en_US
dc.subject	fuzzy theory	en_US
dc.subject	k-nearest neighbor	en_US
dc.subject	support vector machine	en_US
dc.title	Accurate prediction of enzyme subfamily class using an adaptive fuzzy k-nearest neighbor method	en_US
dc.type	Article	en_US
dc.identifier.doi	10.1016/j.biosystems.2006.10.004	en_US
dc.identifier.journal	BIOSYSTEMS	en_US
dc.citation.volume	90	en_US
dc.citation.issue	2	en_US
dc.citation.spage	405	en_US
dc.citation.epage	413	en_US
dc.contributor.department	生物科技學系	zh_TW
dc.contributor.department	生物資訊及系統生物研究所	zh_TW
dc.contributor.department	Department of Biological Science and Technology	en_US
dc.contributor.department	Institute of Bioinformatics and Systems Biology	en_US
dc.identifier.wosnumber	WOS:000250184500011	-
dc.citation.woscount	15	-
Appears in Collections:	Articles
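
The abstract describes fuzzy membership values computed from a query's weighted distances to its k nearest neighbors, evaluated with a jackknife test. The following is a minimal sketch of that general idea, assuming the classic fuzzy k-NN weighting (inverse distance raised to 2/(m-1)) with fixed k and m; it does not reproduce the paper's adaptive selection of k and m, and it uses toy 2-D points rather than Am-Pse-AAC feature vectors. The function names `fuzzy_knn_memberships` and `jackknife_accuracy` are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def fuzzy_knn_memberships(query, samples, labels, k=3, m=2.0):
    """Return {class: membership} for `query` (assumed classic fuzzy k-NN).

    Assumptions: crisp training labels, Euclidean distance, fixed k and m > 1.
    """
    # Distance from the query to every training sample, keeping the k closest.
    neighbors = sorted(
        ((math.dist(query, x), y) for x, y in zip(samples, labels)),
        key=lambda pair: pair[0],
    )[:k]

    eps = 1e-12  # avoids division by zero when the query equals a sample
    weights = defaultdict(float)
    for dist, label in neighbors:
        # Closer neighbors contribute more; m controls how sharply.
        weights[label] += 1.0 / (dist ** (2.0 / (m - 1.0)) + eps)

    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


def jackknife_accuracy(samples, labels, k=3, m=2.0):
    """Leave-one-out accuracy: each sample is predicted from all the others."""
    correct = 0
    for i in range(len(samples)):
        rest_x = samples[:i] + samples[i + 1:]
        rest_y = labels[:i] + labels[i + 1:]
        mu = fuzzy_knn_memberships(samples[i], rest_x, rest_y, k, m)
        if max(mu, key=mu.get) == labels[i]:
            correct += 1
    return correct / len(samples)


if __name__ == "__main__":
    # Toy data standing in for feature vectors of two subfamily classes.
    toy_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    toy_y = ["A", "A", "A", "B", "B", "B"]
    print(fuzzy_knn_memberships((0.2, 0.2), toy_x, toy_y))
    print(jackknife_accuracy(toy_x, toy_y))
```

In this sketch the predicted class is simply the one with the largest membership; an adaptive variant would additionally search over k and m, which is left out here.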

Files in This Item:

000250184500011.pdf
