Title: FRKAS: Knowledge Acquisition Using a Fuzzy Rule Base Approach to Insight of DNA-Binding Domains/Proteins
Authors: Huang, Hui-Ling
Chang, Fang-Lin
Ho, Shinn-Jang
Shu, Li-Sun
Huang, Wen-Lin
Ho, Shinn-Ying
Department of Biological Science and Technology
Institute of Bioinformatics and Systems Biology
Keywords: DNA-binding domains;feature selection;fuzzy rules;genetic algorithm;knowledge acquisition;physicochemical properties;support vector machine
Issue Date: 1-Mar-2013
Abstract: Numerous methods for predicting DNA-binding domains/proteins have been proposed, based on identifying informative features and designing effective classifiers. These studies reveal that the DNA-protein binding mechanism is complicated, and that existing accurate predictors, such as the support vector machine (SVM) with position-specific scoring matrices (PSSMs), are regarded as black-box methods that are not easily interpretable for biologists. In this study, we propose an ensemble fuzzy rule base classifier consisting of a set of interpretable fuzzy rule classifiers (iFRCs) that use informative physicochemical properties as features. In designing the iFRCs, feature selection, membership function design, and fuzzy rule base generation are optimized simultaneously using an intelligent genetic algorithm (IGA). IGA maximizes prediction accuracy while minimizing the number of selected features and the number of fuzzy rules, yielding an accurate and concise fuzzy rule base. Benchmark datasets of DNA-binding domains are used to evaluate the proposed ensemble classifier of 30 iFRCs. The iFRCs have a mean test accuracy of 77.46%, and the ensemble classifier has a test accuracy of 83.33%, compared with 82.81% for the SVM with PSSMs. The two physicochemical properties ranked highest by contribution are positive charge and van der Waals volume. Charge complementarity between protein and DNA is thought to be important in the first step of recognition between protein and DNA. The amino acid residues of binding peptides have larger van der Waals volumes and positive charges than those of non-binding ones. The proposed knowledge acquisition method, which establishes a fuzzy rule-based classifier, can also be applied to predict and analyze other protein functions from sequences.
URI: http://hdl.handle.net/11536/21753
ISSN: 0929-8665
Journal: PROTEIN AND PEPTIDE LETTERS
Volume: 20
Issue: 3
Begin Page: 299
End Page: 308
Appears in Collections: Journal Articles