Full metadata record
DC Field | Value | Language
dc.contributor.author | Huang, CD | en_US
dc.contributor.author | Lin, CT | en_US
dc.contributor.author | Pal, NR | en_US
dc.date.accessioned | 2014-12-08T15:40:04Z | -
dc.date.available | 2014-12-08T15:40:04Z | -
dc.date.issued | 2003-12-01 | en_US
dc.identifier.issn | 1536-1241 | en_US
dc.identifier.uri | http://dx.doi.org/10.1109/TNB.2003.820284 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/27364 | -
dc.description.abstract | The structure classification of proteins plays a very important role in bioinformatics, since the relationships and characteristics among known proteins can be exploited to predict the structure of new proteins. The success of a classification system depends heavily on two things: the tools being used and the features considered. In bioinformatics applications, the role of appropriate features has not received adequate attention. In this investigation, we use three novel ideas for multiclass protein fold classification. First, we use the gating neural network, where each input node is associated with a gate. This network can select important features in an online manner as learning proceeds. At the beginning of the training, all gates are almost closed, i.e., no feature is allowed to enter the network. Through the training, gates corresponding to good features are completely opened while gates corresponding to bad features are closed more tightly, and some gates may be partially open. The second novel idea is to use a hierarchical learning architecture (HLA). The classifier in the first level of HLA classifies the protein features into four major classes: all alpha, all beta, alpha + beta, and alpha/beta. The next level has another set of classifiers, which further classify the protein features into 27 folds. The third novel idea is to induce the indirect coding features from the amino-acid composition sequence of proteins based on the N-gram concept. This provides us with more representative and discriminative new local features of protein sequences for multiclass protein fold classification. The proposed HLA with new indirect coding features increases the protein fold classification accuracy by about 12%. Moreover, the gating neural network is found to reduce the number of features drastically. Using only half of the original features, selected by the gating neural network, achieves test accuracy comparable to that obtained with all the original features. The gating mechanism also helps us to get a better insight into the folding process of proteins. For example, by tracking the evolution of different gates, we can find which characteristics (features) of the data are more important for the folding process. And, of course, it also reduces the computation time. | en_US
dc.language.iso | en_US | en_US
dc.subject | feature extraction | en_US
dc.subject | gating network | en_US
dc.subject | N-gram coding | en_US
dc.subject | protein sequence | en_US
dc.subject | radial basis function network (RBFN) | en_US
dc.subject | Structure Classification of Protein (SCOP) | en_US
dc.subject | support vector machine (SVM) | en_US
dc.title | Hierarchical learning architecture with automatic feature selection for multiclass protein fold classification | en_US
dc.type | Article | en_US
dc.identifier.doi | 10.1109/TNB.2003.820284 | en_US
dc.identifier.journal | IEEE TRANSACTIONS ON NANOBIOSCIENCE | en_US
dc.citation.volume | 2 | en_US
dc.citation.issue | 4 | en_US
dc.citation.spage | 221 | en_US
dc.citation.epage | 232 | en_US
dc.contributor.department | 電控工程研究所 | zh_TW
dc.contributor.department | Institute of Electrical and Control Engineering | en_US
dc.identifier.wosnumber | WOS:000189392900008 | -
dc.citation.woscount | 34 | -
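
Two of the ideas in the abstract, N-gram features and input gates that open during training, can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every detail in it is an assumption made for illustration: plain normalized N-gram frequencies stand in for the paper's indirect coding, a sigmoid stands in for the gate function, a linear model stands in for the neural network, and the names ngram_features and train_gated_linear, the toy data, and all hyperparameters are hypothetical.

```python
import math
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residue codes


def ngram_features(sequence, n=2):
    """Normalized n-gram frequencies over the 20-letter alphabet (20**n values)."""
    vocab = ["".join(p) for p in product(AMINO_ACIDS, repeat=n)]
    grams = (sequence[i:i + n] for i in range(len(sequence) - n + 1))
    # Skip n-grams that contain non-standard residue codes such as 'X'.
    counts = Counter(g for g in grams if all(c in AMINO_ACIDS for c in g))
    total = sum(counts.values())
    return [counts[g] / total if total else 0.0 for g in vocab]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_gated_linear(X, y, steps=2000, lr=0.2, init_gate=-3.0, init_w=0.5):
    """Fit y ~ sum_i w_i * sigmoid(g_i) * x_i by full-batch gradient descent.

    Each input i has its own gate parameter g_i. All gates start almost
    closed (sigmoid(-3) is about 0.05), and the squared-error gradient
    decides which ones open. Returns (gate openings, weights).
    """
    d = len(X[0])
    w = [init_w] * d
    g = [init_gate] * d
    for _ in range(steps):
        grad_w = [0.0] * d
        grad_g = [0.0] * d
        s = [sigmoid(gi) for gi in g]
        for x, target in zip(X, y):
            err = sum(w[i] * s[i] * x[i] for i in range(d)) - target
            for i in range(d):
                grad_w[i] += err * s[i] * x[i] / len(X)
                grad_g[i] += err * w[i] * s[i] * (1.0 - s[i]) * x[i] / len(X)
        w = [w[i] - lr * grad_w[i] for i in range(d)]
        g = [g[i] - lr * grad_g[i] for i in range(d)]
    return [sigmoid(gi) for gi in g], w


# Toy data (hypothetical): the target depends only on the first input.
X_toy = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y_toy = [2.0, -2.0, 2.0, -2.0]
gates, weights = train_gated_linear(X_toy, y_toy)
print("gate openings:", gates)
print("bigram feature count:", len(ngram_features("MKVLAAGIVGL")))
```

On this toy data the gate of the informative first input opens well beyond its starting value, while the gate of the uninformative second input drifts further closed, mirroring the behaviour the abstract describes. Features could then be selected by thresholding the gate openings; the threshold would be a further assumption.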
Appears in Collections: Journal Articles


Files in This Item:

  1. 000189392900008.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.