Full metadata record
DC Field | Value | Language
dc.contributor.author | 李光成 | en_US
dc.contributor.author | Phasit Charoenkwan | en_US
dc.contributor.author | 何信瑩 | en_US
dc.contributor.author | Shinn-Ying Ho | en_US
dc.date.accessioned | 2014-12-12T02:42:22Z | -
dc.date.available | 2014-12-12T02:42:22Z | -
dc.date.issued | 2013 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#GT079855863 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/75086 | -
dc.description.abstract | The massive growth of protein sequence and neuron image datasets creates a need for computational methods to predict and analyse their biological functions. Machine-learning-based classifiers are regularly proposed for predicting protein functions and recognizing neuron images. At present, a desirable predictor of protein functions should provide both prediction efficiency and knowledge discovery. Meanwhile, identifying informative features for recognizing neuron images is difficult because of the large number of available image features. This dissertation develops optimization methodologies for both predicting protein functions and recognizing neuron images based on an intelligent genetic algorithm (IGA). The scoring card method (SCM) is a simple and highly interpretable method for predicting and analysing protein functions. The SCM calculates dipeptide propensity scores for a protein function of interest from the difference in dipeptide composition between positive and negative sequences. The propensity scores of the 400 dipeptides are optimized by IGA to enhance prediction accuracy while conserving the original characteristics of the amino acid composition. A score for each sequence is derived from these propensity scores and used to predict the protein's function. Two SCM-based methods, SCMSOL and SCMCRYS, are proposed for predicting and analysing protein solubility and crystallizability; their test accuracies are 84.3% and 76.1%, respectively, comparable to those of support vector machine based methods using the same dipeptide composition features. Moreover, biological knowledge discovery and mutagenesis analysis for soluble and crystallizable proteins based on the propensity scores are illustrated. The procedure for developing SCM-based methods can also be applied to design other methods for predicting protein functions with high prediction performance and highly interpretable results.
This dissertation also presents an automated neuron image feature identification system (Auto-NIFI), a user-friendly tool that automatically extracts and identifies a small set of informative neuron image features using an inheritable bi-objective combinatorial genetic algorithm (IBCGA). The feature selection of Auto-NIFI allows biologists to construct a suitable classifier for a particular neuron image classification problem. To identify neuron image features, Auto-NIFI provides a comprehensive set of image feature extraction modules together with the IBCGA feature selection modules. Notably, owing to the large collection of image feature extraction modules available in the tool, the system can also be applied to a wide variety of biological image classification problems. Two methods, HCS-Neurons and DescNeuro, are proposed for neuron image classification. In the HCS-Neurons method, the usefulness of Auto-NIFI is demonstrated by identifying phenotypic changes in multi-neuron images in response to drug treatments in high-content screening. The three identified morphological features achieved an independent accuracy of 90.28% for classifying neurons into six classes corresponding to six different nocodazole concentrations. Using Auto-NIFI, DescNeuro can recognize a neuron in the 3D Drosophila neuron database from a 2D image with promising recognition results. | en_US
dc.language.iso | en_US | en_US
dc.subject | 預測蛋白質 | zh_TW
dc.subject | 辨識神經細胞影像 | zh_TW
dc.subject | 特徵擷取 | zh_TW
dc.subject | 特徵選取 | zh_TW
dc.subject | 最佳化 | zh_TW
dc.subject | Protein prediction | en_US
dc.subject | Neuron image classification | en_US
dc.subject | Feature extraction | en_US
dc.subject | Feature selection | en_US
dc.subject | Optimization | en_US
dc.title | 設計最佳化演算法預測蛋白質功能和辨認神經細胞影像 | zh_TW
dc.title | Designing Optimization Methods to Predict Protein Functions and Recognize Neuron Images | en_US
dc.type | Thesis | en_US
dc.contributor.department | 生物資訊及系統生物研究所 (Institute of Bioinformatics and Systems Biology) | zh_TW
Appears in Collections: Thesis
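
The scoring idea summarized in the abstract (dipeptide propensity scores taken from the difference in dipeptide composition between positive and negative sequences, then combined into a per-sequence score) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the function names, the toy sequences, and the use of a composition-weighted sum as the sequence score are hypothetical, and the IGA optimization of the 400 scores is omitted entirely.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 = 400 ordered amino acid pairs.
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
DIPEPTIDE_SET = set(DIPEPTIDES)


def dipeptide_composition(sequences):
    """Relative frequency of each dipeptide over a collection of sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - 1):
            pair = seq[i:i + 2]
            if pair in DIPEPTIDE_SET:  # skip pairs with non-standard residues
                counts[pair] += 1
    total = sum(counts.values())
    return {dp: (counts[dp] / total if total else 0.0) for dp in DIPEPTIDES}


def initial_propensity_scores(positive, negative):
    """Initial scores: positive-set composition minus negative-set composition.

    Assumption: these would only be the starting point; the abstract states
    that the scores are subsequently optimized by IGA, which is not shown.
    """
    pos = dipeptide_composition(positive)
    neg = dipeptide_composition(negative)
    return {dp: pos[dp] - neg[dp] for dp in DIPEPTIDES}


def sequence_score(seq, scores):
    """Composition-weighted sum of propensity scores (illustrative choice)."""
    comp = dipeptide_composition([seq])
    return sum(comp[dp] * scores[dp] for dp in DIPEPTIDES)


# Toy data, purely hypothetical.
toy_positive = ["AAAA", "AAAC"]
toy_negative = ["CCCC", "CCCA"]
toy_scores = initial_propensity_scores(toy_positive, toy_negative)
print(sequence_score("AAAA", toy_scores) > sequence_score("CCCC", toy_scores))
```

In a setting like this, a sequence would be assigned to the positive class when its score exceeds some threshold; how that threshold is chosen is not stated in the abstract and is left open here.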