Designing Optimization Methods to Predict Protein Functions and Recognize Neuron Images

Full metadata record

DC Field	Value	Language
dc.contributor.author	李光成	en_US
dc.contributor.author	Phasit Charoenkwan	en_US
dc.contributor.author	何信瑩	en_US
dc.contributor.author	Shinn-Ying Ho	en_US
dc.date.accessioned	2014-12-12T02:42:22Z	-
dc.date.available	2014-12-12T02:42:22Z	-
dc.date.issued	2013	en_US
dc.identifier.uri	http://140.113.39.130/cdrfb3/record/nctu/#GT079855863	en_US
dc.identifier.uri	http://hdl.handle.net/11536/75086	-
dc.description.abstract	The massive growth of protein sequence and neuron image datasets creates a need for computational methods to predict and analyze their biological functions. Machine-learning-based classifiers are commonly proposed for predicting protein functions and recognizing neuron images. At present, a desirable predictor of protein functions should provide both prediction efficiency and knowledge discovery. Meanwhile, identifying informative features for recognizing neuron images is difficult because of the large number of available image features. This dissertation develops optimization methodologies, based on an intelligent genetic algorithm (IGA), for both predicting protein functions and recognizing neuron images. The scoring card method (SCM) is a simple and highly interpretable method for predicting and analyzing protein functions. The SCM calculates dipeptide propensity scores for a protein function of interest from the difference in dipeptide composition between positive and negative sequences. The propensity scores of the 400 dipeptides are optimized by IGA to enhance prediction accuracy while conserving the original characteristics of amino acid composition. A score is derived for each sequence from these propensity scores and used to predict its function. Two SCM-based methods, SCMSOL and SCMCRYS, are proposed for the prediction and analysis of protein solubility and crystallizability; their test accuracies are 84.3% and 76.1%, respectively, which are comparable to those of support-vector-machine-based methods using the same dipeptide composition features. Moreover, biological knowledge discovery and mutagenesis analysis for soluble and crystallizable proteins based on the propensity scores are illustrated. The procedure for developing SCM-based methods can also be applied to design other methods for predicting protein functions with high prediction performance and highly interpretable results. 
This dissertation also presents an automated neuron image feature identification system (Auto-NIFI), a user-friendly tool for automatically extracting and identifying a small set of informative neuron image features using an inheritable bi-objective combinatorial genetic algorithm (IBCGA). The feature selection of Auto-NIFI allows biologists to construct a suitable classifier for a particular neuron image classification problem. To identify neuron image features, Auto-NIFI provides a comprehensive set of image feature extraction modules together with the IBCGA feature selection modules. Notably, owing to the large collection of image feature extraction modules available in the tool, the system is also applicable to a wide variety of biological image classification problems. Two methods, HCS-Neurons and DescNeuro, are proposed for neuron image classification. In the HCS-Neurons method, the usefulness of Auto-NIFI is demonstrated by identifying phenotypic changes in multi-neuron images in response to drug treatments in high-content screening. The three identified morphological features achieved an independent test accuracy of 90.28% in classifying neurons into six classes corresponding to six nocodazole concentrations. Using Auto-NIFI, DescNeuro can recognize a neuron in the 3D Drosophila neuron database from a 2D image with promising recognition results.	en_US
dc.language.iso	en_US	en_US
dc.subject	預測蛋白質	zh_TW
dc.subject	辨識神經細胞影像	zh_TW
dc.subject	特徵擷取	zh_TW
dc.subject	特徵選取	zh_TW
dc.subject	最佳化	zh_TW
dc.subject	Protein prediction	en_US
dc.subject	Neuron image classification	en_US
dc.subject	Feature extraction	en_US
dc.subject	Feature selection	en_US
dc.subject	Optimization	en_US
dc.title	設計最佳化演算法預測蛋白質功能和辨認神經細胞影像	zh_TW
dc.title	Designing Optimization Methods to Predict Protein Functions and Recognize Neuron Images	en_US
dc.type	Thesis	en_US
dc.contributor.department	生物資訊及系統生物研究所	zh_TW
Appears in Collections:	Thesis
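
The abstract describes the scoring card method (SCM) as deriving dipeptide propensity scores from the difference in dipeptide composition between positive and negative sequences, then scoring a sequence from those values. The following is a minimal Python sketch of that initial step only, not the dissertation's implementation. The function names, the averaging used for the sequence score, and the handling of non-standard residues are assumptions made for illustration, and the IGA optimization of the 400 scores is omitted.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 = 400 dipeptides over the standard amino acid alphabet.
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
DIPEPTIDE_SET = set(DIPEPTIDES)


def dipeptide_composition(sequences):
    """Return the fraction of each dipeptide over all given sequences.

    Pairs containing non-standard residues are skipped (an assumption).
    """
    counts = Counter(
        seq[i:i + 2]
        for seq in sequences
        for i in range(len(seq) - 1)
        if seq[i:i + 2] in DIPEPTIDE_SET
    )
    total = sum(counts.values())
    return {dp: (counts[dp] / total if total else 0.0) for dp in DIPEPTIDES}


def initial_propensity_scores(positive, negative):
    """Initial score per dipeptide: positive composition minus negative."""
    pos = dipeptide_composition(positive)
    neg = dipeptide_composition(negative)
    return {dp: pos[dp] - neg[dp] for dp in DIPEPTIDES}


def sequence_score(seq, scores):
    """Average propensity score over the dipeptides found in one sequence.

    Averaging is an assumed way of combining the scores; the abstract only
    says that a sequence score is derived from the propensity scores.
    """
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1) if seq[i:i + 2] in scores]
    return sum(scores[p] for p in pairs) / len(pairs) if pairs else 0.0
```

In this sketch a sequence would be assigned to the positive class when its score exceeds a threshold chosen on training data; the threshold and the optimized scores are outside what the abstract specifies.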