Protein subcellular localization prediction based on compartment-specific features and structure conservation

doi:10.1186/1471-2105-8-330

Full metadata record

DC Field	Value	Language
dc.contributor.author	Su, Emily Chia-Yu	en_US
dc.contributor.author	Chiu, Hua-Sheng	en_US
dc.contributor.author	Lo, Allan	en_US
dc.contributor.author	Hwang, Jenn-Kang	en_US
dc.contributor.author	Sung, Ting-Yi	en_US
dc.contributor.author	Hsu, Wen-Lian	en_US
dc.date.accessioned	2014-12-08T15:13:21Z	-
dc.date.available	2014-12-08T15:13:21Z	-
dc.date.issued	2007-09-08	en_US
dc.identifier.issn	1471-2105	en_US
dc.identifier.uri	http://dx.doi.org/10.1186/1471-2105-8-330	en_US
dc.identifier.uri	http://hdl.handle.net/11536/10337	-
dc.description.abstract	Background: Protein subcellular localization is crucial for genome annotation, protein function prediction, and drug discovery. Determining subcellular localization experimentally is time-consuming; thus, computational approaches are highly desirable. Extensive studies of localization prediction have led to the development of several methods, including composition-based and homology-based methods. However, their performance might be significantly degraded if homologous sequences are not detected. Moreover, methods that integrate various features could suffer from low coverage in high-throughput proteomic analyses due to the lack of information to characterize unknown proteins. Results: We propose a hybrid prediction method for Gram-negative bacteria that combines a one-versus-one support vector machine (SVM) model and a structural homology approach. The SVM model comprises a number of binary classifiers, in which biological features derived from Gram-negative bacteria translocation pathways are incorporated. In the structural homology approach, we employ secondary structure alignment for structural similarity comparison and assign the known localization of the top-ranked protein as the predicted localization of a query protein. The hybrid method achieves overall accuracies of 93.7% and 93.2% using ten-fold cross-validation on the benchmark data sets. On the evaluation data sets, our method also attains a prediction accuracy of 84.0%, especially when testing on sequences with a low level of homology to the training data. A three-way data split procedure is also incorporated to prevent overestimation of the predictive performance. In addition, we show that the prediction accuracy should be approximately 85% for non-redundant data sets of sequence identity less than 30%.
Conclusion: Our results demonstrate that biological features derived from Gram-negative bacteria translocation pathways yield a significant improvement. The biological features are interpretable and can be applied in advanced analyses and experimental designs. Moreover, combining the SVM model with the structural homology approach further improves the overall accuracy, which suggests that structural conservation could be a useful indicator for inferring localization in addition to sequence homology. The proposed method can be used in large-scale analyses of proteomes.	en_US
dc.language.iso	en_US	en_US
dc.title	Protein subcellular localization prediction based on compartment-specific features and structure conservation	en_US
dc.type	Article	en_US
dc.identifier.doi	10.1186/1471-2105-8-330	en_US
dc.identifier.journal	BMC BIOINFORMATICS	en_US
dc.citation.volume	8	en_US
dc.citation.issue		en_US
dc.citation.epage		en_US
dc.contributor.department	生物資訊及系統生物研究所	zh_TW
dc.contributor.department	Institute of Bioinformatics and Systems Biology	en_US
dc.identifier.wosnumber	WOS:000250596000001	-
dc.citation.woscount	26	-
Appears in Collections:	Journal Articles

Files in This Item:

000250596000001.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.