Title: Construction and Implementation of Protein Database and Knowledge Base (建立蛋白質資料庫與知識庫)
Authors: 顧世彥 (Shih-yen Ku)
胡毓志 (Yuh-Jyu Hu)
楊維邦 (Wei-Pong Yang)
Institute of Computer Science and Engineering
Keywords: protein; database; knowledge base; data mining; structural analysis; structural alphabets; k-means clustering; SOM clustering
Issue date: 2004
Abstract: This thesis presents a worked example of constructing our own protein database and developing a combinatorial data mining approach to build a protein knowledge base on top of it. Our combinatorial approach (SUM-K) identifies the basic building blocks of protein structure and defines a structure alphabet (SA) that captures the structural characteristics of proteins. Using nearest-neighbor assignment, the original sequences are transformed into sequences over this structure alphabet. Structural similarity can then be measured on the transformed sequences with 1D alignment tools, so that proteins with high structural similarity are found quickly. We ran a series of experiments on proteins from the SCOP database. The results show that SUM-K is feasible and defines a more suitable structure alphabet than the other alphabet systems. Finally, the transformed protein sequences are stored in our protein knowledge base, and a web-based analysis interface lets users analyze the proteins they are interested in.
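The pipeline described in the abstract (encode structure as letters, then compare with a 1D alignment) can be illustrated with a minimal sketch. This is not the SUM-K method itself: the prototype vectors, the two-dimensional fragment descriptors, and the names `PROTOTYPES`, `encode`, and `alignment_score` are all hypothetical, and a plain Needleman-Wunsch score stands in for whatever 1D alignment tool the thesis uses.

```python
import math

# Hypothetical prototypes: one feature vector per structural letter.
# In practice these would come from clustering real fragment descriptors;
# the values here are made up for illustration.
PROTOTYPES = {
    "A": (0.0, 0.0),
    "B": (1.0, 0.0),
    "C": (0.0, 1.0),
}


def encode(fragments, prototypes=PROTOTYPES):
    """Map each fragment descriptor to the letter of its nearest prototype."""
    letters = []
    for frag in fragments:
        nearest = min(prototypes, key=lambda k: math.dist(frag, prototypes[k]))
        letters.append(nearest)
    return "".join(letters)


def alignment_score(s, t, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score between two letter strings."""
    # prev[j] holds the best score for aligning the processed prefix of s
    # with the first j letters of t.
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        curr = [i * gap]
        for j in range(1, len(t) + 1):
            diag = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


# Toy fragment descriptors for two hypothetical proteins.
protein_1 = [(0.1, 0.0), (0.9, 0.1), (0.1, 0.8), (0.0, 0.1)]
protein_2 = [(0.0, 0.1), (1.1, 0.0), (0.2, 0.9)]

seq_1 = encode(protein_1)  # "ABCA"
seq_2 = encode(protein_2)  # "ABC"
score = alignment_score(seq_1, seq_2)  # 2: three matches and one gap
```

Once structures are reduced to strings, any standard sequence alignment tool can be applied to them, which is what makes the comparison fast relative to working on 3D coordinates directly.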
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009223622
http://hdl.handle.net/11536/76672
Appears in Collections: Thesis


Files in This Item:

  1. 362201.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.