Full metadata record
DC Field | Value | Language
dc.contributor.author | 李威勳 | en_US
dc.contributor.author | Wei-Hsun Lee | en_US
dc.contributor.author | 盧錦隆 | en_US
dc.contributor.author | Chin Lung Lu | en_US
dc.date.accessioned | 2014-12-12T03:09:24Z | -
dc.date.available | 2014-12-12T03:09:24Z | -
dc.date.issued | 2006 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#GT009451511 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/82003 | -
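dc.description.abstract | In bioinformatics and computational biology, multiple sequence alignment (MSA) is a very useful tool for uncovering the biological meaning of genomic or protein sequences. Biologists usually already have some preliminary knowledge of the structural, functional, or evolutionary relationships among their sequences, such as active-site residues, intermolecular disulfide bonds, substrate-binding sites, enzyme activities, and conserved motifs. When performing a multiple sequence alignment, they therefore want a tool that places these structural, functional, or conserved nucleotides or residues in the same columns. In 2004 our research group developed MuSiC, a tool for multiple sequence alignment with constraints. Many biologists have since confirmed that MuSiC is quite useful in biological research. However, a constraint in MuSiC can only be a simple sequence fragment that allows mismatches but not gaps. Many biologically important patterns, such as the motifs in the PROSITE database, cannot be used in MuSiC. The main goal of this thesis is therefore to study and develop an algorithm and a tool for multiple sequence alignment with regular expression constraints. We adopt a progressive approach to this problem. The core of the approach is an efficient algorithm for pairwise sequence alignment with regular expression constraints. We convert the regular expression into a finite automaton and use dynamic programming and divide-and-conquer to design an algorithm, efficient in both time and space, that finds an optimal pairwise alignment under regular expression constraints. Based on this algorithm, we then developed RE-MuSiC (Multiple Sequence Alignment with Regular Expression Constraints), a multiple sequence alignment tool that accepts several regular expression constraints; it is available at http://140.113.239.131/RE-MUSIC | zh_TW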
dc.description.abstract | Multiple sequence alignment (MSA) has received much attention in bioinformatics and computational biology because it is very useful for discovering the biological meaning of sequences. Biologists usually have prior knowledge about the structural, functional, and/or evolutionary relationships among the sequences of interest, such as active site residues, intramolecular disulfide bonds, substrate binding sites, enzyme activities, and conserved motifs (consensuses). They therefore expect an MSA program to align these sequences so that the structural, functional, and/or conserved bases (i.e., nucleotides or residues) are aligned together. In 2004, our research group developed a tool called MuSiC for multiple sequence alignment with constraints. Since then, many biologists have found it useful in biological research. Nevertheless, the constraints allowed in MuSiC can only be simple substrings whose occurrences may contain mismatches but not gaps. As a result, many biologically important patterns, such as the motifs in the PROSITE database, cannot be used in MuSiC. Hence, in this thesis, we study and develop an algorithm and a tool for the problem of multiple sequence alignment with regular expression constraints (RECMSA). We use a progressive approach to design an efficient program for solving the RECMSA problem. The kernel of this approach is an efficient algorithm for solving the problem of pairwise sequence alignment with regular expression constraints (RECPSA). We transform the regular expressions into a finite automaton and then use dynamic programming and divide-and-conquer to develop a time- and space-efficient algorithm that solves the RECPSA problem optimally and can be run effectively on a desktop PC with limited memory. Using this algorithm as the kernel, we developed a web server called RE-MuSiC (Multiple Sequence Alignment with Regular Expression Constraints), which is available online at http://140.113.239.131/RE-MUSIC. | en_US
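dc.language.iso | en_US | en_US
dc.subject | 限制型序列比對 (sequence alignment with constraints) | zh_TW
dc.subject | 正規化表示式 (regular expression) | zh_TW
dc.subject | PROSITE 資料庫 (PROSITE database) | zh_TW
dc.subject | sequence alignment with constraints | en_US
dc.subject | regular expression | en_US
dc.subject | PROSITE database | en_US
dc.title | 正規化表示式的限制型多重序列比對之研究 (A Study of Multiple Sequence Alignment with Regular Expression Constraints) | zh_TW
dc.title | On the Study of Multiple Sequence Alignment with Regular Expression Constraints | en_US
dc.type | Thesis | en_US
dc.contributor.department | 生物資訊及系統生物研究所 (Institute of Bioinformatics and Systems Biology) | zh_TW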
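
The abstract describes constraints given as regular expressions, such as PROSITE motifs, that are turned into a finite automaton before alignment. The following is a minimal sketch of only the first step of that idea: translating a pattern written in a subset of PROSITE syntax into a Python regular expression and locating where each input sequence could satisfy the constraint. It is not the thesis's RECPSA algorithm and not RE-MuSiC's code. The function names `prosite_to_regex` and `find_constraint_sites`, the example pattern, and the two toy sequences are assumptions made for illustration.

```python
import re

# One pattern element: a residue, x (any residue), [ABC] (one of), or {ABC}
# (none of), optionally followed by a repeat count (n) or (n,m). The anchors
# '<' and '>' are supported only at the edges of an element.
_ELEMENT = re.compile(
    r"(<?)([A-Zx]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?(>?)"
)


def prosite_to_regex(pattern):
    """Translate a pattern in a subset of PROSITE syntax into Python regex syntax."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        start, body, low, high, end = m.groups()
        if body == "x":
            core = "."                        # any residue
        elif body.startswith("{"):
            core = "[^" + body[1:-1] + "]"    # any residue except those listed
        else:
            core = body                       # a literal residue or an [ABC] set
        if low and high:
            core += "{%s,%s}" % (low, high)   # between n and m repeats
        elif low:
            core += "{%s}" % low              # exactly n repeats
        parts.append(("^" if start else "") + core + ("$" if end else ""))
    return "".join(parts)


def find_constraint_sites(pattern, sequences):
    """Return the (start, end) span of the first match in each sequence, or None."""
    compiled = re.compile(prosite_to_regex(pattern))
    sites = []
    for seq in sequences:
        hit = compiled.search(seq)
        sites.append(hit.span() if hit else None)
    return sites


if __name__ == "__main__":
    example = "[AC]-x-V-x(4)-{ED}"            # illustrative pattern
    print(prosite_to_regex(example))
    print(find_constraint_sites(example, ["MKAGVTTTTKL", "CLVPPPPE"]))
```

A constrained aligner would go further than this sketch: instead of only reporting match positions, it would search for the best-scoring alignment in which the matching segments of the sequences are aligned to each other, which is where the automaton states enter the dynamic programming table.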
Appears in Collections: Thesis


Files in This Item:

  1. 151101.pdf
  2. 151101.pdf
