Title: Efficient algorithms for regular expression constrained sequence alignment
Authors: Chung, Yun-Sheng
Lu, Chin Lung
Tang, Chuan Yi
Department of Biological Science and Technology
Keywords: constrained sequence alignment; algorithms; regular expression; dynamic programming
Publication date: 15-Sep-2007
Abstract: Imposing constraints is an effective means to incorporate biological knowledge into alignment procedures. As in the PROSITE database, functional sites of proteins can be effectively described as regular expressions. In an alignment of protein sequences it is natural to expect that functional motifs should be aligned together. Motivated by this, Arslan introduced the regular expression constrained sequence alignment problem and proposed an algorithm which, if implemented naively, can take time and space up to $O(|\Sigma|^2 |V|^4 n^2)$ and $O(|\Sigma|^2 |V|^4 n)$, respectively, where $\Sigma$ is the alphabet, $n$ is the sequence length, and $V$ is the set of states in an automaton equivalent to the input regular expression. In this paper we propose a more efficient algorithm solving this problem which takes $O(|V|^3 n^2)$ time and $O(|V|^2 n)$ space in the worst case. If $|V| = O(\log n)$ we propose another algorithm with time complexity $O(|V|^2 \log|V| \, n^2)$. The time complexity of our algorithms is independent of $\Sigma$, which is desirable in protein applications where the formulation of this problem originates; a factor of $|\Sigma|^2 = 400$ in the time complexity of the previously proposed algorithm would significantly affect the efficiency in practice. © 2007 Elsevier B.V. All rights reserved.
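As a rough illustration of the gap between the two worst-case bounds quoted in the abstract (a back-of-the-envelope ratio of upper bounds, not a measured speedup), dividing the naive time bound by the improved one gives:

$$\frac{|\Sigma|^2 |V|^4 n^2}{|V|^3 n^2} = |\Sigma|^2 |V|, \qquad \frac{|\Sigma|^2 |V|^4 n}{|V|^2 n} = |\Sigma|^2 |V|^2$$

With $|\Sigma|^2 = 400$, as stated for protein sequences, the time bounds differ by a factor of $400|V|$ and the space bounds by $400|V|^2$.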
URI: http://dx.doi.org/10.1016/j.ipl.2007.04.007
http://hdl.handle.net/11536/4085
ISSN: 0020-0190
DOI: 10.1016/j.ipl.2007.04.007
Journal: INFORMATION PROCESSING LETTERS
Volume: 103
Issue: 6
Start page: 240
End page: 246
Appears in collections: Conference Papers


Files in this item:

  1. 000248490700006.pdf