Title: Efficient algorithms for regular expression constrained sequence alignment
Authors: Chung, Yun-Sheng
Lu, Chin Lung
Tang, Chuan Yi
Department of Biological Science and Technology
Issue Date: 2006
Abstract: Imposing constraints is an effective means of incorporating biological knowledge into alignment procedures. As in the PROSITE database, functional sites of proteins can be effectively described as regular expressions. In an alignment of protein sequences, it is natural to expect functional motifs to be aligned together. Motivated by this, Arslan introduced the regular expression constrained sequence alignment problem at CPM 2005 and proposed an algorithm that can take up to $O(|\Sigma|^2 |V|^4 n^2)$ time and $O(|\Sigma|^2 |V|^4 n)$ space, where $\Sigma$ is the alphabet, $n$ is the sequence length, and $V$ is the set of states in an automaton equivalent to the input regular expression. In this paper we propose a more efficient algorithm for this problem that takes $O(|V|^3 n^2)$ time and $O(|V|^2 n)$ space in the worst case. If $|V| = O(\log n)$, we propose another algorithm with time complexity $O(|V|^2 \log|V| \, n^2)$. The time complexity of our algorithms is independent of $\Sigma$, which is desirable in protein applications, where the formulation of this problem originates; a factor of $|\Sigma|^2 = 400$ in the time complexity of the previously proposed algorithm would significantly affect efficiency in practice.
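To make the comparison of worst-case bounds concrete, the ratios below are computed directly from the bounds quoted in the abstract. They are only a rough illustration, assuming the constants hidden by the O-notation are comparable; they are not a measured speedup. The value |Sigma| = 20 is the amino-acid alphabet size implied by the abstract's figure of 400.

```latex
\[
\frac{|\Sigma|^{2}\,|V|^{4}\,n^{2}}{|V|^{3}\,n^{2}} = |\Sigma|^{2}\,|V|
\quad\text{(time)},
\qquad
\frac{|\Sigma|^{2}\,|V|^{4}\,n}{|V|^{2}\,n} = |\Sigma|^{2}\,|V|^{2}
\quad\text{(space)}.
\]
\[
|\Sigma| = 20 \;\Longrightarrow\;
|\Sigma|^{2}\,|V| = 400\,|V|,
\qquad
|\Sigma|^{2}\,|V|^{2} = 400\,|V|^{2}.
\]
```

Under that assumption, the gap between the two time bounds grows linearly with the number of automaton states, and the gap between the space bounds grows quadratically.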
URI: http://hdl.handle.net/11536/12906
ISBN: 3-540-35455-7
ISSN: 0302-9743
Journal: COMBINATORIAL PATTERN MATCHING, PROCEEDINGS
Volume: 4009
Start page: 389
End page: 400
Appears in Collections: Conference Papers