首页 | 本学科首页   官方微博 | 高级检索  
     检索      


Fast pattern-matching on indeterminate strings
Authors:Jan Holub  WF Smyth  Shu Wang  
Institution:aDepartment of Computer Science & Engineering, Czech Technical University, Karlovo náměstí 13, Praha 2, CZ-121 35, Czech Republic;bAlgorithms Research Group, Department of Computing & Software, McMaster University, Hamilton, ON L8S 4K1, Canada;cDepartment of Computing, Curtin University, GPO Box U1987, Perth, WA 6845, Australia
Abstract:In a string x on an alphabet Σ, a position i is said to be indeterminate iff xi] may be any one of a specified subset {λ1,λ2,…,λj} of Σ, 2less-than-or-equals, slantjless-than-or-equals, slant|Σ|. A string x containing indeterminate positions is therefore also said to be indeterminate. Indeterminate strings can arise in DNA and amino acid sequences as well as in cryptological applications and the analysis of musical texts. In this paper we describe fast algorithms for finding all occurrences of a pattern p=p1..m] in a given text x=x1..n], where either or both of p and x can be indeterminate. Our algorithms are based on the Sunday variant of the Boyer–Moore pattern-matching algorithm, one of the fastest exact pattern-matching algorithms known. The methodology we describe applies more generally to all variants of Boyer–Moore (such as Horspool's, for example) that depend only on calculation of the δ (“rightmost shift”) array: our method therefore assumes that Σ is indexed (essentially, an integer alphabet), a requirement normally satisfied in practice.
Keywords:String  Word  Algorithm  Pattern-matching  Indeterminate  Approximate  Boyer–  Moore  Sunday
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号