首页 | 本学科首页   官方微博 | 高级检索  
     检索      


On the Common Substring Alignment Problem
Authors:Gad M Landau  Michal Ziv-Ukelson  
Institution:a Department of Computer Science, Haifa University, Haifa, 31905, Israel;Department of Computer and Information Science, Polytechnic University, Six MetroTech Center, Brooklyn, New York, 11201-3840, f1;IBM T. J. W. Research Center, Yorktown Heights, New York, 10598, , f2
Abstract:The Common Substring Alignment Problem is defined as follows: Given a set of one or more strings S1S2 … Sc and a target string T, Y is a common substring of all strings Si, that is, Si = BiYFi. The goal is to compute the similarity of all strings Si with T, without computing the part of Y again and again. Using the classical dynamic programming tables, each appearance of Y in a source string would require the computation of all the values in a dynamic programming table of size O(nℓ) where ℓ is the size of Y. Here we describe an algorithm which is composed of an encoding stage and an alignment stage. During the first stage, a data structure is constructed which encodes the comparison of Y with T. Then, during the alignment stage, for each comparison of a source Si with T, the pre-compiled data structure is used to speed up the part of Y. We show how to reduce the O(nℓ) alignment work, for each appearance of the common substring Y in a source string, to O(n)-at the cost of O(nℓ) encoding work, which is executed only once.
Keywords:design and analysis of algorithms  dynamic programming  sequence comparison  repeated substrings  shared substrings  Monge arrays  candidate lists
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号