On the Common Substring Alignment Problem期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

按检索

On the Common Substring Alignment Problem

Authors:	Gad M Landau Michal Ziv-Ukelson

Institution:	^a Department of Computer Science, Haifa University, Haifa, 31905, Israel;Department of Computer and Information Science, Polytechnic University, Six MetroTech Center, Brooklyn, New York, 11201-3840, ^f1;IBM T. J. W. Research Center, Yorktown Heights, New York, 10598, ^,^f2

Abstract:	The Common Substring Alignment Problem is defined as follows: Given a set of one or more strings S₁, S₂ … S_c and a target string T, Y is a common substring of all strings S_i, that is, S_i = B_iYF_i. The goal is to compute the similarity of all strings S_i with T, without computing the part of Y again and again. Using the classical dynamic programming tables, each appearance of Y in a source string would require the computation of all the values in a dynamic programming table of size O(nℓ) where ℓ is the size of Y. Here we describe an algorithm which is composed of an encoding stage and an alignment stage. During the first stage, a data structure is constructed which encodes the comparison of Y with T. Then, during the alignment stage, for each comparison of a source S_i with T, the pre-compiled data structure is used to speed up the part of Y. We show how to reduce the O(nℓ) alignment work, for each appearance of the common substring Y in a source string, to O(n)-at the cost of O(nℓ) encoding work, which is executed only once.

Keywords:	design and analysis of algorithms dynamic programming sequence comparison repeated substrings shared substrings Monge arrays candidate lists
本文献已被 ScienceDirect 等数据库收录！

设为首页 | 免责声明 | 关于勤云 | 加入收藏