首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 203 毫秒
1.
Given two strings, the longest common subsequence (LCS) problem consists in computing the length of the longest string that is a subsequence of both input strings. Its generalisation, the all semi-local LCS problem, requires computing the LCS length for each string against all substrings of the other string, and for all prefixes of each string against all suffixes of the other string. We survey a number of algorithmic techniques related to the all semi-local LCS problem. We then present a number of algorithmic applications of these techniques, both existing and new. In particular, we obtain a new all semi-local LCS algorithm, with asymptotic running time matching (in the case of an unbounded alphabet) the fastest known global LCS algorithm by Masek and Paterson. We conclude that semi-local string comparison turns out to be a useful algorithmic plug-in, which unifies, and often improves on, a number of previous approaches to various substring- and subsequence-related problems. The author acknowledges the support of The University of Warwick’s DIMAP (the Centre for Discrete Mathematics and its Applications) during this work.  相似文献   

2.
Computing the longest common subsequence of two sequences is one of the most studied algorithmic problems. In this work we focus on a particular variant of the problem, called repetition free longest common subsequence (RF-LCSRF-LCS), which has been proved to be NP-hard. We propose a hybrid genetic algorithm, which combines standard genetic algorithms and estimation of distribution algorithms, to solve this problem. An experimental comparison with some well-known approximation algorithms shows the suitability of the proposed technique.  相似文献   

3.
We consider two generalizations of the longest common subsequence (LCS) problem: the Set LCS problem and the Set-Set LCS problem. We present algorithms for the two problems that are faster than the previous ones by Hirschberg and Larmore.  相似文献   

4.
The subsequence matching problem is to decide, for given strings S and T, whether S is a subsequence of T. The string S is called the pattern and the string T the text. We consider the case of multiple texts and show how to solve the subsequence matching problem in time linear in the length of the pattern. For this purpose we build an automaton that accepts all subsequences of given texts. This automaton is called the Directed Acyclic Subsequence Graph (DASG). We prove an upper bound for its number of states. Furthermore, we consider a modification of the subsequence matching problem: given a string S and a finite language L, we are to decide whether S is a subsequence of any string in L. We suppose that a finite automaton accepting L is given and present an algorithm for building the DASG for language L. We also mention applications of the DASG to some problems related to subsequences.  相似文献   

5.
In this short note we prove a concentration result for the length of the longest increasing subsequence (LIS) of a randomly and uniformly chosen involution of {1,…,s}.  相似文献   

6.
The problem of the number p(n , r), (1 ?r?n), of permutations on the set {1,…,n} with longest ascending subsequence of given length r is considered. By placing further restrictions on the ascending subsequence, combinatorial identities are obtained which allow the explicit calculation of p(n , r) in some cases.  相似文献   

7.
Recently, Stanley [Longest alternating subsequences of permutations, preprint, arXiv/0511419v1] studied the length of the longest alternating subsequence of a permutation in the symmetric group, where a sequence a,b,c,d,… is alternating if a>b<c>d<?. In this paper, we extend this result to the case of k-ary words. More precisely, we find an explicit formula for the generating function of the number of k-ary words of length n according to the length of the longest alternating subsequence.  相似文献   

8.
We consider the distribution of the length of the longest subsequence avoiding an arbitrary pattern, π, in a random permutation of length n. The well‐studied case of a longest increasing subsequence corresponds to π = 21. We show that there is some constant cπ such that as n →∞ the mean value of this length is asymptotic to and that the distribution of the length is tightly concentrated around its mean. We observe some apparent connections between cπ and the Stanley–Wilf limit of the class of permutations avoiding the pattern π. © 2006 Wiley Periodicals, Inc. Random Struct. Alg., 2007  相似文献   

9.
We explore a question related to the celebrated Erd?s-Szekeres Theorem and develop a geometric approach to answer it. Our main object of study is the Erd?s-Szekeres Tableau, or EST, of a number sequence. An EST is the sequence of integral points whose coordinates record the length of the longest increasing and longest decreasing subsequence ending at each element of the sequence. We define the Order Poset of an EST in order to answer the question: What information about the sequence can be recovered by its EST?  相似文献   

10.
We study the problem of, given two finite sequences x and y, finding a repetition-free longest common subsequence of x and y. We show some algorithmic results, a complexity result, and a preliminary experimental study based on the proposed algorithms.  相似文献   

11.
Let l(n) be the expected length of the longest unimodal subsequence of a random permutation. It is proved here that l(n)?√n converges to 2√2.This settles a conjecture of F.R.K. Chung.  相似文献   

12.
We study the following problem. Given two sequences x and y over a finite alphabet, find a repetition-free longest common subsequence of x and y. We show several algorithmic results, a computational complexity result, and we describe a preliminary experimental study based on the proposed algorithms. We also show that this problem is APX-hard.  相似文献   

13.
The string merging problem is to determine a merged string from a given set of strings. The distinguishing property of a solution is that the total cost of editing all of the given strings into this solution is minimal. Necessary and sufficient conditions are presented for the case where this solution matches the solution to the string-to-string correction problem. A special case where deletion is the only allowed edition operation is shown to have the longest common subsequence of the strings as its solution.This research was supported by the U.S. Army Research Office.  相似文献   

14.
We derive a moderate deviation principle for the lower tail probabilities of the length of a longest increasing subsequence in a random permutation. It refers to the regime between the lower tail large deviation regime and the central limit regime. The present article together with the upper tail moderate deviation principle in Ref. 12 yields a complete picture for the whole moderate deviation regime. Other than in Ref. 12, we can directly apply estimates by Baik, Deift, and Johansson, who obtained a (non-standard) Central Limit Theorem for the same quantity.  相似文献   

15.
We introduce a novel alphabet sampling technique for speeding up both online and indexed string matching. We choose a subset of the alphabet and extract the corresponding subsequence of the text. Online or indexed searching is then carried out on the extracted subsequence, and candidate matches are verified in the full text. We show that this speeds up online searching, especially for moderate to long patterns, by a factor of up to 5, while using 14% extra space in our experiments. For indexed searching we achieve indexes that are as fast as the classical suffix array, yet occupy less than 50% extra space (instead of the usual 400%). Our experiments show no competitive alternatives exist in a wide space/time range.  相似文献   

16.
We present algorithms for finding a longest common increasing subsequence of two or more input sequences. For two sequences of lengths n and m, where m?n, we present an algorithm with an output-dependent expected running time of and O(m) space, where ? is the length of an LCIS, σ is the size of the alphabet, and Sort is the time to sort each input sequence. For k?3 length-n sequences we present an algorithm which improves the previous best bound by more than a factor k for many inputs. In both cases, our algorithms are conceptually quite simple but rely on existing sophisticated data structures. Finally, we introduce the problem of longest common weakly-increasing (or non-decreasing) subsequences (LCWIS), for which we present an -time algorithm for the 3-letter alphabet case. For the extensively studied longest common subsequence problem, comparable speedups have not been achieved for small alphabets.  相似文献   

17.
Let f(n) be the expected length of a longest common subsequence of two random binary sequences of length n. It is known that the limit γ = limn→ ∞ n?1 f(n) exists. Improved upper bounds for γ are given using a new method.  相似文献   

18.
The problem of sequence comparison via optimal alignments occurs naturally in many areas of applications. The simplest such technique is based on evaluating a score given by the length of a longest common subsequence divided by the average length of the original sequences. In this paper we investigate the expected value of this score when the input sequences are random and their length tends to infinity. The corresponding limit exists but is not known precisely. We derive a theoretical large deviation, convex analysis and Monte Carlo based method to compute a consistent sequence of upper bounds on the unknown limit. An empirical practical version of our method produces promising numerical results.  相似文献   

19.
This paper establishes new restrictions for attainable enhanced principal rank characteristic sequences (epr-sequences). These results are then used to classify two related families of sequences that are attainable by a real symmetric matrix: the family of principal rank characteristic sequences (pr-sequences) not containing three consecutive 1s and the family of epr-sequences which contains an N in every subsequence of length 3.  相似文献   

20.
The interplay between two-dimensional percolation growth models and one-dimensional particle processes has been a fruitful source of interesting mathematical phenomena. In this paper we develop a connection between the construction of Busemann functions in the Hammersley last-passage percolation model with i.i.d. random weights, and the existence, ergodicity and uniqueness of equilibrium (or time-invariant) measures for the related (multi-class) interacting fluid system. As we shall see, in the classical Hammersley model, where each point has weight one, this approach brings a new and rather geometrical solution of the longest increasing subsequence problem, as well as a central limit theorem for the Busemann function.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号