首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
In this paper, we generalize the inclusion constrained longest common subsequence (CLCS) problem to the hybrid CLCS problem which is the combination of the sequence inclusion CLCS and the string inclusion CLCS, called the sequential substring constrained longest common subsequence   (SSCLCS) problem. In the SSCLCS problem, we are given two strings AA and BB of lengths mm and nn, respectively, formed by alphabet ΣΣ and a constraint sequence CC formed by ordered strings (C1,C2,C3,…,Cl)(C1,C2,C3,,Cl) with total length rr. The problem is that of finding the longest common subsequence DD of AA and BB containing C1,C2,C3,…,ClC1,C2,C3,,Cl as substrings and with the order of the CC’s retained. This problem has two variants, depending on whether the strings in CC cannot overlap or may overlap. We propose algorithms with O(mnl+(m+n)(|Σ|+r))O(mnl+(m+n)(|Σ|+r)) and O(mnr+(m+n)|Σ|)O(mnr+(m+n)|Σ|) time for the two variants. For the special case with one or two constraints, our algorithm runs in O(mn+(m+n)(|Σ|+r))O(mn+(m+n)(|Σ|+r)) or O(mnr+(m+n)|Σ|)O(mnr+(m+n)|Σ|) time, respectively—an order faster than the algorithm proposed by Chen and Chao.  相似文献   

2.
Arc-annotated sequences are useful in representing the structural information of RNA and protein sequences. The

problem has recently been introduced in [P.A. Evans, Algorithms and complexity for annotated sequence analysis, PhD Thesis, University of Victoria, 1999; P.A. Evans, Finding common subsequences with arcs and pseudoknots, in: Proceedings of 10th Annual Symposium on Combinatorial Pattern Matching (CPM'99), in: Lecture Notes in Comput. Sci., vol. 1645, 1999, pp. 270–280] as a framework for studying the similarity of arc-annotated sequences. In this paper, we consider arc-annotated sequences with various arc structures and present some new algorithmic and complexity results on the problem. Some of our results answer an open question in [P.A. Evans, Algorithms and complexity for annotated sequence analysis, PhD Thesis, University of Victoria, 1999; P.A. Evans, Finding common subsequences with arcs and pseudoknots, in: Proceedings of 10th Annual Symposium on Combinatorial Pattern Matching (CPM'99), in: Lecture Notes in Comput. Sci., vol. 1645, 1999, pp. 270–280] and some others improve the hardness results in [P.A. Evans, Algorithms and complexity for annotated sequence analysis, PhD Thesis, University of Victoria, 1999; P.A. Evans, Finding common subsequences with arcs and pseudoknots, in: Proceedings of 10th Annual Symposium on Combinatorial Pattern Matching (CPM'99), in Lecture Notes in Comput. Sci., vol. 1645, 1999, pp. 270–280].  相似文献   

3.
4.
We consider the length L of the longest common subsequence of two randomly uniformly and independently chosen n character words over a k-ary alphabet. Subadditivity arguments yield that E[L]/n converges to a constant γk. We prove a conjecture of Sankoff and Mainville from the early 1980s claiming that as k→∞.  相似文献   

5.
Computing the longest common subsequence of two sequences is one of the most studied algorithmic problems. In this work we focus on a particular variant of the problem, called repetition free longest common subsequence (RF-LCSRF-LCS), which has been proved to be NP-hard. We propose a hybrid genetic algorithm, which combines standard genetic algorithms and estimation of distribution algorithms, to solve this problem. An experimental comparison with some well-known approximation algorithms shows the suitability of the proposed technique.  相似文献   

6.
Given a set of strings U={T1,T2,…,T}, the longest common repeat problem is to find the longest common substring that appears at least twice in each string of U. We also consider reversed and reverse-complemented repeats as well as normal repeats. We present a linear time algorithm for the longest common repeat problem.  相似文献   

7.
Given a set S={S 1,…,S k } of finite strings, the k-Longest Common Subsequence Problem (k-LCSP) seeks a string L * of maximum length such that L * is a subsequence of each S i for i=1,…,k. This paper presents a large neighborhood search technique that provides quality solutions to large k-LCSP instances. This heuristic runs in linear time in both the length of the sequences and the number of sequences. Some computational results are provided.  相似文献   

8.
This paper deals with an NP-hard string problem from the bio-informatics field: the repetition-free longest common subsequence problem. This problem has enjoyed an increasing interest in recent years, which has resulted in the application of several pure as well as hybrid metaheuristics. However, the literature lacks a comprehensive comparison between those approaches. Moreover, it has been shown that general purpose integer linear programming solvers are very efficient for solving many of the problem instances that were used so far in the literature. Therefore, in this work we extend the available benchmark set, adding larger instances to which integer linear programming solvers cannot be applied anymore. Moreover, we provide a comprehensive comparison of the approaches found in the literature. Based on the results we propose a hybrid between two of the best methods which turns out to inherit the complementary strengths of both methods.  相似文献   

9.
Let f(n) be the expected length of a longest common subsequence of two random binary sequences of length n. It is known that the limit γ = limn→ ∞ n?1 f(n) exists. Improved upper bounds for γ are given using a new method.  相似文献   

10.
In this short note we prove a concentration result for the length of the longest increasing subsequence (LIS) of a randomly and uniformly chosen involution of {1,…,s}.  相似文献   

11.
In [2], Chvatal provided the tight worst case bound of the set covering greedy heuristic. We considered a general class of greedy type set covering heuristics. Their worst case bounds are dominated by that of the greedy heuristic.  相似文献   

12.
Heuristics for the fixed cost median problem   总被引:4,自引:0,他引:4  
We describe in this paper polynomial heuristics for three important hard problems—the discrete fixed cost median problem (the plant location problem), the continuous fixed cost median problem in a Euclidean space, and the network fixed cost median problem with convex costs. The heuristics for all the three problems guarantee error ratios no worse than the logarithm of the number of customer points. The derivation of the heuristics is based on the presentation of all types of median problems discussed as a set covering problem.  相似文献   

13.
For a string A=a1an, a reversalρ(i,j), 1?i?j?n, transforms the string A into a string A=a1ai-1ajaj-1aiaj+1an, that is, the reversal ρ(i,j) reverses the order of symbols in the substring aiaj of A. In the case of signed strings, where each symbol is given a sign + or -, the reversal operation also flips the sign of each symbol in the reversed substring. Given two strings, A and B, signed or unsigned, sorting by reversals (SBR) is the problem of finding the minimum number of reversals that transform the string A into the string B.Traditionally, the problem was studied for permutations, that is, for strings in which every symbol appears exactly once. We consider a generalization of the problem, k-SBR, and allow each symbol to appear at most k times in each string, for some k?1. The main result of the paper is an O(k2)-approximation algorithm running in time O(n). For instances with , this is the best known approximation algorithm for k-SBR and, moreover, it is faster than the previous best approximation algorithm.  相似文献   

14.
15.
Hashing is a widely used technique for data organization. Hash tables enable a fast search of the stored data and are used in a variety of applications ranging from software to network equipment and computer hardware. One of the main issues associated with hashing are collisions that cause an increase in the search time. A number of alternatives have been proposed to deal with collisions. One of them is separate chaining, in which for each hash value an independent list of the elements that have that value is stored. In this scenario, the worst case search time is given by the maximum length of that list across all hash values. This worst case is often referred to as Longest Length Probe Sequence (llps) in the literature. Approximations for the expected longest length probe sequence when the hash table is large have been proposed and an exact analytical solution has also been presented in terms of a set of recurring equations. In this paper, a novel analytical formulation of the expected longest length probe sequence is introduced. The new formulation does not require a recursive computation and can be easily implemented in a symbolic computation tool.  相似文献   

16.
In this paper we address the Distance-Constrained Capacitated Vehicle Routing Problem (DCVRP), where k minimum-cost routes through a central depot have to be constructed so as to cover all customers while satisfying, for each route, both a capacity and a total-distance-travelled limit. Our starting point is the following refinement procedure proposed in 1981 by Sarvanov and Doroshko for the pure Travelling Salesman Problem (TSP): given a starting tour, (a) remove all the nodes in even position, thus leaving an equal number of ``empty holes' in the tour; (b) optimally re-assign the removed nodes to the empty holes through the efficient solution of a min-sum assignment (weighted bipartite matching) problem. We first extend the Sarvanov-Doroshko method to DCVRP, and then generalize it. Our generalization involves a procedure to generate a large number of new sequences through the extracted nodes, as well as a more sophisticated ILP model for the reallocation of some of these sequences. An important feature of our method is that it does not rely on any specialized ILP code, as any general-purpose ILP solver can be used to solve the reallocation model. We report computational results on a large set of capacitated VRP instances from the literature (with symmetric/asymmetric costs and with/without distance constraints), along with an analysis of the performance of the new method and of its features. Interestingly, in 13 cases the new method was able to improve the best-know solution available from the literature. Work supported by M.I.U.R. and by C.N.R., Italy.  相似文献   

17.
Let P be a simple rectilinear polygon with n vertices. There are k points in P. The maxian problem is to locate a single facility in P so as to maximize the sum of its distance from it to the k points. We present an O((n×k)logn) time algorithm for this problem.  相似文献   

18.
This paper deals with the quality of approximative solutions for the Subset-Sum-Maximization-Problem maximize $$\sum\limits_{i = l}^n {a_i x_i } $$ subject to $$\sum\limits_{i = l}^n {a_i x_i } \leqslant b$$ wherea l,...,an,bεR+ andx l,...xnε{0,1}. produced by certain heuristics of a Greedy-type. Every heuristic under consideration realizes a feasible solution (x 1, ..., xn) whose objective value is less or equal the optimal value, which is of course not greater thanb. We use the gap between capacityb and realized value as an upper bound for the error made by the heuristic and as a criterion for quality. Under the stochastic model:a 1, ..., an, b independent,a 1...,an uniformly distributed on [0, 1], b uniformly distributed on [0,n] we derive the gap-distributions and the expected size of the gaps. The analyzed algorithms include four algorithms which can be done in linear time and four heuristics which require sorting, which means that they are done inO(nlnn) time.  相似文献   

19.
LetG=(V,E) be an undirected graph with positive integer edge lengths. Letm be the number of edges inE, and letd be the sum of the edge lengths. We prove that the solution value to the continuousp-center location problem is a rationalp 1/p 2, where logp 1=O(m 5 logd+m 6 logp),i=1,2. This result is then used to construct a finite algorithm for the continuousp-center problem.  相似文献   

20.
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号