Similar literature
Found 20 similar documents (search time: 46 ms)
1.
Character sets of strings   Total citations: 2, self-citations: 1, citations by others: 1
Given a string S over a finite alphabet Σ, the character set (also called the fingerprint) of a substring S′ of S is the subset C ⊆ Σ of the symbols occurring in S′. The study of the character sets of all the substrings of a given string (or a given collection of strings) appears in several domains such as rule induction for natural language processing or comparative genomics. Several computational problems concerning the character sets of a string arise from these applications, especially:
(1) Output all the maximal locations of substrings having a given character set.
(2) Output for each character set C occurring in a given string (or a given collection of strings) all the maximal locations of C.
Denoting by n the total length of the considered string or collection of strings, we solve the first problem in Θ(n) time using Θ(n) space. We present two algorithms solving the second problem. The first one runs in Θ(n²) time using Θ(n) space. The second algorithm has Θ(n|Σ| log|Σ|) time and Θ(n) space complexity and is an adaptation of an algorithm by Amir et al. [A. Amir, A. Apostolico, G.M. Landau, G. Satta, Efficient text fingerprinting via Parikh mapping, J. Discrete Algorithms 26 (2003) 1–13].
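Problem (1) can be illustrated with a single left-to-right scan. The sketch below is not taken from the paper; it assumes that a maximal location is a maximal run of symbols drawn from the given character set that uses every symbol of that set. The function name `maximal_locations` and the (start, end) convention with an exclusive end are illustrative choices.

```python
def maximal_locations(s, charset):
    """Return (start, end) pairs, end exclusive, of the maximal substrings
    of s whose character set equals charset (assumed definition)."""
    target = set(charset)
    locations = []
    i, n = 0, len(s)
    while i < n:
        if s[i] not in target:
            i += 1
            continue
        # Extend the run for as long as symbols stay inside the target set.
        j = i
        seen = set()
        while j < n and s[j] in target:
            seen.add(s[j])
            j += 1
        # The run cannot be extended on either side; keep it only if it
        # uses every symbol of the target set.
        if seen == target:
            locations.append((i, j))
        i = j
    return locations
```

Each position is visited a constant number of times, so the scan is linear in the length of the string under the usual expected-time assumption for hash sets.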

2.
In this paper we discuss the complexity and approximability of the minimum corridor connection problem where, given a rectilinear decomposition of a rectilinear polygon into “rooms”, one has to find the minimum length tree along the edges of the decomposition such that every room is incident to a vertex of the tree. We show that the problem is strongly NP-hard and give a subexponential time exact algorithm. For the special case when the room connectivity graph is k-outerplanar the algorithm running time becomes cubic. We develop a polynomial time approximation scheme for the case when all rooms are fat and have nearly the same size. When rooms are fat but are of varying size we give a polynomial time constant factor approximation algorithm.

3.
We find sharp bounds for the number of moves required to bring a permutation to the form n, (n − 1), …, 1 if a move consists of inverting some increasing substrings. If we invert every maximal increasing substring in each move we need at most n − 1 moves. If n is even and we start with 1, 2, …, n and we do not invert the entire permutation at once, then we need at least n moves. The lower bound implies that when n ≥ 4 is even, n points which are not collinear determine at least n different directions, as do n + 1. These bounds are sharp.
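The move that inverts every maximal increasing substring is easy to simulate. The following sketch (function names are illustrative, not from the paper) applies that move repeatedly and counts how many moves are needed to reach the decreasing order.

```python
def invert_increasing_runs(perm):
    """Apply one move: reverse every maximal increasing substring of perm."""
    result = []
    start = 0
    for i in range(1, len(perm) + 1):
        # A run ends at the end of the list or just before a descent.
        if i == len(perm) or perm[i] < perm[i - 1]:
            result.extend(reversed(perm[start:i]))
            start = i
    return result


def moves_to_reverse_order(perm):
    """Count the moves needed until perm is in decreasing order."""
    perm = list(perm)
    target = sorted(perm, reverse=True)
    moves = 0
    while perm != target:
        perm = invert_increasing_runs(perm)
        moves += 1
    return moves
```

For example, 1, 3, 2 becomes 3, 1, 2 and then 3, 2, 1, which uses two moves for n = 3.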

4.
5.
Let a text string T of n symbols from some alphabet Σ and an integer m < n be given. A pattern P of length m over Σ is sought such that P minimizes (alternatively, maximizes) the total number of pairwise character mismatches generated when P is compared with all m-character substrings of T. Two additional variants of the problem are obtained by adding the constraint that P be (respectively, not be) a substring of T. Efficient sequential algorithms are proposed in this paper for the problem and its variants.
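For the unconstrained minimization variant, the total mismatch count splits into one independent term per pattern position, so choosing the most frequent symbol for each position is optimal. The sketch below is a naive O(nm) illustration of that observation, not the paper's algorithm; the function names are hypothetical.

```python
from collections import Counter


def min_mismatch_pattern(text, m):
    """Return a length-m pattern minimizing the total number of mismatches
    against all m-character substrings of text (unconstrained variant)."""
    windows = len(text) - m + 1
    pattern = []
    for j in range(m):
        # Symbols that position j of the pattern is compared against.
        counts = Counter(text[i + j] for i in range(windows))
        pattern.append(counts.most_common(1)[0][0])
    return "".join(pattern)


def total_mismatches(pattern, text):
    """Count mismatches of pattern against every window of the same length."""
    m = len(pattern)
    return sum(
        sum(1 for a, b in zip(pattern, text[i:i + m]) if a != b)
        for i in range(len(text) - m + 1)
    )
```

With `text = "abab"` and `m = 2` the windows are "ab", "ba", "ab", and the chosen pattern "ab" mismatches only against "ba".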

6.
We consider the problem of fingerprinting text by sets of symbols. Specifically, if S is a string, of length n, over a finite, ordered alphabet Σ, and S′ is a substring of S, then the fingerprint of S′ is the subset φ of Σ of precisely the symbols appearing in S′. In this paper we show efficient methods of answering various queries on fingerprint statistics. Our preprocessing is done in time O(n|Σ| log n log|Σ|) and enables answering the following queries:
(1) Given an integer k, compute the number of distinct fingerprints of size k in time O(1).
(2) Given a set φ ⊆ Σ, compute the total number of distinct occurrences in S of substrings with fingerprint φ in time O(|Σ| log n).
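As a point of reference for query (1), the fingerprints of a short string can be enumerated by brute force. This quadratic sketch is only a baseline for checking small cases and is unrelated to the preprocessing described in the abstract; the function names are illustrative.

```python
from collections import Counter


def fingerprints(s):
    """Return the set of distinct fingerprints of all non-empty substrings of s."""
    found = set()
    for i in range(len(s)):
        current = set()
        for j in range(i, len(s)):
            # current is the fingerprint of s[i..j].
            current.add(s[j])
            found.add(frozenset(current))
    return found


def count_by_size(s):
    """Map each size k to the number of distinct fingerprints of size k."""
    return Counter(len(f) for f in fingerprints(s))
```

For `"abca"` the fingerprints are {a}, {b}, {c}, {a,b}, {b,c}, {a,c} and {a,b,c}.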

7.
Let p (the pattern) be a string and t ≥ 0 an integer. The problem of locating in any string a substring whose edit distance from p is at most a given constant t is considered. An algorithm is presented to construct a deterministic finite-state automaton that solves the problem.
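The same problem can be solved without an automaton by the classical column-by-column dynamic program for approximate matching. The sketch below uses that dynamic program as a stand-in; it is not the automaton construction of the paper, and the function name and the choice to report exclusive end positions are assumptions made for illustration.

```python
def approximate_match_ends(pattern, text, t):
    """Return the end positions j (exclusive) such that some substring of
    text ending at j has edit distance at most t from pattern."""
    m = len(pattern)
    # prev[i] is the smallest edit distance between pattern[:i] and any
    # substring ending at the previous text position.
    prev = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text, start=1):
        cur = [0] * (m + 1)  # cur[0] = 0: a match may start anywhere
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            cur[i] = min(prev[i - 1] + cost,  # match or substitution
                         prev[i] + 1,         # extra text symbol
                         cur[i - 1] + 1)      # missing pattern symbol
        if cur[m] <= t:
            ends.append(j)
        prev = cur
    return ends
```

The running time is proportional to the product of the pattern and text lengths, whereas a precomputed automaton processes each text symbol in constant time.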

8.
We investigate vertex orders that can be used to obtain maximum stable sets by a simple greedy algorithm in polynomial time in some classes of graphs. We characterize a class of graphs for which the stability number can be obtained by a simple greedy algorithm. This class properly contains previously known classes of graphs for which the stability number can be computed in polynomial time. © 1999 John Wiley & Sons, Inc. J Graph Theory 30: 113–120, 1999

9.
We study the approximation complexity of certain kinetic variants of the Traveling Salesman Problem (TSP) where we consider instances in which each point moves with a fixed constant speed in a fixed direction. We prove the following results:
• If the points all move with the same velocity, then there is a polynomial time approximation scheme for the Kinetic TSP.
• The Kinetic TSP cannot be approximated better than by a factor of 2 by a polynomial time algorithm unless P = NP, even if there are only two moving points in the instance.
• The Kinetic TSP cannot be approximated better than by a factor of [expression lost in extraction] by a polynomial time algorithm unless P = NP, even if the maximum velocity is bounded.
Here n denotes the size of the input instance. The last result is especially surprising in the light of existing polynomial time approximation schemes for the static version of the problem.

10.
String matching is the problem of finding all the occurrences of a pattern in a text. We present a new method to compute the combinatorial shift function (“matching shift”) of the well-known Boyer–Moore string matching algorithm. This method implies the computation of the length of the longest suffixes of the pattern ending at each position in this pattern. These values constituted an extra-preprocessing for a variant of the Boyer–Moore algorithm designed by Apostolico and Giancarlo. We give here a new presentation of this algorithm that avoids extra preprocessing together with a tight bound of 1.5n character comparisons (where n is the length of the text).
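The quantity mentioned above, the length of the longest suffix of the pattern ending at each position, can be computed directly from its definition. The sketch below is a naive quadratic version meant only to make the definition concrete; it is not the method of the paper, and the name `suffix_lengths` is illustrative.

```python
def suffix_lengths(pattern):
    """For each position i, return the length of the longest substring of
    pattern that ends at i and is also a suffix of the whole pattern."""
    m = len(pattern)
    suff = [0] * m
    for i in range(m):
        k = 0
        # Compare backwards from position i and from the end of the pattern.
        while k <= i and pattern[i - k] == pattern[m - 1 - k]:
            k += 1
        suff[i] = k
    return suff
```

For the pattern `"abab"` the table is `[0, 2, 0, 4]`: the suffix "ab" ends at position 1 and the whole pattern ends at position 3.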

11.
This paper introduces the non-idling machine constraint where no intermediate idle time between the operations processed by a machine is allowed. In its first part, the paper considers the non-idling single-machine scheduling problem. Complexity aspects are first discussed. The “Earliest Non-Idling” property is then introduced as a sufficient condition so that an algorithm solving the original problem also solves its non-idling variant. Moreover it is shown that preemptive problems do have that property. The critical times of an instance are then introduced and it is shown that when their number is polynomial, as for equal-length jobs, a polynomial algorithm solving the original problem has a polynomial variant solving its non-idling version.

12.
Using concepts from both robotics and graph theory, we formulate the problem of indoor pursuit/evasion in terms of searching the nodes of a graph for a mobile evader. We present the IGNS (Iterative Greedy Node Search) algorithm, which performs offline guaranteed search (i.e. no matter how the evader moves, it will eventually be captured). Furthermore, the algorithm produces an internal search (the searchers move only along the edges of the graph; “teleporting” is not used) and exploits non-monotonicity, extended visibility and finite evader speed to reduce the number of searchers required to clear an environment. We present search experiments for several indoor environments, in all of which the algorithm succeeds in clearing the graph (i.e. capturing the evader).

13.
This paper studies several combinatorial problems arising from finding the conserved genes of two genomes (i.e., the entire DNA of two species). The input is a collection of n maximal common substrings of the two genomes. The problem is to find, based on different criteria, a subset of such common substrings with maximum total length. The most basic criterion requires that the common substrings selected have the same ordering in the two genomes and they do not overlap among themselves in either genome. To capture mutations (transpositions and reversals) between the genomes, we do not insist that the selected substrings have the same ordering. Conceptually, we allow one ordering to go through some mutations to become the other ordering. If arbitrary mutations are allowed, the problem of finding a maximum-length, non-overlapping subset of substrings is found to be NP-hard. However, arbitrary mutations probably overmodel the problem and are likely to find more noise than conserved genes. We consider two criteria that attempt to model sparse and non-overlapping mutations. We show that both can be solved in polynomial time using dynamic programming.

14.
Given two strings, the longest common subsequence (LCS) problem consists in computing the length of the longest string that is a subsequence of both input strings. Its generalisation, the all semi-local LCS problem, requires computing the LCS length for each string against all substrings of the other string, and for all prefixes of each string against all suffixes of the other string. We survey a number of algorithmic techniques related to the all semi-local LCS problem. We then present a number of algorithmic applications of these techniques, both existing and new. In particular, we obtain a new all semi-local LCS algorithm, with asymptotic running time matching (in the case of an unbounded alphabet) the fastest known global LCS algorithm by Masek and Paterson. We conclude that semi-local string comparison turns out to be a useful algorithmic plug-in, which unifies, and often improves on, a number of previous approaches to various substring- and subsequence-related problems. The author acknowledges the support of The University of Warwick’s DIMAP (the Centre for Discrete Mathematics and its Applications) during this work.
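The global LCS length has a textbook quadratic dynamic program, and the string-against-all-substrings part of the semi-local problem can be answered naively by rerunning it on every substring. The sketch below shows only this baseline, which is far slower than the techniques surveyed in the paper; both function names are illustrative.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[len(b)]


def string_substring_lcs(a, b):
    """Naive baseline: LCS length of a against every substring b[i:j]."""
    return {
        (i, j): lcs_length(a, b[i:j])
        for i in range(len(b) + 1)
        for j in range(i, len(b) + 1)
    }
```

The second function recomputes overlapping work for every pair (i, j), which is exactly the redundancy that semi-local methods are designed to avoid.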

15.
Given a set of strings U={T1,T2,…,T}, the longest common repeat problem is to find the longest common substring that appears at least twice in each string of U. We also consider reversed and reverse-complemented repeats as well as normal repeats. We present a linear time algorithm for the longest common repeat problem.

16.
Steiner tree problems (STPs) are very important in both theory and practice. In this paper, we introduce a powerful swap-vertex move operator which can be used as a basic element of any neighborhood search heuristic to solve many STP variants. Given the incumbent solution tree T, the swap-vertex move operator exchanges a vertex in T with another vertex out of T, and then attempts to construct a minimum spanning tree, leading to a neighboring solution (if feasible). We develop a series of dynamic data structures, which allow us to efficiently evaluate the feasibility of swap-vertex moves. Additionally, in order to discriminate different swap-vertex moves corresponding to the same objective value, we also develop an auxiliary evaluation function. We present a computational assessment based on a number of challenging problem instances (corresponding to three representative STP variants) which clearly shows the effectiveness of the techniques introduced in this paper. Particularly, as a key element of our KTS algorithm which participated in the 11th DIMACS implementation challenge, the swap-vertex operator as well as the auxiliary evaluation function contributed significantly to the excellent performance of our algorithm.

17.
We give an algorithm to morph between two planar drawings of a graph, preserving planarity, but allowing edges to bend. The morph uses a polynomial number of elementary steps, where each elementary step is a linear morph that moves each vertex in a straight line at uniform speed. Although there are planarity-preserving morphs that do not require edge bends, it is an open problem to find polynomial-size morphs. We achieve polynomial size at the expense of edge bends.

18.
We consider the following modification of annihilation games called node blocking. Given a directed graph, each vertex can be occupied by at most one token. There are two types of tokens, each player can move only tokens of his type. The players alternate their moves and the current player i selects one token of type i and moves the token along a directed edge to an unoccupied vertex. If a player cannot make a move then he loses. We consider the problem of determining the complexity of the game: given an arbitrary configuration of tokens in a planar directed acyclic graph (dag), does the current player have a winning strategy? We prove that the problem is PSPACE-complete.

19.
A pebbling move on a graph consists of taking two pebbles off of one vertex and placing one pebble on an adjacent vertex. In the traditional pebbling problem we try to reach a specified vertex of the graph by a sequence of pebbling moves. In this paper we investigate the case when every vertex of the graph must end up with at least one pebble after a series of pebbling moves. The cover pebbling number of a graph is the minimum number of pebbles such that however the pebbles are initially placed on the vertices of the graph we can eventually put a pebble on every vertex simultaneously. We find the cover pebbling numbers of trees and some other graphs. We also consider the more general problem where (possibly different) given numbers of pebbles are required for the vertices.

20.
We consider on-line text-compression problems where compression is done by substituting substrings according to some fixed static dictionary (code book). Due to the long running time of optimal algorithms, several heuristics have been introduced in the literature. In this paper, we continue the investigations of [3]. We complete the worst-case analysis of the longest matching algorithm and of the differential greedy algorithm for several types of special dictionaries and we derive matching lower and upper bounds for all variants of this problem.
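The longest matching heuristic mentioned above can be sketched in a few lines: at each position it substitutes the longest dictionary entry that matches and then continues after it. This is a minimal illustration assuming the dictionary is a plain set of strings; the function name and the error handling are hypothetical and not taken from the paper.

```python
def longest_matching_parse(text, dictionary):
    """Greedily split text into dictionary entries, always taking the
    longest entry that matches at the current position."""
    max_len = max(len(word) for word in dictionary)
    parse = []
    i = 0
    while i < len(text):
        # Try candidate lengths from longest to shortest.
        for length in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + length] in dictionary:
                parse.append(text[i:i + length])
                i += length
                break
        else:
            raise ValueError(f"no dictionary entry matches at position {i}")
    return parse
```

With the dictionary {"a", "ab", "bcd", "c", "d"} and the text "abcd", the heuristic produces three phrases ("ab", "c", "d") although two ("a", "bcd") would suffice, which is the kind of gap a worst-case analysis measures.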

Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  Beijing ICP No. 09084417 (京ICP备09084417号)