首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
In this paper, we define a family of patterns with don't cares occurring in a text. We call them primitive patterns. The set of primitive patterns forms a basis for all the maximal patterns occurring in the text. The number of primitive patterns is smaller than other known basis.

We present an incremental algorithm that computes the primitive patterns occurring at least q times in a text of length n, given the N primitive patterns occurring at least q−1 times, in time O(|Σ|Nn2logn), where Σ is the alphabet. In the particular case where q=2, the complexity in time is only O(|Σ|n2logn). We also give an algorithm that decides if a given pattern is primitive in a given text. These algorithms are generalized, taking many sequences in input. Finally, we give a solution for reducing the number of patterns of interest by using scoring techniques, as we show that the number of primitive patterns is exponential.  相似文献   


2.
In this paper we discuss farthest-point problems in which a set or sequence S of n points in the plane is given in advance and can be preprocessed to answer various queries efficiently. First, we give a data structure that can be used to compute the point farthest from a query line segment in O(log2n) time. Our data structure needs O(nlogn) space and preprocessing time. To the best of our knowledge no solution to this problem has been suggested yet. Second, we show how to use this data structure to obtain an output-sensitive query-based algorithm for polygonal path simplification. Both results are based on a series of data structures for fundamental farthest-point queries that can be reduced to each other.  相似文献   

3.
In this work we examine to what extent small cancellation conditions imply the solution of the membership problem of the prefix monoid in one-relator groups. The main results show that with additional combinatorial conditions on the defining relator (prefix pureness and strongly cyclically reduceness), which synchronize the prefixes and an analogue of the Greendlinger Lemma for the set of prefixes (Property P( \(n\) )), the membership problem for the prefix monoid can be solved in cases not covered by previous results.  相似文献   

4.
The time complexity of suffix tree construction has been shown to be equivalent to that of sorting: O(n) for a constant-size alphabet or an integer alphabet and O(nlogn) for a general alphabet. However, previous algorithms for constructing suffix arrays have the time complexity of O(nlogn) even for a constant-size alphabet.

In this paper we present a linear-time algorithm to construct suffix arrays for integer alphabets, which do not use suffix trees as intermediate data structures during its construction. Since the case of a constant-size alphabet can be subsumed in that of an integer alphabet, our result implies that the time complexity of directly constructing suffix arrays matches that of constructing suffix trees.  相似文献   


5.
Given a set of moving points in d, we show how to cluster them in advance, using a small number of clusters, so that at any time this static clustering is competitive with the optimal k-center clustering at that time. The advantage of this approach is that it avoids updating the clustering as time passes. We also show how to maintain this static clustering efficiently under insertions and deletions. To implement this static clustering efficiently, we describe a simple technique for speeding up clustering algorithms and apply it to achieve faster clustering algorithms for several problems. In particular, we present a linear time algorithm for computing a 2-approximation to the k-center clustering of a set of n points in d. This slightly improves the algorithm of Feder and Greene, that runs in (n log k) time (which is optimal in the algebraic decision tree model).  相似文献   

6.
In this paper, we present a direct approach for routing a shortest rectilinear path between two points among a set of rectilinear obstacles in a two-layer interconnection model that is used for VLSI routing applications. The previously best known direct approach for this problem takes O(nlog2n) time and O(nlogn) space, where n is the total number of obstacle edges. By using integer data structures and an implicit graph representation scheme (i.e., a generalization of the distance table method), we improve the time bound to O(nlog3/2n) while still maintaining the O(nlogn) space bound. Comparing with the indirect approach for this problem, our algorithm is simpler to implement and is probably faster for a quite large range of input sizes.  相似文献   

7.
Bit-parallel approximate string matching algorithms with transposition   总被引:1,自引:0,他引:1  
Using bit-parallelism has resulted in fast and practical algorithms for approximate string matching under Levenshtein edit distance, which permits a single edit operation to insert, delete or substitute a character. Depending on the parameters of the search, currently the fastest non-filtering algorithms in practice are the O(km/wn) algorithm of Wu and Manber, the O((k+2)(mk)/wn) algorithm of Baeza-Yates and Navarro, and the O(m/wn) algorithm of Myers, where m is the pattern length, n is the text length, k is the error threshold and w is the computer word size. In this paper we discuss a uniform way of modifying each of these algorithms to permit also a fourth type of edit operation: transposing two adjacent characters in the pattern. This type of edit distance is also known as Damerau edit distance. In the end we also present an experimental comparison of the resulting algorithms.  相似文献   

8.
We study structured matrices which consist of a band part and quasiseparable parts below and upper the band. We extend algorithms known for quasiseparable matrices, i.e. for the case when the band consists of the main diagonal only, to a wider class of matrices. The matrices which we consider may be treated as an usual quasiseparable matrices with larger orders of generators. Hence one can apply the methods developed for usual quasiseparable matrices and obtain various linear complexity O(N) algorithms. However in this case the coefficients in N in the complexity estimates turns out to be quite large. In this paper we use the structure more accurately by division of the matrix into three parts in which the middle part is the band instead of diagonal as it is used for usual quasiseparable matrices. This approach allows to use better the structure of the matrix in order to improve the coefficients in N in the complexity estimates for the algorithms. This method works for algorithms which keep invariant the structure.  相似文献   

9.
Triesare data structures for storing sets where each element is represented by a key that can be viewed as a string of characters over a finite alphabet. These structures have been extensively studied and analyzed under several probability models. All of these models, however, preclude the occurrence of sets in which the key of one element is a prefix of that of another—such a key is called aprefixing-key. This paper presents an average case analysis of several trie varieties, which we generically calledprefixing-tries, for representing sets with “unrestricted” keys, that is, sets in which the key of one element may be a prefix of that of another. The underlying probability model, which we call theprefix model, h, n, massumes as equally likely alln-element sets whose keys are composed of at mosthcharacters from a fixed alphabet of sizem. For each of the trie varieties analyzed, we derive exact formulas for the expected space required to store such a set, and the average time required to retrieve an element given its key, as functions ofh,n, andm. Our approach to the analysis is of interest in its own right. It provides a unifying framework for computing the expectations of a wide class of random variables with respect to the prefix model. This class includes the cost functions of the trie varieties analyzed here.  相似文献   

10.
In this work, we introduce a local search strategy for combinatorial optimization problems which explores neighborhoods obtained using fragments of current solutions. We apply the approach to the well-known -hard 2-machine bicriteria flowshop scheduling problem. Computational experiments using benchmark data show the approach to be effective when compared to other algorithms available for the problem.  相似文献   

11.
In k-means clustering we are given a set of n data points in d-dimensional space and an integer k, and the problem is to determine a set of k points in  , called centers, to minimize the mean squared distance from each data point to its nearest center. No exact polynomial-time algorithms are known for this problem. Although asymptotically efficient approximation algorithms exist, these algorithms are not practical due to the very high constant factors involved. There are many heuristics that are used in practice, but we know of no bounds on their performance.

We consider the question of whether there exists a simple and practical approximation algorithm for k-means clustering. We present a local improvement heuristic based on swapping centers in and out. We prove that this yields a (9+)-approximation algorithm. We present an example showing that any approach based on performing a fixed number of swaps achieves an approximation factor of at least (9−) in all sufficiently high dimensions. Thus, our approximation factor is almost tight for algorithms based on performing a fixed number of swaps. To establish the practical value of the heuristic, we present an empirical study that shows that, when combined with Lloyd's algorithm, this heuristic performs quite well in practice.  相似文献   


12.
We present parallel lightweight algorithms to construct wavelet trees, rank and select structures, and suffix arrays in a shared-memory setting. The work and depth of our first parallel wavelet tree algorithm match those of the best existing parallel algorithm while requiring asymptotically less memory and our second algorithm achieves the same asymptotic bounds for small alphabet sizes. Our experiments show that they are both faster and more memory-efficient than existing parallel algorithms. We also present an experimental evaluation of the parallel construction of rank and select structures, which are used in wavelet trees. Next, we design the first parallel suffix array algorithm based on induced copying. Our induced copying requires linear work and polylogarithmic depth for constant alphabets sizes. When combined with a parallel prefix doubling algorithm, it is more efficient in practice both in terms of running time and memory usage compared to existing parallel implementations. As an application, we combine our algorithms to build the FM-index in parallel.  相似文献   

13.
In this paper, we initiate the study of Ehrenfeucht–Fraïssé games for some standard finite structures. Examples of such standard structures are equivalence relations, trees, unary relation structures, Boolean algebras, and some of their natural expansions. The paper concerns the following question that we call the Ehrenfeucht–Fraïssé problem. Given nω as a parameter, and two relational structures and from one of the classes of structures mentioned above, how efficient is it to decide if Duplicator wins the n-round EF game ? We provide algorithms for solving the Ehrenfeucht–Fraïssé problem for the mentioned classes of structures. The running times of all the algorithms are bounded by constants. We obtain the values of these constants as functions of n.  相似文献   

14.
We look at the computational complexity of 2-dimensional geometric optimization problems on a finite point set with respect to the number of inner points (that is, points in the interior of the convex hull). As a case study, we consider the minimum weight triangulation problem. Finding a minimum weight triangulation for a set of n points in the plane is not known to be NP-hard nor solvable in polynomial time, but when the points are in convex position, the problem can be solved in O(n3) time by dynamic programming. We extend the dynamic programming approach to the general problem and describe an exact algorithm which runs in O(6kn5logn) time where n is the total number of input points and k is the number of inner points. If k is taken as a parameter, this is a fixed-parameter algorithm. It also shows that the problem can be solved in polynomial time if k=O(logn). In fact, the algorithm works not only for convex polygons, but also for simple polygons with k inner points.  相似文献   

15.
We consider an extension of the set union problem, in which dynamic weighted backtracking over sequences of unions is permitted. We present a new data structure which can support each operation inO(logn) time in the worst case. We prove that this bound is tight for pointer based algorithms. Furthermore, we design a different data structure to achieve better amortized bounds. The space complexity of both our data structures isO(n). Motivations for studying this problem arise in logic programming memory management.Work partially supported by NSF Grant CCR-88-14977, by the ESPRIT II Basic Research Actions Program of the EC under contract No. 3075 (Project ALCOM), and by the Italian MURST Project Algoritmi e Strutture di Calcolo.Partially supported by an IBM Graduate Fellowship.  相似文献   

16.
Active set algorithms for isotonic regression; A unifying framework   总被引:1,自引:0,他引:1  
In this and subsequent papers we will show that several algorithms for the isotonic regression problem may be viewed as active set methods. The active set approach provides a unifying framework for studying algorithms for isotonic regression, simplifies the exposition of existing algorithms and leads to several new efficient algorithms. We also investigate the computational complexity of several algorithms.In this paper we consider the isotonic regression problem with respect to a complete order where eachw i is strictly positive and eachy i is an arbitrary real number. We show that the Pool Adjacent Violators algorithm (due to Ayer et al., 1955; Miles, 1959; Kruskal, 1964), is a dual feasible active set method and that the Minimum Lower Set algorithm (due to Brunk et al., 1957) is a primal feasible active set method of computational complexity O(n 2). We present a new O(n) primal feasible active set algorithm. Finally we discuss Van Eeden's method and show that it is of worst-case exponential time complexity.This work was supported by the National Science and Engineering Research Council of Canada under Research Grant A8189 and an Ontario Graduate Scholarship.  相似文献   

17.
We present a new linear time algorithm to compute a good order for the point set of a Delaunay triangulation in the plane. Such a good order makes reconstruction in linear time with a simple algorithm possible. Similarly to the algorithm of Snoeyink and van Kreveld [Proceedings of 5th European Symposium on Algorithms (ESA), 1997, pp. 459–471], our algorithm constructs such orders in O(logn) phases by repeatedly removing a constant fraction of vertices from the current triangulation. Compared to [Proceedings of 5th European Symposium on Algorithms (ESA), 1997, pp. 459–471] we improve the guarantee on the number of removed vertices in each such phase. If we restrict the degree of the points (at the time they are removed) to 6, our algorithm removes at least 1/3 of the points while the algorithm from [Proceedings of 5th European Symposium on Algorithms (ESA), 1997, pp. 459–471] gives a guarantee of 1/10. We achieve this improvement by removing the points sequentially using a breadth first search (BFS) based procedure that—in contrast to [Proceedings of 5th European Symposium on Algorithms (ESA), 1997, pp. 459–471]—does not (necessarily) remove an independent set.

Besides speeding up the algorithm, removing more points in a single phase has the advantage that two consecutive points in the computed order are usually closer to each other. For this reason, we believe that our approach is better suited for vertex coordinate compression.

We implemented prototypes of both algorithms and compared their running time on point sets uniformly distributed in the unit cube. Our algorithm is slightly faster. To compare the vertex coordinate compression capabilities of both algorithms we round the resulting sequences of vertex coordinates to 16-bit integers and compress them with a simple variable length code. Our algorithm achieves about 14% better vertex data compression than the algorithm from [Proceedings of 5th European Symposium on Algorithms (ESA), 1997, pp. 459–471].  相似文献   


18.
Let B(G) denote the bipartite double cover of a non-bipartite graph G with v≥2 vertices and ? edges. We prove that G is a perfect 2-matching covered graph if and only if B(G) is a 1-extendable graph. Furthermore, we prove that B(G) is a minimally 1-extendable graph if and only if G is a minimally perfect 2-matching covered graph and for each e = xyE(G), there is an independent set S in G such that |ΓG(S)| = |S| + 1, x S and |ΓG-xy(S) | = |S|. Then, we construct a digraph D from B(G) or G and show that D is a strongly connected digraph if and only if G is a perfect 2-matching covered graph. So we design an algorithm in O(v?) time that determines whether G is a perfect 2-matching covered graph or not.  相似文献   

19.
Given a set of points in the plane and a constant t1, a Euclidean t-spanner is a network in which, for any pair of points, the ratio of the network distance and the Euclidean distance of the two points is at most t. Such networks have applications in transportation or communication network design and have been studied extensively.

In this paper we study 1-spanners under the Manhattan (or L1-) metric. Such networks are called Manhattan networks. A Manhattan network for a set of points is a set of axis-parallel line segments whose union contains an x- and y-monotone path for each pair of points. It is not known whether it is NP-hard to compute minimum Manhattan networks (MMN), i.e., Manhattan networks of minimum total length. In this paper we present an approximation algorithm for this problem. Given a set P of n points, our algorithm computes in O(nlogn) time and linear space a Manhattan network for P whose length is at most 3 times the length of an MMN of P.

We also establish a mixed-integer programming formulation for the MMN problem. With its help we extensively investigate the performance of our factor-3 approximation algorithm on random point sets.  相似文献   


20.
Given a set R of red points and a set B of blue points, the nearest-neighbour decision rule classifies a new point q as red (respectively, blue) if the closest point to q in R B comes from R (respectively, B). This rule implicitly partitions space into a red set and a blue set that are separated by a red-blue decision boundary. In this paper we develop output-sensitive algorithms for computing this decision boundary for point sets on the line and in 2. Both algorithms run in time O(n log k), where k is the number of points that contribute to the decision boundary. This running time is the best possible when parameterizing with respect to n and k.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号