期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Fast index for approximate string matching

Dekel Tsur 《Journal of Discrete Algorithms》2010,8(4):339-345

We present an index that stores a text of length n such that given a pattern of length m, all the substrings of the text that are within Hamming distance (or edit distance) at most k from the pattern are reported in

O (m + \log \log n + # matches)

time (for constant k). The space complexity of the index is

O (n^{1 + ?})

for any constant

? > 0

. 相似文献

2.

Data structures and algorithms for approximate string matching

《Journal of Complexity》1988,4(1):33-72

This paper surveys techniques for designing efficient sequential and parallel approximate string matching algorithms. Special attention is given to the methods for the construction of data structures that efficiently support primitive operations needed in approximate string matching. 相似文献

3.

New and faster filters for multiple approximate string matching

Ricardo Baeza‐Yates Gonzalo Navarro 《Random Structures and Algorithms》2002,20(1):23-49

We present three new algorithms for on‐line multiple string matching allowing errors. These are extensions of previous algorithms that search for a single pattern. The average running time achieved is in all cases linear in the text size for moderate error level, pattern length, and number of patterns. They adapt (with higher costs) to the other cases. However, the algorithms differ in speed and thresholds of usefulness. We theoretically analyze when each algorithm should be used, and show their performance experimentally. The only previous solution for this problem allows only one error. Our algorithms are the first to allow more errors, and are faster than previous work for a moderate number of patterns (e.g. less than 50–100 on English text, depending on the pattern length). © 2002 John Wiley & Sons, Inc. Random Struct. Alg., 20, 23–49, 2002 相似文献

4.

Bit-parallel approximate string matching algorithms with transposition 总被引：1，自引：0，他引：1

Heikki Hyyr 《Journal of Discrete Algorithms》2005,3(2-4):215-229

Using bit-parallelism has resulted in fast and practical algorithms for approximate string matching under Levenshtein edit distance, which permits a single edit operation to insert, delete or substitute a character. Depending on the parameters of the search, currently the fastest non-filtering algorithms in practice are the O(km/wn) algorithm of Wu and Manber, the O((k+2)(m−k)/wn) algorithm of Baeza-Yates and Navarro, and the O(m/wn) algorithm of Myers, where m is the pattern length, n is the text length, k is the error threshold and w is the computer word size. In this paper we discuss a uniform way of modifying each of these algorithms to permit also a fourth type of edit operation: transposing two adjacent characters in the pattern. This type of edit distance is also known as Damerau edit distance. In the end we also present an experimental comparison of the resulting algorithms. 相似文献

5.

Discrete operator convolutions and some of their applications

V. S. Rabinovich 《Mathematical Notes》1992,51(5):484-492

相似文献

6.

A graph-theoretic model to solve the approximate string matching problem allowing for translocations

《Journal of Discrete Algorithms》2013

In this paper, we study the approximate string matching problem under a string distance whose edit operations are translocations of equal length factors. We extend a graph-theoretic approach proposed by Rahman and Illiopoulos (2008) to model our problem. In the sequel, we devise efficient algorithms based on this model to solve a number of variants of the string matching problem with translocations. 相似文献

7.

The longest common extension problem revisited and applications to approximate string searching

Lucian Ilie Gonzalo Navarro Liviu Tinta 《Journal of Discrete Algorithms》2010,8(4):418-428

相似文献

8.

Fast and flexible packed string matching

《Journal of Discrete Algorithms》2014

Searching for all occurrences of a pattern in a text is a fundamental problem in computer science with applications in many other fields, like natural language processing, information retrieval and computational biology. In the last two decades a general trend has appeared trying to exploit the power of the word RAM model to speed-up the performances of classical string matching algorithms. In this model an algorithm operates on words of length w, grouping blocks of characters, and arithmetic and logic operations on the words take one unit of time.In this paper we use specialized word-size packed string matching instructions, based on the Intel streaming SIMD extensions (SSE) technology, to design a very fast string matching algorithm. We evaluate our solution in terms of efficiency, stability and flexibility, where we propose to use the deviation in running time of an algorithm on distinct equal length patterns as a measure of stability.From our experimental results it turns out that, despite their quadratic worst case time complexity, the new presented algorithm becomes the clear winner on the average in many cases, when compared against the most recent and effective algorithms known in literature. 相似文献

9.

An integer linear programming approach for approximate string comparison

Marcus Ritt Alysson M. Costa Sergio Mergen Viviane M. Orengo 《European Journal of Operational Research》2009

We introduce a problem called maximum common characters in blocks (MCCB), which arises in applications of approximate string comparison, particularly in the unification of possibly erroneous textual data coming from different sources. We show that this problem is NP-complete, but can nevertheless be solved satisfactorily using integer linear programming for instances of practical interest. Two integer linear formulations are proposed and compared in terms of their linear relaxations. We also compare the results of the approximate matching with other known measures such as the Levenshtein (edit) distance. 相似文献

10.

Simple,compact and robust approximate string dictionary

《Journal of Discrete Algorithms》2014

相似文献

11.

A linear size index for approximate pattern matching

Ho-Leung Chan Tak-Wah Lam Wing-Kin Sung Siu-Lung Tam Swee-Seong Wong 《Journal of Discrete Algorithms》2011,9(4):358-364

相似文献

12.

Calculus rules for global approximate minima and applications to approximate subdifferential calculus

M. Volle 《Journal of Global Optimization》1994,5(2):131-157

We provide calculus rules for global approximate minima concerning usual operations on functions. The formulas we obtain are then applied to approximate subdifferential calculus. In this way, new results are presented, for example on the approximate subdifferential of a deconvolution, or on the subdifferential of an upper envelope of convex functions. 相似文献

13.

FFT-based algorithms for the string matching with mismatches problem

Tor Schoenmeyr David Yu Zhang 《Journal of Algorithms in Cognition, Informatics and Logic》2005,57(2):130-139

The string matching with mismatches problem requires finding the Hamming distance between a pattern P of length m and every length m substring of text T with length n. Fischer and Paterson's FFT-based algorithm solves the problem without error in O(σnlogm), where σ is the size of the alphabet Σ [SIAM–AMS Proc. 7 (1973) 113–125]. However, this in the worst case reduces to O(nmlogm). Atallah, Chyzak and Dumas used the idea of randomly mapping the letters of the alphabet to complex roots of unity to estimate the score vector in time O(nlogm) [Algorithmica 29 (2001) 468–486]. We show that the algorithm's score variance can be substantially lowered by using a bijective mapping, and specifically to zero in the case of binary and ternary alphabets. This result is extended via alphabet remappings to deterministically solve the string matching with mismatches problem with a constant factor of 2 improvement over Fischer–Paterson's method. 相似文献

14.

Generalized boundary value problems for analytic functions with convolutions and its applications

Pingrun Li 《Mathematical Methods in the Applied Sciences》2019,42(8):2631-2645

In this paper, we first establish a locality theory for the Noethericity of generalized boundary value problems on the spaces . By means of this theory, of the classical boundary value theory, and of the theory of Fourier analysis, we discuss the necessary and sufficient conditions of the solvability and obtain the general solutions and the Noether conditions for one class of generalized boundary value problems. All cases as regards the index of the coefficients in the equations are considered in detail. Moreover, we apply our theoretical results to the solvability of singular integral equations with variable coefficients. Thus, this paper will be of great significance for the study of improving and developing complex analysis, integral equation, and boundary value theory. 相似文献

15.

On minimizing pattern splitting in multi-track string matching

Kjell Lemstrm Veli Mkinen 《Journal of Discrete Algorithms》2005,3(2-4):248-266

Given a pattern string P=p₁p₂p_m and K parallel text strings over an integer alphabet Σ, our task is to find the smallest integer κ>0 such that P can be split into κ pieces P=P¹…P^κ, where each Pⁱ has an occurrence in some text track T^k_i and these partial occurrences retain the order. We study some variations of this minimum splitting problem, such as splittings with limited gaps and transposition invariance, and show how to use sparse dynamic programming to solve the variations efficiently. In particular, we show that the minimum splitting problem can be interpreted as a shortest path problem on line segments. 相似文献

16.

Faster algorithms for string matching with k mismatches

Amihood Amir Moshe Lewenstein Ely Porat 《Journal of Algorithms in Cognition, Informatics and Logic》2004,50(2):257-275

The string matching with mismatches problem is that of finding the number of mismatches between a pattern P of length m and every length m substring of the text T. Currently, the fastest algorithms for this problem are the following. The Galil–Giancarlo algorithm finds all locations where the pattern has at most k errors (where k is part of the input) in time O(nk). The Abrahamson algorithm finds the number of mismatches at every location in time . We present an algorithm that is faster than both. Our algorithm finds all locations where the pattern has at most k errors in time . We also show an algorithm that solves the above problem in time O((n+(nk³)/m)logk). 相似文献

17.

A new algorithm for two-dimensional object inspection using string matching

《Mathematical and Computer Modelling》1998,27(1):101-116

In this paper, a string matching scheme is proposed to inspect two-dimensional objects for dimensional and shape verification in industrial environment. This approach consists of two stages. First, the procedures of determining the invariant starting point for boundary tracing and locating the cornerpoints of a curved object for polygon approximation are derived. To speed up the process, an optimization-based unconstrained line search method is used to locate the cornerpoints of the polygon image of a curved object. These cornerpoints are then recorded as feature string. At last, the feature string for each tested object are utilized to find the exact correspondence to one of several model objects. In contrast to conventional matching methods, which requires translation and rotation of the tested image before matching, the proposed method proves to be computationally efficient for real-time applications. 相似文献

18.

Efficient use of parallelism in algorithmic parameter optimization applications

C. Audet C.-K. Dang D. Orban 《Optimization Letters》2013,7(3):421-433

In the context of algorithmic parameter optimization, there is much room for efficient usage of computational resources. We consider the Opal framework in which a nonsmooth optimization problem models the parameter identification task, and is solved by a mesh adaptive direct search solver. Each evaluation of trial parameters requires the processing of a potentially large number of independent tasks. We describe and evaluate several strategies for using parallelism in this setting. Our test scenario consists in optimizing five parameters of a trust-region method for smooth unconstrained minimization. 相似文献

19.

Some estimates of the derivatives of algebraic polynomials and their applications to approximate differentiation

N. S. Baiguzov 《Mathematical Notes》1969,5(2):112-118

Asymptotics are found of the exact upper bound of the maximum of the derivatives on an interval for a class of algebraic polynomials of a specified degree which possess a majorization on a given set of points in the basic segment. Application to approximate differentiation is given.Translated from Matematicheskie Zametki, Vol. 5, No. 2, pp. 183–194, February, 1969. 相似文献

20.

On an approximate description of string andp-brane dynamics in curved space-time

A. A. Zheltukhin S. N. Roshchupkin 《Theoretical and Mathematical Physics》1997,111(3):714-722

A new formulation for the action of the Nambu-Goto string moving in curved space-time is studied. The action contains a two-dimensional “cosmological” world-sheet term multiplied by a tension parameter. It is shown that when treating this cosmological term as a perturbation, nonlinear equations for the string can be linearized. It is argued that in the zeroth-order approximation with respect to the tension, strings and membranes can be considered as dominant gravitation sources for the Friedmann universe with κ=0. Translated from Teoreticheskaya i Matematicheskaya Fizika, Vol. 111, No. 3, pp. 402–412, June, 1997. 相似文献