首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
In this paper, we consider the pattern matching problem in DNA and RNA sequences where either the pattern or the text can be degenerate, i.e., contain sets of characters. We present an asymptotically faster algorithm for the above problem that works in O(n log m) time, where n and m is the length of the text and the pattern respectively. We also suggest an efficient implementation of our algorithm, which works in linear time when the pattern size is small. Finally, we also describe how our approach can be used to solve the distributed pattern matching problem. The preliminary version of this paper appeared in [26].  相似文献   

2.
Given two strings, the longest common subsequence (LCS) problem consists in computing the length of the longest string that is a subsequence of both input strings. Its generalisation, the all semi-local LCS problem, requires computing the LCS length for each string against all substrings of the other string, and for all prefixes of each string against all suffixes of the other string. We survey a number of algorithmic techniques related to the all semi-local LCS problem. We then present a number of algorithmic applications of these techniques, both existing and new. In particular, we obtain a new all semi-local LCS algorithm, with asymptotic running time matching (in the case of an unbounded alphabet) the fastest known global LCS algorithm by Masek and Paterson. We conclude that semi-local string comparison turns out to be a useful algorithmic plug-in, which unifies, and often improves on, a number of previous approaches to various substring- and subsequence-related problems. The author acknowledges the support of The University of Warwick’s DIMAP (the Centre for Discrete Mathematics and its Applications) during this work.  相似文献   

3.
Multi-level overlay graphs represent a speed-up technique for shortest paths computation which is based on a hierarchical decomposition of a weighted directed graph G. They have been shown to be experimentally efficient, especially when applied to timetable information. However, no theoretical result on the cost of constructing, maintaining and querying multi-level overlay graphs in a dynamic environment is known. In this paper, we show theoretical properties of multi-level overlay graphs that lead us to the definition of a new data structure for the computation and the maintenance of an overlay graph of G while weight decrease or weight increase operations are performed on G. Our solution is theoretically faster than the recomputation from scratch and allows queries that can be performed more efficiently than running Dijkstra’s shortest paths algorithm on G. This work was partially supported by the Future and Emerging Technologies Unit of EC (IST priority – 6th FP), under contract no. FP6-021235-2 (project ARRIVAL).  相似文献   

4.
For 30 years the Lempel–Ziv factorization LZ x of a string xx[1..n] has been a fundamental data structure of string processing, especially valuable for string compression and for computing all the repetitions (runs) in x. Traditionally the standard method for computing LZ x was based on Θ(n)-time (or, depending on the measure used, O(n log n)-time) processing of the suffix tree ST x of x. Recently Abouelhoda et al. proposed an efficient Lempel–Ziv factorization algorithm based on an “enhanced” suffix array – that is, a suffix array SA x together with supporting data structures, principally an “interval tree”. In this paper we introduce a collection of fast space-efficient algorithms for LZ factorization, also based on suffix arrays, that in theory as well as in many practical circumstances are superior to those previously proposed; one family out of this collection achieves true Θ(n)-time alphabet-independent processing in the worst case by avoiding tree structures altogether. The work of the first and third authors was supported in part by grants from the Natural Sciences & Engineering Research Council of Canada.  相似文献   

5.
Patterns consisting of strings with a bounded number of mismatches are central to coding theory and find multiple applications in text processing and computational biology. In this latter field, the presence of over-represented patterns of this kind has been linked, for instance, to modeling regulatory regions in biosequences. The study and computation of expected number of occurrences and related scores for these patterns is made difficult by the sheer explosion of the roster of candidates that need to be evaluated. In recent work, properties of pattern saturation and score monotonicity have proved capable to mitigate this problem. In such a context, expectation and score monotonicity has been established within the i.i.d. model for all cases of interest except that of a fixed word length with a varying number of mismatches. The present paper completes this investigation by showing that the expected number of occurrences in a textstring for such a word is bi-tonic, that is, behaves as a unimodal function of the number of errors. This extends to this case the time and space savings brought about by discovery algorithms based on pattern saturation. Work Supported in part by the Italian Ministry of University and Research under the Bi-National Project FIRB RBIN04BYZ7, and by the Research Program of Georgia Tech. An extended abstract related to this work was presented at the Dagstuhl Seminar Dagstuhl on “Combinatorial and Algorithmic Foundations of Pattern and Association Discovery”, May 14-19, 2006 [3].  相似文献   

6.
Obtaining a matching in a graph satisfying a certain objective is an important class of graph problems. Matching algorithms have received attention for several decades. However, while there are efficient algorithms to obtain a maximum weight matching, not much is known about the maximum weight maximum cardinality, and maximum cardinality maximum weight matching problems for general graphs. Our contribution in this work is to show that for bounded weight input graphs one can obtain an algorithm for both maximum weight maximum cardinality (for real weights), and maximum cardinality maximum weight matching (for integer weights) by modifying the input and running the existing maximum weight matching algorithm. Also, given the current state of the art in maximum weight matching algorithms, we show that, for bounded weight input graphs, both maximum weight maximum cardinality, and maximum cardinality maximum weight matching have algorithms of similar complexities to that of maximum weight matching. Subsequently, we also obtain approximation algorithms for maximum weight maximum cardinality, and maximum cardinality maximum weight matching.   相似文献   

7.
A General Tractable Density Concept for Graphs   总被引:1,自引:0,他引:1  
In many applications it is an important algorithmic task to find a densest subgraph in an input graph. The complexity of this task depends on how density is defined. If density means the ratio of the number of edges and the number of vertices in the subgraph, then the algorithmic problem has long been known efficiently solvable. On the other hand, the task becomes NP-hard with closely related but somewhat modified concepts of density. To capture many possible tractable density concepts of interest in a common model, we define and analyze a general concept of density, called F-density. Here F is a family of graphs and we are looking for a subgraph of the input graph, such that this subgraph is the densest in terms of containing the highest number of graphs from F relative to the size of the subgraph. We show that for any fixed finite family F, a subgraph of maximum F-density can be found in polynomial time. As our main tool we develop an algorithm, that may be of independent interest, which can find an independent set of maximum independence ratio in a certain class of weighted graphs. The independence ratio is the weight of the independent set divided by the weight of its neighborhood. This work was supported in part by NSF grants ANI-0220001 and CCF-0634848.  相似文献   

8.
Smooth 4-regular hamiltonian graphs are generalizations of cycle plus triangles graphs. It has been shown that both the independent set and 3-colorability problems are NP-Complete in this class of graphs. In this paper we show that these problems are fixed parameter tractable if we choose the number of inner cycles as parameter. The reseach has been supported by International Science Programme (ISP) of Sweden, under the project titled “The Eastern African Universities Mathematics Programme (EAUMP)”.  相似文献   

9.
Microarrays offer unprecedented possibilities for the so-called omic, e.g., genomic and proteomic, research. However, they are also quite challenging data to analyze. The aim of this paper is to provide a short tutorial on the most common approaches used for pattern discovery and cluster analysis as they are currently used for microarrays, in the hope to bring the attention of the Algorithmic Community on novel aspects of classification and data analysis that deserve attention and have potential for high reward. R. Giancarlo is partially supported by Italian MIUR grants PRIN “Metodi Combinatori ed Algoritmici per la Scoperta di Patterns in Biosequenze” and FIRB “Bioinformatica per la Genomica e la Proteomica” and Italy-Israel FIRB Project “Pattern Discovery Algorithms in Discrete Structures, with Applications to Bioinformatics”. D. Scaturro is supported by a MIUR Fellowship in the Italy-Israel FIRB Project “Pattern Discovery Algorithms in Discrete Structures, with Applications to Bioinformatics”.  相似文献   

10.
A loopQ is said to be left conjugacy closed (LCC) if the left translations form a set of permutations that is closed under conjugation. This papers investigates those LCC loops where the group generated by left translations is normal in the group generated by both left and right translations.  相似文献   

11.
We describe all group gradings on the diagonal algebra k n , where k is an arbitrary field. Received: 21 January 2008  相似文献   

12.
Is it possible to symbolically express and analyse an individual-based model of disease spread, including realistic population dynamics? This problem is addressed through the use of process algebra and a novel method for transforming process algebra into Mean Field Equations. A number of stochastic models of population growth are presented, exploring different representations based on alternative views of individual behaviour. The overall population dynamics in terms of mean field equations are derived using a formal and rigorous rewriting based method. These equations are easily compared with the traditionally used deterministic Ordinary Differential Equation models and allow evaluation of those ODE models, challenging their assumptions about system dynamics. The utility of our approach for epidemiology is confirmed by constructing a model combining population growth with disease spread and fitting it to data on HIV in the UK population. This work was supported by EPSRC through a Doctoral Training Grant (CM, from 2004–2007), and through System Dynamics from Individual Interactions: A process algebra approach to epidemiology (EP/E006280/1, all authors, 2007–2010).  相似文献   

13.
In this paper we investigate harmonic Hardy-Orlicz and Bergman-Orlicz b φ,α (B) spaces, using an identity of Hardy-Stein type. We also extend the notion of the Lusin property by introducing (φ, α)-Lusin property with respect to a Stoltz domain. The main result in the paper is as follows: Let be a nonnegative increasing convex function twice differentiable on (0, ∞), and u a harmonic function on the unit ball B in . Then the following statements are equivalent:
(a)  .
(b)  .
(c)  u has (φ, α)-Lusin property with respect to a Stoltz domain with half-angle β, for any .
(d)  u has (φ, α)-Lusin property with respect to a Stoltz domain with half-angle β, for some .
  相似文献   

14.
A loop Q is said to be left conjugacy closed (LCC) if the left translations form a set of permutations that is closed under conjugation. Loops in which the left and middle nuclei coincide and are of index 2 are necesarilly LCC, and they are constructed in the paper explicitly. LCC loops Q with the right nucleus G of index 2 offer a larger diversity, but that is associated with the level of commutativity of G (amongst others, the centre of G has to be nontrivial). On the other hand, for each m ≥ 2 one can construct an LCC loop Q of order 2m in such a way that its left nucleus is trivial, and the right nucleus if of order m. If Q is involutorial, then it is a Bol loop. Work supported by institutional grant MSM 113200007 and by Grant Agency of Czech Republic, Grant 201/02/0594. The paper was written while the author was visiting Universitaet Hamburg in January 2004.  相似文献   

15.
In the first part some conditions under which a permutation preserves Buck’s measure density, are derived. In the second part is constructed a countable system of permutations which has the ergodic property on the system of Buck’s measurable sets. Received: March 27, 2007., Accepted: August 25, 2008.  相似文献   

16.
17.
In the well-known discrete modeling framework developed by R. Thomas, the structure of a biological regulatory network is captured in an interaction graph, which, together with a set of Boolean parameters, gives rise to a state transition graph describing all possible dynamical behaviors. For complex networks the analysis of the dynamics becomes more and more difficult, and efficient methods to carry out the analysis are needed. In this paper, we focus on identifying subnetworks of the system that govern the behavior of the system as a whole. We present methods to derive trajectories and attractors of the network from the dynamics suitable subnetworks display in isolation. In addition, we use these ideas to link the existence of certain structural motifs, namely circuits, in the interaction graph to the character and number of attractors in the state transition graph, generalizing and refining results presented in [10]. Lastly, we show for a specific class of networks that all possible asymptotic behaviors of networks in that class can be derived from the dynamics of easily identifiable subnetworks.   相似文献   

18.
We use positive elements of Hermitian algebras to give results on automatic continuity of algebra morphisms. Consequences and applications are also given.   相似文献   

19.
20.
Math search is a new area of research with many enabling technologies but also many challenges. Some of the enabling technologies include XML, XPath, XQuery, and MathML. Some of the challenges involve enabling search systems to recognize mathematical symbols and structures. Several math search projects have made considerable progress in meeting those challenges. One of the remaining challenges is the creation and implementation of a math query language that enables the general users to express their information needs intuitively yet precisely. This paper will present such a language and detail its features. The new math query language offers an alternative way to describe mathematical expressions that is more consistent and less ambiguous than conventional mathematical notation. In addition, the language goes beyond the Boolean and proximity query syntax found in standard text search systems. It defines a powerful set of wildcards that are deemed important for math search. These wildcards provide for more precise structural search and multi-levels of abstractions. Three new sets of wildcards and their implementation details will also be discussed.   相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号