Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
This paper presents a novel four-stage algorithm for measuring the rank correlation coefficients between pairwise financial time series. In the first stage, the returns of the financial time series are fitted as skewed-t distributions by the generalized autoregressive conditional heteroscedasticity model. In the second stage, the joint probability density function (PDF) of the fitted skewed-t distributions is computed using the symmetrized Joe–Clayton copula. The joint PDF is then used as the scoring scheme for pairwise sequence alignment in the third stage. After solving the optimal sequence alignment problem by dynamic programming, we obtain the aligned pairs of the series. Finally, in the fourth stage, we compute the rank correlation coefficients of the aligned pairs. To the best of our knowledge, the proposed algorithm is the first to use a sequence alignment technique to pair numerical financial time series directly, without first transforming numerical values into symbols. Experiments on practical financial data illustrate the method and demonstrate the advantages of the proposed algorithm.
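The third and fourth stages described above (dynamic-programming alignment followed by rank correlation of the aligned pairs) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `score` function passed in is a placeholder chosen for the example, not the copula-based joint PDF of the paper, and the constant `gap` penalty and the names `align`, `ranks`, and `spearman` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def align(x: Sequence[float], y: Sequence[float],
          score: Callable[[float, float], float],
          gap: float = -1.0) -> List[Tuple[float, float]]:
    """Global alignment by dynamic programming; returns the matched (x_i, y_j) pairs."""
    n, m = len(x), len(y)
    # best[i][j] = best total score for aligning x[:i] with y[:j]
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = i * gap
    for j in range(1, m + 1):
        best[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(best[i - 1][j - 1] + score(x[i - 1], y[j - 1]),
                             best[i - 1][j] + gap,
                             best[i][j - 1] + gap)
    # Trace back from the corner, collecting only matched (non-gap) pairs.
    pairs: List[Tuple[float, float]] = []
    i, j = n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + score(x[i - 1], y[j - 1]):
            pairs.append((x[i - 1], y[j - 1]))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            result[k] = average
        pos = end + 1
    return result


def spearman(pairs: Sequence[Tuple[float, float]]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks.

    Assumes at least two pairs and that neither coordinate is constant.
    """
    rx = ranks([p[0] for p in pairs])
    ry = ranks([p[1] for p in pairs])
    n = len(pairs)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Any pairwise score can be supplied; for instance `align(x, y, lambda a, b: 1.0 - abs(a - b))` rewards pairing returns of similar size, after which `spearman` is applied to the returned pairs.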

2.
Multiple sequence alignment is a task at the heart of much of current computational biology [4]. Several different objective functions have been proposed to formalize the task of multiple sequence alignment, but efficient algorithms are lacking in each case. Thus multiple sequence alignment is one of the most critical, essentially unsolved problems in computational biology. In this paper we consider one of the more compelling objective functions for multiple sequence alignment, formalized as the tree alignment problem. Previously, in [13], a ratio-two approximation method was developed for tree alignment, which ran in cubic time (as a function of the number of fixed-length strings to be aligned), along with a polynomial time approximation scheme (PTAS) for the problem. However, the PTAS in [13] had a running time which made it impractical to reduce the performance ratio much below two for small biological sequences (100 characters long). In this paper we first develop a ratio-two approximation algorithm which runs in quadratic time, and then use it to develop a PTAS which has a better performance ratio and a vastly improved worst-case running time compared to the scheme in [13] for the case where the given tree is a regular deg-ary tree. With the new approximation scheme, it is now practical to guarantee a ratio of 1.583 for strings of length 200 characters or less.

3.
We consider the trace reconstruction problem on a tree (TRPT): a binary sequence is broadcast through a tree channel where we allow substitutions, deletions, and insertions; we seek to reconstruct the original sequence from the sequences received at the leaves. The TRPT is motivated by the multiple sequence alignment problem in computational biology. We give a simple recursive procedure giving strong reconstruction guarantees at low mutation rates. To our knowledge, this is the first rigorous trace reconstruction result on a tree in the presence of indels.

4.
Protein structural alignment is an important problem in computational biology. In this paper, we present first successes on provably optimal pairwise alignment of protein inter-residue distance matrices, using the popular DALI scoring function. We introduce the structural alignment problem formally, which enables us to express a variety of scoring functions used in previous work as special cases in a unified framework. Further, we propose the first mathematical model for computing optimal structural alignments based on dense inter-residue distance matrices. To this end, we reformulate the problem as a special graph problem and give a tight integer linear programming model. We then present algorithm engineering techniques to handle the huge integer linear programs of real-life distance matrix alignment problems. Applying these techniques, we can compute provably optimal DALI alignments for the very first time.

5.
Imposing constraints is a way to incorporate information into the sequence alignment procedure. In this paper, a general model for constrained alignment is proposed so that the admissible analyses are more flexible and different pattern definitions can be treated in a simple unified way. We give a polynomial time algorithm for pairwise constrained alignment for the generalized formulation, and prove the inapproximability of the problem when the number of sequences can be arbitrary. In addition, previous works deal only with the case in which the patterns in the constraint have to occur in the output alignment in the same order as that specified by the input. It is of both theoretical and practical interest to investigate the case when the order is no longer restricted. We show that this problem is not approximable even when the number of sequences is two. We also give NPO-completeness results for the problems with bounds imposed on the objective function value.

6.
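Obtaining a multiple sequence alignment with a profile hidden Markov model generally involves three steps: initialization, training, and alignment. However, the widely used Baum-Welch training algorithm assumes that the observable sequences are mutually independent, which does not match reality. For profile hidden Markov models, this paper gives an improved Baum-Welch algorithm for the case where the observable sequences are not mutually independent. Explicit expressions for the improved algorithm are obtained in two special cases of the observable sequences (mutual independence and uniform dependence), and the choice of weights in the general case is discussed. Finally, a multiple sequence alignment of a specific protein family is used to illustrate the effect of the improved algorithm.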

7.
In this paper, we present a novel graph-theoretical approach for representing a wide variety of sequence analysis problems within a single model. The model allows incorporation of the operations “insertion”, “deletion”, and “substitution”, and various parameters such as relative distances and weights. Conceptually, we refer to the problem as the minimum weight common mutated sequence (MWCMS) problem. The MWCMS model has many applications, including the multiple sequence alignment problem, phylogenetic analysis, the DNA sequencing problem, and the sequence comparison problem, which encompass a core set of very difficult problems in computational biology. Thus the model presented in this paper lays out a mathematical modeling framework that allows one to investigate theoretical and computational issues, and to forge new advances for these distinct but related problems. Through the introduction of supernodes and the multi-layer supergraph, we proved that MWCMS is NP-complete. Furthermore, it was shown that a conflict graph derived from the multi-layer supergraph has the property that a solution to the associated node-packing problem of the conflict graph corresponds to a solution of the MWCMS problem. In this case, we proved that when the number of input sequences is a constant, MWCMS is polynomial-time solvable. We also demonstrated that some well-known combinatorial problems can be viewed as special cases of the MWCMS problem. In particular, we presented theoretical results implied by the MWCMS theory for the minimum weight supersequence problem, the minimum weight superstring problem, and the longest common subsequence problem. Two integer programming formulations were presented and a simple yet elegant decomposition heuristic was introduced. The integer programming instances have proven to be computationally intensive. Consequently, research involving simultaneous column and row generation and parallel computing will be explored.
The heuristic algorithm, introduced herein for multiple sequence alignment, overcomes the order-dependent drawbacks of many of the existing algorithms, and is capable of returning good sequence alignments within reasonable computational time. It is able to return the optimal alignment for multiple sequences of length less than 1500 base pairs within 30 minutes. Its decomposition-based design lends itself naturally to parallel distributed computing, and we continue to explore its flexibility and scalability in a massively parallel environment.
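For orientation, the longest common subsequence problem named above as a special case has a classic dynamic program when there are only two input sequences. The sketch below is that textbook two-sequence recurrence, not the MWCMS formulation or the heuristic of the paper; the function name `lcs_length` is chosen here for illustration.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of a longest common subsequence of a and b (textbook O(|a|*|b|) DP)."""
    # prev[j] holds the LCS length of the processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                curr.append(prev[j - 1] + 1)            # extend a common subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))  # drop one character
        prev = curr
    return prev[-1]
```

Only two rows of the table are kept, so memory is linear in the length of the second string.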

8.
The puzzle-assembly problem has many application areas such as restoration and reconstruction of archeological findings, repairing of broken objects, solving jigsaw type puzzles, molecular docking problem, etc. The puzzle pieces usually include not only geometrical shape information but also visual information such as texture, color, and continuity of lines. This paper presents a new approach to the puzzle-assembly problem that is based on using textural features and geometrical constraints. The texture of a band outside the border of pieces is predicted by inpainting and texture synthesis methods. Feature values are derived from these original and predicted images of pieces. An affinity measure of corresponding pieces is defined and alignment of the puzzle pieces is formulated as an optimization problem where the optimum assembly of the pieces is achieved by maximizing the total affinity measure. A Fast Fourier Transform based image registration technique is used to speed up the alignment of the pieces. Experimental results are presented on real and artificial data sets.

9.
10.
This paper is concerned with the automated classification of Combinatorial Optimization Problem instances for the purpose of instance-specific parameter tuning. We propose the CluPaTra Framework, a generic approach to CLUster instances based on similar PAtterns according to search TRAjectories, and apply it to parameter tuning. The key idea is to use the search trajectory as a generic feature for clustering problem instances. The advantage of using the search trajectory is that it can be obtained from any local-search based algorithm with small additional computation time. We explore and compare two different search trajectory representations, two sequence alignment techniques (to calculate similarities), and two well-known clustering methods. We report experimental results on two classical problems, the Travelling Salesman Problem and the Quadratic Assignment Problem, and on an industrial case study.

11.
In the segment-based approach to sequence alignment, nucleic acid and protein sequence alignments are constructed from fragments, i.e., from pairs of ungapped segments of the input sequences. Given a set $F$ of candidate fragments and a weighting function $w : F \to \mathbb{R}_0^+$, the score of an alignment is defined as the sum of weights of the fragments it consists of, and the optimization problem is to find a consistent collection of pairwise disjoint fragments with maximum sum of weights. Herein, a sparse dynamic programming algorithm is described that solves the pairwise segment-alignment problem in $O(L + N_{\max})$ space, where $L$ is the maximum length of the input sequences and $N_{\max} \le \#F$ holds. With a recently introduced weighting function $w$, small sets $F$ of candidate fragments are sufficient to obtain alignments of high quality. As a result, the proposed algorithm runs in essentially linear space.
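The optimization problem stated above, choosing a consistent set of fragments with maximum total weight, can be illustrated with a deliberately simple quadratic-time chaining recurrence. This is not the sparse, linear-space algorithm of the abstract; the `Fragment` fields, the function name, and the assumption that every fragment has length at least 1 are all introduced here for the example.

```python
from typing import Iterable, List, NamedTuple


class Fragment(NamedTuple):
    start1: int    # start position in the first sequence
    start2: int    # start position in the second sequence
    length: int    # ungapped length, assumed >= 1
    weight: float  # non-negative weight w(f)


def best_chain_weight(fragments: Iterable[Fragment]) -> float:
    """Maximum total weight of a chain of fragments that are ordered
    consistently and do not overlap in either sequence (naive O(N^2) DP)."""
    # Sorting by start1 guarantees every admissible predecessor comes earlier.
    frags = sorted(fragments, key=lambda f: (f.start1, f.start2))
    best: List[float] = []  # best[k] = best chain weight ending with frags[k]
    for k, current in enumerate(frags):
        best_prev = 0.0
        for p in range(k):
            prev = frags[p]
            ends_before_in_1 = prev.start1 + prev.length <= current.start1
            ends_before_in_2 = prev.start2 + prev.length <= current.start2
            if ends_before_in_1 and ends_before_in_2:
                best_prev = max(best_prev, best[p])
        best.append(best_prev + current.weight)
    return max(best, default=0.0)
```

The table `best` stores one value per fragment, so this naive version already uses space linear in the number of fragments; the sparse algorithm in the abstract additionally avoids the quadratic scan over predecessors.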

12.
In the generalized tree alignment problem, we are given a set S of k biologically related sequences and we are interested in a minimum cost evolutionary tree for S. In many instances of this problem, a partial phylogenetic tree for S is known. In such instances, we would like to use this knowledge to restrict the tree topologies that we consider and to construct a biologically relevant minimum cost evolutionary tree. We therefore propose the following natural generalization of the generalized tree alignment problem, a problem known to be MAX-SNP hard:
Constrained Generalized Tree Alignment Problem [S. Divakaran, Algorithms and heuristics for constrained generalized alignment problem, DIMACS Technical Report 2007-21, 2007]: Given a set S of k related sequences and a phylogenetic forest comprising node-disjoint phylogenetic trees that specify the topological constraints that an evolutionary tree of S needs to satisfy, construct a minimum cost evolutionary tree for S.
In this paper, we present constant approximation algorithms for the constrained generalized tree alignment problem. For the generalized tree alignment problem, a special case of this problem, our algorithms provide a guaranteed error bound of 2−2/k.

13.
We study an integro-differential equation modeling angular alignment of interacting bundles of cells or filaments. A bifurcation analysis of the related stationary problem was done by Geigant and Stoll in [E. Geigant, M. Stoll, Bifurcation analysis of an orientational aggregation model, J. Math. Biol. 46 (6) (2003) 537-563]. Here we analyze the time-dependent problem and prove that the type of alignment (one- or multi-directional) depends on the initial distribution, the interaction potential, and the preferred optimal orientation of the bundles of cells or filaments. Our main technical tool is the analysis of the evolution of suitable functionals for the cell density, which also allows us to specify the direction(s) in which the final alignment takes place.

14.
Detecting similarity between non-rigid shapes is one of the fundamental problems in computer vision. In order to measure the similarity, the shapes must first be aligned. As opposed to rigid alignment, which can be parameterized using a small number of unknowns representing rotations, reflections and translations, non-rigid alignment is not easily parameterized. The majority of methods addressing this problem boil down to the minimization of a certain distortion measure. The complexity of the matching process is exponential by nature, but it can be heuristically reduced to quadratic or even linear for shapes which are smooth two-manifolds. Here we model the shapes using both local and global structures, employ these to construct a quadratic dissimilarity measure, and provide a hierarchical framework for minimizing it to obtain a sparse set of corresponding points. These correspondences may serve as an initialization for a dense linear correspondence search.

15.
Protein structure alignment is one of the most important computational problems in molecular biology. From the viewpoint of computational complexity, pairwise structure alignment is an NP-hard problem. In this paper, based on the discrepancy of two proteins, we define structure alignment as a mixed integer programming (MIP) problem with a simpler form and prove the existence of an optimal solution. The optimal alignment is achieved by incorporating an improved complete information set method, used to modify the score matrix, into an iterative double dynamic programming algorithm. Convergence of the algorithm is proved. A number of benchmark examples are tested. The results show that our model and approach are general and improve computational efficiency as well as the quality of the structure alignment.

16.
In recent years many techniques in bioinformatics have been developed for the central and complex problem of optimally aligning biological sequences. In this paper we propose a new optimization approach based on DC (Difference of Convex functions) programming and the DC Algorithm (DCA) for multiple sequence alignment in its equivalent binary linear program, called the “Maximum Weight Trace” problem. This problem is first recast as a polyhedral DC program with the help of exact penalty techniques in DC programming. Our customized DCA, requiring the solution of a few linear programs, is original because it converges after finitely many iterations to a binary solution while working in a continuous domain. To scale up to large-scale multiple sequence alignment (MSA) problems, a constraint generation technique is introduced in DCA. Preliminary computational experiments on benchmark data show the efficiency of the proposed algorithm DCAMSA, which generally outperforms some standard algorithms.

17.
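This paper mainly studies shock waves for a class of radiation hydrodynamics equations in one space dimension. By the Rankine-Hugoniot conditions and the entropy condition, the problem can be formulated as an initial-boundary value problem with a free boundary for the radiation hydrodynamics equations. First, a change of variables transforms the free boundary into a fixed boundary; the existence and uniqueness of the solution to an initial-boundary value problem for this nonlinear system is then studied. To this end, an approximate solution of the problem is first constructed, and sequences of approximate solutions to the nonlinear problem are then constructed by Picard iteration and Newton iteration, respectively. A series of estimates together with compactness theory yields the convergence of the sequence of approximate solutions, and its limit is a shock wave of the original system of equations.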

18.
This paper considers an infinite-time optimal damping control problem for a class of nonlinear systems with sinusoidal disturbances. A successive approximation approach (SAA) is applied to design feedforward and feedback optimal controllers. By using the SAA, the original optimal control problem is transformed into a sequence of nonhomogeneous linear two-point boundary value (TPBV) problems. The existence and uniqueness of the optimal control law are proved. The optimal control law is derived from a Riccati equation, matrix equations and an adjoint vector sequence, and consists of accurate linear feedforward and feedback terms and a nonlinear compensation term. The nonlinear compensation term is the limit of the adjoint vector sequence. By using a finite number of terms of the adjoint vector sequence, we can obtain an approximate optimal control law. A numerical example shows that the algorithm is effective and robust with respect to sinusoidal disturbances.

19.
Sequence alignment has been studied for some time and there is a developed theory of alignment statistics. DNA restriction maps are aligned by comparing locations of cut sites or restriction fragment lengths. The statistical theory of their alignments is less well developed than that of sequence alignment. In this paper we estimate the probability that two random restriction maps match under a certain matching model.

20.
In this paper, in order to obtain some existence results about solutions of the augmented Lagrangian problem for a constrained problem in which the objective function and constraint functions are noncoercive, we construct a new augmented Lagrangian function by using an auxiliary function. We establish a zero duality gap result and a sufficient condition of an exact penalization representation for the constrained problem without the coercive or level-bounded assumption on the objective function and constraint functions. By assuming that the sequence of multipliers is bounded, we obtain the existence of a global minimum and an asymptotically minimizing sequence for the constrained optimization problem.
