Similar literature
 Found 20 similar articles (search time: 46 ms)
1.
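Phylogenetics studies the evolutionary relationships among species. Its nucleotide substitution models usually assume that sequences evolve without missing or censored data, an assumption that is rarely satisfied in practice. To address this, this paper applies the EM algorithm to parameter estimation when building phylogenetic trees from observed sequences that contain insertions or deletions but whose length is assumed to be unchanged, laying the groundwork for constructing good phylogenetic trees from sequences with missing data. The focus is on using the EM algorithm to estimate parameters such as the optimal branch lengths of rooted or unrooted trees built from DNA sequences with missing data under the Jukes-Cantor and Kimura models.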
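
As a point of reference for the abstract above, the following is a minimal sketch of the standard Jukes-Cantor (JC69) distance between two aligned DNA sequences. It is not the EM procedure the paper describes: sites with a gap are simply skipped (pairwise deletion), which is the naive baseline that an EM treatment of missing data would aim to improve on. The function name `jc69_distance` and the gap symbol `-` are assumptions made for illustration.

```python
import math


def jc69_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Jukes-Cantor distance between two aligned DNA sequences.

    Sites where either sequence has a gap are skipped (pairwise deletion);
    this is a simple baseline, not an EM-based estimate.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # missing site: ignored here, not imputed
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = mismatches / compared  # observed proportion of differing sites
    if p >= 0.75:
        return math.inf  # saturation: the JC69 correction is undefined
    # JC69 correction: d = -(3/4) * ln(1 - (4/3) * p)
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

For example, `jc69_distance("ACGTACGT", "ACGTACGA")` corrects the observed difference of 1/8 to roughly 0.137 expected substitutions per site.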

2.
In this paper, we compare the accuracy of four string distances on complete genomes to reconstruct phylogenies using simulated and real biological data. These distances are based on common words shared by raw genomic sequences and do not require preliminary processing steps such as gene identification or sequence alignment. Moreover, they are computable in linear time. The first distance is based on Maximum Significant Matches (MSM). The second is computed from the frequencies of all the words of length k (KW). The third distance is based on the Average length of maximum Common Substrings at any position (ACS). The last one is based on the Ziv–Lempel compression algorithm (ZL). We describe a simulation process of evolution to generate a set of sequences having evolved according to a random tree topology T. This process allows both base substitution and fragment insertion/deletion, including horizontal transfers. The distances between the generated sequences are computed using the four formulas and the corresponding trees T′ are reconstructed using Neighbor-Joining. T and T′ are compared according to topological criteria. These comparisons show that the MSM distance outperforms the others whatever the parameters used to generate sequences. Finally, we test the MSM and KW distances on real biological data (i.e. prokaryotic complete genomes) and we compare the NJ trees to a Maximum Likelihood 16S + 23S RNA tree. We show that the MSM distance provides accurate results to study intra-phylum relationships, much better than those given by KW.
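
The KW idea in the abstract above (a distance computed from the frequencies of all words of length k) can be sketched as follows. The abstract does not give the exact formula, so the Euclidean distance between frequency vectors used here is an assumption, as are the function names `kmer_frequencies` and `kw_distance`.

```python
import math
from collections import Counter


def kmer_frequencies(seq: str, k: int) -> dict:
    """Relative frequency of every word of length k occurring in seq."""
    if k <= 0 or len(seq) < k:
        raise ValueError("sequence must contain at least one word of length k")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}


def kw_distance(seq_a: str, seq_b: str, k: int = 3) -> float:
    """Euclidean distance between k-word frequency vectors.

    The choice of the Euclidean metric is an assumption for illustration.
    """
    fa = kmer_frequencies(seq_a, k)
    fb = kmer_frequencies(seq_b, k)
    words = set(fa) | set(fb)
    return math.sqrt(sum((fa.get(w, 0.0) - fb.get(w, 0.0)) ** 2 for w in words))
```

No alignment is needed: each sequence is scanned once, and the resulting pairwise distances could then be passed to a tree-building method such as Neighbor-Joining.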

3.
A method for a computer search of primes in arithmetic progression is described. Six sequences of length 16 and 21 sequences of length 15 were found as well as numerous sequences of lengths 13 and 14.
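
The abstract above does not describe the search method itself. As a toy illustration of the problem only, the sketch below does a brute-force search with a sieve of Eratosthenes; it is practical for short progressions and small bounds, not for lengths such as 15 or 16. The function names `sieve` and `prime_progressions` are hypothetical.

```python
def sieve(limit: int) -> bytearray:
    """Sieve of Eratosthenes: is_prime[n] == 1 iff n is prime, for 0 <= n <= limit."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            # Clear every multiple of p starting at p*p.
            is_prime[p * p::p] = bytearray(len(range(p * p, limit + 1, p)))
    return is_prime


def prime_progressions(limit: int, length: int) -> list:
    """All (start, step) pairs such that start, start+step, ... (length terms)
    are primes not exceeding limit. Brute force; requires length >= 2."""
    if length < 2 or limit < 2:
        raise ValueError("need length >= 2 and limit >= 2")
    is_prime = sieve(limit)
    found = []
    for start in range(2, limit + 1):
        if not is_prime[start]:
            continue
        max_step = (limit - start) // (length - 1)
        for step in range(1, max_step + 1):
            if all(is_prime[start + i * step] for i in range(1, length)):
                found.append((start, step))
    return found
```

With `limit=30` and `length=5`, the only progression found is 5, 11, 17, 23, 29, that is, `(5, 6)`.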

4.
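Goodman proved that for a constant-length substitution system on two symbols, if the words assigned to 0 and 1 by the substitution rule differ in exactly one position, then the corresponding substitution system is null, that is, its sequence entropy along every sequence of positive integers is 0. In this paper, exploiting the structural features of the system and examining its factor systems, we give another proof of this classical result. We also obtain an estimate, sharper than Goodman's, of the complexity of such substitution systems along a given sequence.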

5.
In this paper, we study the combinatorial properties of words in discrete dynamical systems from antisymmetric cubic maps. We also discuss the relationship of primitive kneading sequences of length n and period-doubling kneading sequences of length 2n, and then determine the number of all kneading sequences of length n.

6.
We prove the Hölder condition for the sums of series obtained by the substitution of uniformly distributed sequences for random variables in the Fourier expansion of a Wiener process.

7.
In this paper, we provide the complete characterization of integer sequences that are characteristic sequences for general non-associative algebras, i.e., we determine the set of combinatorial properties which hold for all characteristic sequences and construct corresponding algebras for integer sequences satisfying them. The obtained information on characteristic sequences is then applied to investigate the realizability problem for the length function. In particular, we determine a certain segment of values which are not realizable as values of the length function.

8.
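Zhang Guohua (张国华), Kuang Rui (匡锐), Ye Xiangdong (叶向东). Acta Mathematica Sinica (《数学学报》), 2005, 48(5): 833-840
A system is called null if its sequence entropy is zero along every sequence. Constant-length substitutions on two symbols and their associated substitution minimal systems fall into three classes: finite, discrete, and continuous. It is easy to see that discrete substitution minimal systems are null, and Goodman proved that continuous substitution minimal systems are not null. This paper completely characterizes the sequence entropy of all constant-length substitution minimal systems on two symbols.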

9.
A time series model based on the global structure of the complete genome is proposed. Three kinds of length sequences of the complete genome are considered. The correlation dimensions and Hurst exponents of the length sequences are calculated. Using these two exponents, some interesting results related to the classification and evolutionary relationships of bacteria are obtained.

10.
For the construction of multiwavelets, no unified, explicit formulas like those in the scalar case are available so far. In this paper, by studying the relationship between length 3 and length 4 filter sequences of orthogonal multiwavelets based on the result of Chui for the construction of length 3 orthogonal multiwavelets, a set of explicit formulas is given for the construction of length 4 high-pass filter sequences. Examples demonstrate that our proposed approach not only provides explicit formulas for the construction of length 4 high-pass filter sequences with multiplicity r, but also yields a set of new low-pass and high-pass filter sequences with length 3 and multiplicity 2r.

11.
Bioinformatics, the discipline which studies the computational problems arising from molecular biology, poses many interesting problems to the string searching community. We will describe two problems arising from Bioinformatics, their preliminary solutions, and the more general problem that they pose. The first problem is searching for α-helices in protein sequences. This particular instance of the search is based on matching of hydrophobicity/hydrophilicity. We find an algorithm which is linear in the sequence length for fixed helix length and is O(n log n) for any helix length. The second problem is on matching probabilistic sequences against sequences or against other probabilistic sequences. In both cases we derive efficient formulas to compute scores according to a Markovian model of evolution.

12.
Very odd sequences were introduced in 1973 by Pelikán who conjectured that there were none of length 5. This conjecture was disproved first by MacWilliams and Odlyzko [17] in 1977 and then by two different sets of authors in 1992 [1], 1995 [9]. We give connections with duadic codes, cyclic difference sets, levels (Stufen) of cyclotomic fields, and derive some new asymptotic results on the length of very odd sequences and the number of such sequences of a given length.

13.
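A fault diagnosis method based on partially observable Petri nets is used to diagnose faults involving unobservable events and unobservable operating states in substation transmission systems. First, the observed sequence of the system is decomposed into basic observation sequences of length 1, and linear inequalities in matrix form are used to compute the set of firing sequences consistent with each basic observation sequence. Then, based on an integer linear programming problem, forward and backward functions are used to widen the diagnosis interval, while a parameter K bounds the length of the fault diagnosis sequence; the diagnosis result is obtained by analyzing the observable events and the partially observable states of the system. Finally, a partially observable Petri net model of a substation transmission system is constructed and the proposed algorithm is applied to diagnose it; the results correctly indicate whether a fault occurred and where it occurred. The algorithm is suitable for online fault diagnosis, and its computational complexity is linear in the length of the observed sequence.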

14.
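We consider sequences equivalent to short sequences in cyclic groups of prime order and, in some cases, give an upper bound on the index of a sequence.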

15.
The following is proved: (1) There exists an infinite binary sequence having no triple repetitions, and having no repetitions of length 4 or greater. (2) Binary sequences having no triple repetitions and having no repetitions of length 3 or greater are finite. (3) Infinite binary sequences that have no identical overlapping blocks have arbitrarily long identical adjacent blocks.

16.
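Using protein statistical information and the mapping between amino acids and hydrophobicity levels, we propose HB62, a hydrophobicity-based substitution matrix, to address the problem of computing the similarity of protein hydrophobicity-level sequences. On the CB513 dataset, the similarity between proteins is computed with both Blosum62 and HB62; the results of the two methods are consistent, which supports the correctness and effectiveness of HB62. The design of HB62 greatly simplifies the computation of similarity between protein hydrophobicity-level sequences, effectively reduces the complexity of prediction algorithms, improves prediction accuracy, and advances the theory of protein hydrophobicity.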

17.
Sequences of integers defined by a quadratic congruential formula are divided into non-overlapping subsequences of length d. The structure of the set of the resulting points in the d-dimensional Euclidean space R^d is studied. The analysis is restricted to the case of sequences with maximal period length since such sequences are of special interest in connection with pseudorandom number generation.
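
The construction in the abstract above can be sketched as follows: generate integers with a quadratic congruential recurrence x_{n+1} = (a·x_n² + b·x_n + c) mod m, then cut the sequence into non-overlapping blocks of length d and scale each block into the unit cube. The parameter values used in the usage note (a=2, b=3, c=1, m=16) are illustrative assumptions, chosen only because this small example visits all 16 residues; the function names are hypothetical.

```python
def quadratic_congruential(a: int, b: int, c: int, m: int, seed: int, count: int) -> list:
    """First `count` terms of x_{n+1} = (a*x_n**2 + b*x_n + c) mod m."""
    x = seed % m
    values = []
    for _ in range(count):
        values.append(x)
        x = (a * x * x + b * x + c) % m
    return values


def nonoverlapping_points(values: list, d: int, m: int) -> list:
    """Split values into consecutive, non-overlapping d-tuples scaled to [0, 1)^d.

    Trailing values that do not fill a complete tuple are dropped.
    """
    usable = len(values) - len(values) % d
    return [tuple(v / m for v in values[i:i + d]) for i in range(0, usable, d)]
```

For instance, `quadratic_congruential(2, 3, 1, 16, 0, 16)` yields each residue 0 to 15 exactly once, and `nonoverlapping_points(..., d=2, m=16)` turns those 16 values into 8 points in the unit square.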

18.
The remainders about two intervals of equal length are related by the Bohl lemma when the sequences are the () ones. In this paper, we prove a similar result for the generalised van der Corput sequences, and, as a consequence, we get the asymptotic behaviour of the remainder about any interval, by means of its length only.
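
To make the notion of "remainder about an interval" concrete, here is a minimal sketch using the classical base-b van der Corput sequence (the radical inverse of n). The generalised sequences studied in the paper, which involve digit permutations, are not implemented. The remainder is taken here as the number of the first N points falling in [lo, hi) minus N·(hi − lo); indexing the sequence from n = 0 is an assumption.

```python
def van_der_corput(n: int, base: int = 2) -> float:
    """Radical inverse of n in the given base: reflect the digits about the point."""
    value, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        value += digit / denom
    return value


def remainder(n_points: int, lo: float, hi: float, base: int = 2) -> float:
    """Count of the first n_points terms in [lo, hi) minus the expected n_points*(hi-lo)."""
    hits = sum(lo <= van_der_corput(i, base) < hi for i in range(n_points))
    return hits - n_points * (hi - lo)
```

In base 2 the sequence starts 0, 0.5, 0.25, 0.75, 0.125, and `remainder(4, 0.0, 0.5)` is 0 because exactly two of the first four points fall in [0, 0.5).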

19.
The pseudo-randomness and complexity of binary sequences generated by chaotic systems are investigated in this paper. These chaotic binary sequences can have the same pseudo-randomness and complexity as the chaotic real sequences that are transformed into them by Kohda’s quantification algorithm. Statistical tests, the correlation function, spectral analysis, Lempel–Ziv complexity and approximate entropy are used as quantitative measures to characterize the pseudo-randomness and complexity of these binary sequences. The experimental results show that finite binary sequences generated by the chaotic systems have good pseudo-randomness and complexity. However, the pseudo-randomness and complexity of a sequence do not grow with its length. On the contrary, they steadily decrease as the sequence length increases, as measured by approximate entropy and the statistical tests. Limited computational precision is a fundamental cause of this problem. Therefore, in existing computer systems, only the shorter binary sequences generated by chaotic systems are suitable for modern cryptography, unless sequence complexity is increased by other means.
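
One of the measures named in the abstract above, Lempel–Ziv complexity, can be sketched as the number of phrases in the Lempel–Ziv (1976) exhaustive-history parsing. The binary sequence generator below simply thresholds the logistic map at 0.5; this is an illustrative stand-in and an assumption, not Kohda's quantification algorithm, and the function names are hypothetical.

```python
def lempel_ziv_complexity(s: str) -> int:
    """Number of phrases in the Lempel-Ziv (1976) parsing of s.

    Each phrase is the shortest substring starting at the current position
    that has not occurred earlier (overlap with the phrase itself allowed).
    """
    i, count, n = 0, 0, len(s)
    while i < n:
        length = 1
        # Extend the phrase while it can be copied from the preceding text.
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count


def logistic_bits(x0: float, n: int, r: float = 4.0) -> str:
    """Binary string from the logistic map x -> r*x*(1-x), thresholded at 0.5."""
    bits = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        bits.append("1" if x >= 0.5 else "0")
    return "".join(bits)
```

The string `0001101001000101` parses as 0 · 001 · 10 · 100 · 1000 · 101, giving a complexity of 6. Applying `lempel_ziv_complexity` to `logistic_bits(x0, n)` for growing `n` is one way to explore how the measure behaves with sequence length under floating-point arithmetic.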

20.
This paper concerns the longest common subsequence (LCS) shared by two sequences (or strings) of length N, whose elements are chosen at random from a finite alphabet. The exact distribution and the expected value of the length of the LCS, k say, between two random sequences is still an open problem in applied probability. While the expected value E(N) of the length of the LCS of two random strings is known to lie within certain limits, the exact value of E(N) and the exact distribution are unknown. In this paper, we calculate the length of the LCS for all possible pairs of binary sequences from N=1 to 14. The length of the LCS and the Hamming distance are represented in color on two all-against-all arrays. An iterative approach is then introduced in which we determine the pairs of sequences whose LCS lengths increased by one upon the addition of one letter to each sequence. The pairs whose score did increase are shown in black and white on an array, which has an interesting fractal-like structure. As the sequence length increases, R(N) (the proportion of sequences whose score increased) approaches the Chvátal–Sankoff constant a_c (the proportionality constant for the linear growth of the expected length of the LCS with sequence length). We show that R(N) is converging more rapidly to a_c than E(N)/N.
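
The exhaustive computation described in the abstract above can be sketched for small N with the textbook dynamic program for LCS length, averaged over all pairs of binary strings. This is a straightforward illustration, not the authors' implementation; the function names are hypothetical, and the all-pairs loop grows as 4^N, so it is only practical for small N.

```python
from itertools import product


def lcs_length(a, b) -> int:
    """Length of the longest common subsequence, using O(len(b)) memory."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def mean_lcs_binary(n: int) -> float:
    """Exact E(N): mean LCS length over all ordered pairs of binary strings of length n."""
    seqs = list(product("01", repeat=n))
    total = sum(lcs_length(a, b) for a in seqs for b in seqs)
    return total / (len(seqs) ** 2)
```

For N = 1 and N = 2 this gives E(1) = 0.5 and E(2) = 1.125; dividing `mean_lcs_binary(n)` by `n` gives the ratio E(N)/N that the abstract compares with R(N).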
