Similar literature
 18 similar articles found (search time: 593 ms)
1.
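杜世平 (Du Shiping), 《大学数学》 (College Mathematics), 2004, 20(5): 24-29
The hidden Markov model (HMM) is a stochastic model that captures the statistical properties of a real process well from observable data. It has been applied successfully to speech recognition and is now used in bioinformatics, where it has been widely applied to biological sequence analysis. This paper first introduces the basic structure of HMMs and then focuses on their application to biological sequence analysis tasks such as multiple alignment of DNA sequences and gene finding.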

2.
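Algorithms for counting DNA sequence alignments   Cited by: 1 (self-citations: 0; by others: 1)
徐琛梅 (Xu Chenmei), 刘晓杰 (Liu Xiaojie), 《大学数学》 (College Mathematics), 2008, 24(1): 100-103
Biological sequence alignment is a central topic in bioinformatics. In [1], the author used the theory of difference equations to derive a formula for the number of alignments between two DNA sequences, but the derivation is rather cumbersome. This paper uses generating functions, a counting tool from combinatorics, to give a simpler and more elegant method. On that basis it excludes non-biological alignments and obtains a refined formula, which narrows the range of alignments that need to be examined.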

3.
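The rapid development of high-throughput sequencing has brought the era of big data to bioinformatics. While the new technology provides massive amounts of genetic information, it also poses new challenges for analyzing these data. DNA sequence alignment is a key step in the analysis pipeline, supplying the alignment information needed for downstream variant detection. Problem B of the 2015 "Shenzhen Cup" Mathematical Modeling Summer Camp took DNA sequence alignment as its topic and asked participating students to propose the best scheme for fast sequence alignment. This paper briefly comments on the teams' solutions and then introduces the algorithms and data structures used in existing DNA sequence alignment software.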

4.
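Some mathematical methods in biological sequence analysis   Cited by: 1 (self-citations: 0; by others: 1)
Biological sequences comprise nucleic acid sequences built from 4 nucleotides and protein sequences built from 20 amino acids. The paper introduces counting methods, composition analysis methods, and hidden Markov model methods used in the study of biological sequences, along with some of their applications.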

5.
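卢国祥 (Lu Guoxiang), 《应用数学》 (Mathematica Applicata), 2012, 25(2): 389-395
Using exhaustive enumeration and generating functions from combinatorics, the paper obtains a series of expressions for the number of alignments of two sequences in the alignment space, and estimates upper and lower bounds on that number.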

6.
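Subgraph isomorphism algorithms in graph pattern mining   Cited by: 1 (self-citations: 0; by others: 1)
Graph pattern mining is widely applied in Web mining, bioinformatics, social relationship analysis, and many other fields. It involves subgraph search and subgraph isomorphism, both of which have high computational complexity. Existing approaches to subgraph isomorphism mostly use the minimum-code algorithm, but that algorithm is inefficient on unlabeled graphs, especially unlabeled undirected graphs, so subgraph isomorphism becomes a bottleneck of graph pattern mining. For unlabeled graphs, two subgraph isomorphism algorithms are constructed on the basis of algebraic theory, using degree sequences and eigenvalues respectively, to test isomorphism of directed and undirected graphs. Simulation experiments on two real biological networks show that the algorithms are more efficient than existing ones.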

7.
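Changes in DNA sequence caused by single nucleotide polymorphisms (SNPs) give rise to the diversity of chromosomal genomes across the living world, and in-depth study of SNPs is important for identifying associations between human genetic phenotypes and disease. Selecting a tag SNP set is a key problem in bioinformatics: the genetic information represented by a small number of tag SNPs can greatly reduce the cost of genotyping and genome-wide association studies. This paper describes SNP theory and methods for selecting tag SNP sets in detail, and briefly analyzes applications of tag SNPs and directions for future research.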

8.
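Recurrence relations are widely used not only in mathematics but also in the design and analysis of computer algorithms. In studying the number of possible alignments between two DNA sequences, a recurrence satisfied by that number is obtained. This recurrence is then generalized to a class of two-index recurrence models with four parameters. Using generating functions, explicit solutions for this class of recurrence models are given.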
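The abstract does not state the recurrence itself. As a minimal sketch, assume the textbook convention in which each alignment column is either a pair of residues, a gap in the first sequence, or a gap in the second. The count f(m, n) for sequences of lengths m and n then satisfies f(m, n) = f(m-1, n) + f(m, n-1) + f(m-1, n-1) with f(m, 0) = f(0, n) = 1, which gives the Delannoy numbers. The function names below are hypothetical, and the paper's four-parameter generalization is not reproduced.

```python
from math import comb


def count_alignments(m, n):
    """Count alignments of sequences of lengths m and n by dynamic programming.

    Assumes the standard three-way recurrence (pair, gap in first sequence,
    gap in second sequence); this is an assumption, not the paper's model.
    """
    # table[i][j] holds f(i, j); the first row and column are all 1.
    table = [[1] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            table[i][j] = table[i - 1][j] + table[i][j - 1] + table[i - 1][j - 1]
    return table[m][n]


def count_alignments_closed_form(m, n):
    """Closed form of the same count: sum over k of C(m,k) * C(n,k) * 2^k."""
    return sum(comb(m, k) * comb(n, k) * 2 ** k for k in range(min(m, n) + 1))
```

Under this assumption, the dynamic program and the closed form agree, which is a convenient check when experimenting with variants of the recurrence.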

9.
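涂俐兰 (Tu Lilan), 《数学杂志》 (Journal of Mathematics), 2006, 26(1): 67-70
For pairwise alignment of DNA sequences, this paper proposes a new method based on the fast Walsh transform. Computational simulations show that the time and space complexity of alignment are markedly reduced.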
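The abstract does not say how the transform is applied to alignment, so the following is only a sketch of the transform itself: a generic, unnormalized fast Walsh–Hadamard transform in natural (Hadamard) ordering. The function name `fwht` and the choice of ordering are assumptions made for illustration.

```python
def fwht(values):
    """Return the unnormalized fast Walsh-Hadamard transform of values.

    The input length must be a power of two. This is a generic butterfly
    implementation, not the alignment method described in the paper.
    """
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Combine blocks of size h pairwise with sum/difference butterflies.
        for start in range(0, n, 2 * h):
            for k in range(start, start + h):
                x, y = a[k], a[k + h]
                a[k], a[k + h] = x + y, x - y
        h *= 2
    return a
```

The transform takes O(n log n) additions and is its own inverse up to a factor of n, so applying `fwht` twice returns the input scaled by its length.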

10.
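Hidden Markov models and their application to gene identification   Cited by: 2 (self-citations: 0; by others: 2)
Bioinformatics is an emerging interdisciplinary field, and the hidden Markov model is a mathematical model widely used in it. The paper briefly introduces the mathematical principles of hidden Markov models and illustrates their application to gene identification using E. coli and human genes as examples.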

11.
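刘治平 (Liu Zhiping), 《运筹学学报》 (Operations Research Transactions), 2021, 25(3): 173-182
With the development of high-throughput technologies, ever more biomedical omics data need to be processed and analyzed, and bioinformatics methods based on operations research and optimization are one of the important ways to analyze high-dimensional biomedical data effectively. This paper reviews recent progress in gene regulatory network inference. For different types of transcriptomic data and different research aims, corresponding inference methods have been established. These mainly include the construction of prior gene regulatory network databases, causal network inference based on conditional mutual information, dynamic gene regulatory network inference based on differential equations, network inference for the cooperative action of transcriptional and post-transcriptional regulation, and evaluation of gene regulatory network activity. The paper also looks ahead to important research directions in gene regulatory network inference.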

12.
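HMM-based identification of CpG island locations   Cited by: 1 (self-citations: 0; by others: 1)
The hidden Markov process is a statistical method proposed in the 1970s. It was formerly used mainly in speech recognition; in 1989 Churchill introduced it into computational biology, and HMMs are now among the more widely used statistical methods in bioinformatics. This paper gives a concise description of Markov processes and HMMs and an overview of their application to locating CpG islands.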
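As a rough illustration of how an HMM can label CpG-rich regions, the sketch below runs Viterbi decoding on a toy two-state model ("island" and "background"). The model structure and every probability are made-up assumptions for illustration, not values from the paper; published CpG island models typically use more states.

```python
import math

STATES = ("island", "background")
# Illustrative, made-up parameters (assumptions, not taken from the paper).
START = {"island": 0.5, "background": 0.5}
TRANS = {
    "island": {"island": 0.9, "background": 0.1},
    "background": {"island": 0.1, "background": 0.9},
}
EMIT = {
    "island": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "background": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
}


def viterbi(seq):
    """Return the most probable state path for seq under the toy HMM."""
    if not seq:
        return []
    # score[s] is the best log-probability of any path ending in state s.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    back = []  # back[t][s] is the best predecessor of state s at step t + 1
    for ch in seq[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            pointers[s] = prev
            new_score[s] = (
                score[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][ch])
            )
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

Working in log space avoids numerical underflow on long sequences. With these toy parameters, a run of C and G following a run of A and T is decoded as a switch from "background" to "island".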

13.
We propose a new method to impute missing values in mixed data sets. It is based on a principal component method, the factorial analysis for mixed data, which balances the influence of all the variables, continuous and categorical, in the construction of the principal components. Because the imputation uses the principal axes and components, the prediction of the missing values is based on the similarity between individuals and on the relationships between variables. The properties of the method are illustrated via simulations and the quality of the imputation is assessed using real data sets. The method is compared to a recent method (Stekhoven and Bühlmann Bioinformatics 28:113–118, 2011) based on random forest and shows better performance, especially for the imputation of categorical variables and in situations with highly linear relationships between continuous variables.

14.
Wavelet-RKHS-based functional statistical classification   Cited by: 1 (self-citations: 0; by others: 1)
A functional classification methodology, based on the Reproducing Kernel Hilbert Space (RKHS) theory, is proposed for discrimination of gene expression profiles. The parameter function involved in the definition of the functional logistic regression is univocally and consistently estimated by minimizing the penalized negative log-likelihood over an RKHS generated by a suitable wavelet basis. An iterative descent method, the gradient method, is applied to solve the corresponding minimization problem, i.e., to compute the functional estimate. Temporal gene expression data involved in the yeast cell cycle are classified with the wavelet-RKHS-based discrimination methodology considered. A simulation study is developed for testing the performance of this statistical classification methodology in comparison with other statistical discrimination procedures.

15.
Bioinformatics, the discipline which studies the computational problems arising from molecular biology, poses many interesting problems to the string searching community. We will describe two problems arising from Bioinformatics, their preliminary solutions, and the more general problem that they pose. The first problem is searching for α-helices in protein sequences. This particular instance of the search is based on matching of hydrophobicity/hydrophilicity. We find an algorithm which is linear in the sequence length for fixed helix length and is O(n log n) for any helix length. The second problem is on matching probabilistic sequences against sequences or against other probabilistic sequences. In both cases we derive efficient formulas to compute scores according to a Markovian model of evolution.

16.
《Discrete Applied Mathematics》2007,155(6-7):806-830
Phylogenetic networks are models of sequence evolution that go beyond trees, allowing biological operations that are not tree-like. One of the most important biological operations is recombination between two sequences. An established problem [J. Hein, Reconstructing evolution of sequences subject to recombination using parsimony, Math. Biosci. 98 (1990) 185–200; J. Hein, A heuristic method to reconstruct the history of sequences subject to recombination, J. Molecular Evolution 36 (1993) 396–405; Y. Song, J. Hein, Parsimonious reconstruction of sequence evolution and haplotype blocks: finding the minimum number of recombination events, in: Proceedings of 2003 Workshop on Algorithms in Bioinformatics, Berlin, Germany, 2003, Lecture Notes in Computer Science, Springer, Berlin; Y. Song, J. Hein, On the minimum number of recombination events in the evolutionary history of DNA sequences, J. Math. Biol. 48 (2003) 160–186; L. Wang, K. Zhang, L. Zhang, Perfect phylogenetic networks with recombination, J. Comput. Biol. 8 (2001) 69–78; S.R. Myers, R.C. Griffiths, Bounds on the minimum number of recombination events in a sample history, Genetics 163 (2003) 375–394; V. Bafna, V. Bansal, Improved recombination lower bounds for haplotype data, in: Proceedings of RECOMB, 2005; Y. Song, Y. Wu, D. Gusfield, Efficient computation of close lower and upper bounds on the minimum number of needed recombinations in the evolution of biological sequences, Bioinformatics 21 (2005) i413–i422. Bioinformatics (Suppl. 1), Proceedings of ISMB, 2005; D. Gusfield, S. Eddhu, C. Langley, Optimal, efficient reconstruction of phylogenetic networks with constrained recombination, J. Bioinform. Comput. Biol. 2(1) (2004) 173–213; D. Gusfield, Optimal, efficient reconstruction of root-unknown phylogenetic networks with constrained and structured recombination, J. Comput. Systems Sci. 70 (2005) 381–398] is to find a phylogenetic network that derives an input set of sequences, minimizing the number of recombinations used. No efficient, general algorithm is known for this problem. Several papers consider the problem of computing a lower bound on the number of recombinations needed. In this paper we establish a new, efficiently computed lower bound. This result is useful in methods to estimate the number of needed recombinations, and also to prove the optimality of algorithms for constructing phylogenetic networks under certain conditions [D. Gusfield, S. Eddhu, C. Langley, Optimal, efficient reconstruction of phylogenetic networks with constrained recombination, J. Bioinform. Comput. Biol. 2(1) (2004) 173–213; D. Gusfield, Optimal, efficient reconstruction of root-unknown phylogenetic networks with constrained and structured recombination, J. Comput. Systems Sci. 70 (2005) 381–398; D. Gusfield, Optimal, efficient reconstruction of root-unknown phylogenetic networks with constrained recombination, Technical Report, Department of Computer Science, University of California, Davis, CA, 2004]. The lower bound is based on a structural, combinatorial insight, using only the site conflicts and incompatibilities, and hence it is fundamental and applicable to many biological phenomena other than recombination, for example, when gene conversions or recurrent or back mutations or cross-species hybridizations cause the phylogenetic history to deviate from a tree structure. In addition to establishing the bound, we examine its use in more complex lower bound methods, and compare the bounds obtained to those obtained by other established lower bound methods.
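The abstract does not define site incompatibility or give the bound itself. One common definition, assumed here, is the four-gamete test on binary sequences: two sites are incompatible when all four combinations 00, 01, 10, and 11 occur across the sample. The sketch below only lists incompatible site pairs; the function name is hypothetical and this is not the paper's lower bound.

```python
from itertools import combinations


def incompatible_pairs(haplotypes):
    """List site pairs (i, j), i < j, that fail the four-gamete test.

    haplotypes is a non-empty list of equal-length strings over '0' and '1'.
    The four-gamete definition is an assumption; the abstract names site
    conflicts and incompatibilities without defining them.
    """
    n_sites = len(haplotypes[0])
    pairs = []
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct two-site states seen in the sample.
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            pairs.append((i, j))
    return pairs
```

Under this definition, a sample with no incompatible pairs can be explained by a tree with one mutation per site, which is why pairwise incompatibilities are a natural starting point for recombination lower bounds.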

17.
Fixing the levels of inputs and process variables in order to meet a required specification of output is a common quality control problem. However, difficulties can arise when the output has a number of characteristics and when each of these characteristics has to satisfy a specification. Such a problem was met in a paper manufacturing factory and the problem was solved using Goal Programming (GP). The method can be applied to other process control problems and this paper gives the GP formulation of the general process control problem. The paper also gives details of the case study from which the method was developed.

18.
The maximum clique problem is an important problem in graph theory. Many real-life problems are still being mapped into this problem for their effective solutions. A natural extension of this problem, which has emerged very recently in many real-life networks, is its fuzzification. The problem of finding the maximum fuzzy clique has been formalized on fuzzy graphs and subsequently addressed in this paper. It has been shown here that the problem reduces to an unconstrained quadratic 0–1 programming problem. Using a maximum neural network, along with the mutation capability of genetic adaptive systems, the reduced problem has been solved. Empirical studies have been done by applying the method to stock flow graphs to identify the collusion set, which contains a group of traders performing unfair trading among themselves. Additionally, it has been applied to a gene co-expression network to find significant gene modules, and to some benchmark graphs.
