首页 | 本学科首页   官方微博 | 高级检索  
     检索      


An exact and polynomial distance-based algorithm to reconstruct single copy tandem duplication trees
Authors:Olivier Elemento  Olivier Gascuel  
Institution:

aLewis-Sigler Institute for Integrative Genomics, Princeton University, 08544, Princeton, NJ, USA

bEquipe Méthodes et Algorithmes pour la Bioinformatique, LIRMM, 161 rue Ada, 34392, Montpellier, France

Abstract:The problem of reconstructing the duplication tree of a set of tandemly repeated sequences which are supposed to have arisen by unequal recombination, was first introduced by Fitch (1977), and has recently received a lot of attention. In this paper, we place ourselves in a distance framework and deal with the restricted problem of reconstructing single copy duplication trees. We describe an exact and polynomial distance based algorithm for solving this problem, the parsimony version of which has previously been shown to be NP-hard (like most evolutionary tree reconstruction problems). This algorithm is based on the minimum evolution principle, and thus involves selecting the shortest tree as being the correct duplication tree. After presenting the underlying mathematical concepts behind the minimum evolution principle, and some of its benefits (such as statistical consistency), we provide a new recurrence formula to estimate the tree length using ordinary least-squares, given a matrix of pairwise distances between the copies. We then show how this formula naturally forms the dynamic programming framework on which our algorithm is based, and provide an implementation in O(n3) time and O(n2) space, where n is the number of copies.
Keywords:Evolution reconstructions  Phylogenetic trees  Duplication trees  Tandemly repeated sequences  Gene families  Distance-based methods  Least-squares  Minimum evolution principle  Exact algorithm  Dynamic programming
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号