首页 | 本学科首页   官方微博 | 高级检索  
     检索      


Computing the distribution of the Robinson-Foulds distance
Institution:Simon Fraser University, Department of Computing Science, add8888 University Avenue, Burnaby, BC V5A 1S6, Canada
Abstract:With the exponential growth of genome databases, the importance of phylogenetics has increased dramatically over the past years. Studying phylogenetic trees enables us not only to understand how genes, genomes, and species evolve, but also helps us predict how they might change in future. One of the crucial aspects of phylogenetics is the comparison of two or more phylogenetic trees. There are different metrics for computing the dissimilarity between a pair of trees. The Robinson-Foulds (RF) distance is one of the widely used metrics on the space of labeled trees. The distribution of the RF distance from a given tree has been studied before, but the fastest known algorithm for computing this distribution is a slow, albeit polynomial-time, O(l5) algorithm. In this paper, we modify the dynamic programming algorithm for computing the distribution of this distance for a given tree by leveraging the number-theoretic transform (NTT), and improve the running time from O(l5) to O(l3 log l), where l is the number of tips of the tree. In addition to its practical usefulness, our method represents a theoretical novelty, as it is, to our knowledge, one of the rare applications of the number-theoretic transform for solving a computational biology problem.
Keywords:RF distance  Phylogenetics  Number-theoretic transform  Convolution  Null distribution
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号