Haplotype assembly from aligned weighted SNP fragments |
| |
Authors: | Zhao Yu-Ying, Wu Ling-Yun, Zhang Ji-Hong, Wang Rui-Sheng, Zhang Xiang-Sun |
| |
Affiliation: | Institute of Applied Mathematics, Academy of Mathematics and Systems Science, CAS, Beijing 100080, China. zhyuying@amss.ac.cn |
| |
Abstract: | Given an assembled genome of a diploid organism, the haplotype assembly problem can be formulated as the retrieval of a pair of haplotypes from a set of aligned weighted SNP fragments. Known computational formulations (models) of this problem are minimum letter flips (MLF) and weighted minimum letter flips (WMLF; Greenberg et al. (INFORMS J. Comput. 2004, 14, 211-213)). In this paper we show that the general WMLF model is NP-hard even for the gapless case. However, algorithmic solutions for selected variants of WMLF can exist, and we propose a heuristic algorithm based on a dynamic clustering technique. We also introduce a new formulation of the haplotype assembly problem that we call COMPLETE WMLF (CWMLF). This model and the algorithms for its implementation take into account the simultaneous presence of multiple kinds of data errors. Extensive computational experiments indicate that the algorithmic implementations of the CWMLF model achieve higher accuracy of haplotype reconstruction than the WMLF-based algorithms, which in turn appear to be more accurate than those based on MLF. |
| |
Keywords: | SNP; Haplotype assembly; Minimum letter flips; Dynamic clustering |
This article is indexed in ScienceDirect, PubMed, and other databases. |
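
To make the WMLF objective named in the abstract concrete, the following is a minimal sketch and not the authors' dynamic clustering heuristic. It assumes fragments are strings over `0`, `1` and `-` (gap), that each covered position carries a confidence weight, and that the two haplotypes are complementary (the all-heterozygous case). The names `weighted_distance`, `wmlf_cost` and `brute_force_wmlf` are hypothetical. The search is exhaustive and exponential in the number of SNP sites, so it is only usable on toy inputs; setting every weight to 1 gives the unweighted MLF cost.

```python
from itertools import product

GAP = "-"  # assumed symbol for a position the fragment does not cover


def weighted_distance(fragment, weights, haplotype):
    """Total weight of covered positions where the fragment disagrees with the haplotype."""
    return sum(
        w
        for f, w, h in zip(fragment, weights, haplotype)
        if f != GAP and f != h
    )


def wmlf_cost(fragments, weights, h1, h2):
    """Weighted letter-flip cost: each fragment is charged against its closer haplotype."""
    return sum(
        min(weighted_distance(f, w, h1), weighted_distance(f, w, h2))
        for f, w in zip(fragments, weights)
    )


def brute_force_wmlf(fragments, weights):
    """Try every complementary haplotype pair; feasible only for a handful of sites."""
    n_sites = len(fragments[0])
    best = None
    for bits in product("01", repeat=n_sites):
        h1 = "".join(bits)
        h2 = "".join("1" if b == "0" else "0" for b in bits)
        cost = wmlf_cost(fragments, weights, h1, h2)
        if best is None or cost < best[0]:
            best = (cost, h1, h2)
    return best


# Toy instance: four clean fragments and one fragment with a low-confidence error.
fragments = ["010-", "-101", "1010", "10-0", "0111"]
weights = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
    [0.9, 0.9, 0.2, 0.9],  # third call is the least trusted one
]
cost, h1, h2 = brute_force_wmlf(fragments, weights)
```

In this toy instance the only disagreement with the pair `0101`/`1010` is the low-confidence third call of the last fragment, so the weighted cost charges 0.2 for it, whereas the unweighted MLF cost would charge a full flip.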