PECC: Correcting contigs based on paired-end read distribution
Institution: 1. College of Veterinary Medicine, Hunan Agricultural University, Changsha City 410128, China; 2. Department of Agricultural, Food and Nutritional Science, 4-10 Ag/For Center, University of Alberta, Edmonton T6G2P5, Canada
Abstract: Motivation: Cheap and fast next generation sequencing (NGS) technologies greatly facilitate research on de novo assembly. Reliable contigs are critical for constructing reliable scaffolds. However, the contigs generated by most assemblers contain errors because of limitations in assembly strategy and computational complexity. Among these errors, misassembly is one of the most harmful types. Results: In this paper, we propose a new method named “PECC” to identify and correct misassembly errors in contigs based on the paired-end read distribution. PECC extracts sequence regions with low paired-end read support and verifies them based on the distribution of paired-end support. To validate the effectiveness of PECC, we applied it to the contigs produced by five popular assemblers on four real datasets, and we also carried out experiments to analyze the influence of PECC on scaffolding. The results show that PECC can reduce misassembly errors and improve scaffolding results, which demonstrates the promising applications of PECC in de novo assembly.
Keywords: Next generation sequencing; De novo assembly; Contigs; Paired-end reads
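
The abstract only states that PECC extracts regions with low paired-end read support; it does not say how support is measured or thresholded. The following is a minimal illustrative sketch of that general idea, not the PECC implementation. It assumes that support at a position is the number of read pairs whose fragment spans it, and that "low" means below a fixed fraction of the contig's median support. The function names, the `ratio` parameter, and the coordinate convention are all hypothetical.

```python
from statistics import median


def spanning_coverage(contig_len, pairs):
    """Count, for each contig position, the read pairs whose fragment spans it.

    `pairs` holds (fragment_start, fragment_end) tuples in 0-based,
    half-open contig coordinates (an assumed input format).
    """
    diff = [0] * (contig_len + 1)
    for start, end in pairs:
        diff[start] += 1   # fragment begins covering here
        diff[end] -= 1     # fragment stops covering here
    coverage = []
    running = 0
    for delta in diff[:contig_len]:
        running += delta
        coverage.append(running)
    return coverage


def low_support_regions(coverage, ratio=0.2):
    """Return half-open (start, end) runs whose support is below ratio * median.

    The median-based threshold is an assumption made for illustration only.
    """
    threshold = ratio * median(coverage)
    regions = []
    run_start = None
    for pos, depth in enumerate(coverage):
        if depth < threshold:
            if run_start is None:
                run_start = pos
        elif run_start is not None:
            regions.append((run_start, pos))
            run_start = None
    if run_start is not None:
        regions.append((run_start, len(coverage)))
    return regions


# Toy contig: two well-supported blocks with no pair spanning positions 10-11.
toy_pairs = [(0, 10)] * 5 + [(12, 20)] * 5
toy_coverage = spanning_coverage(20, toy_pairs)
candidates = low_support_regions(toy_coverage)
```

In this toy input, the gap between the two blocks is the only candidate region. On real data, contig ends also tend to have low spanning support, so candidates would still need a separate verification step such as the distribution-based check the abstract mentions.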
This article is indexed in ScienceDirect and other databases.