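Interactive Data Migration System and Its Approximately-detecting Efficiency Optimization
Cite this article: Chen Wei, Ding Qiu-lin, Xie Qiang. Interactive Data Migration System and Its Approximately-detecting Efficiency Optimization[J]. Journal of South China University of Technology (Natural Science Edition), 2004, 32(2): 58-61
Authors: Chen Wei, Ding Qiu-lin, Xie Qiang
Affiliation (all three authors): Institute of Computer Applications, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, Jiangsu, China
Abstract: To ensure the data quality of the new system after data migration, data cleaning is applied to the migration process, and an interactive data migration system with integrated data cleaning is proposed and its working principle analyzed. To improve the efficiency of detecting approximately duplicate records in this system, length filtering and related methods are used to optimize the similarity detection algorithm. This avoids unnecessary edit distance computations and thereby increases the migration speed of the whole system. In addition, a suitable experimental environment was built and a large number of detection experiments were carried out; the experimental results confirm the soundness of the length filtering method.

Keywords: data migration  data quality  data cleaning  similarity detection  length filtering
Article ID: 1000-565X(2004)02-0058-04
Revised: 2003-06-18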

Interactive Data Migration System and Its Approximately-detecting Efficiency Optimization
Chen Wei, Ding Qiu-lin, Xie Qiang. Interactive Data Migration System and Its Approximately-detecting Efficiency Optimization[J]. Journal of South China University of Technology (Natural Science Edition), 2004, 32(2): 58-61
Authors: Chen Wei, Ding Qiu-lin, Xie Qiang
Abstract: Data cleaning technology was used to ensure the data quality of the system after data migration, and an interactive data migration system combined with data cleaning was proposed. The working principle of the system was then analyzed. By using the length filtration method in approximately-duplicated-record detection, the approximately-detecting algorithm was optimized to improve the detection efficiency. As a result, unnecessary edit distance computation was avoided, which improved the data migration speed of the whole interactive data migration system. Furthermore, an appropriate experimental environment was created and a large number of detection experiments were carried out. The experimental results support the rationality of the length filtration method.
Keywords: data migration  data quality  data cleaning  approximately-detecting  length filtration
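
The abstract does not spell out the filter itself, so the following is only a minimal sketch of one common form of length filtering, not the authors' implementation. It relies on the fact that the edit distance between two strings is never smaller than the difference of their lengths, so any pair whose length difference already exceeds the threshold can be rejected without running the dynamic program. The names `edit_distance`, `find_similar_pairs`, and `max_dist`, and the choice to compare whole records as single strings, are assumptions made for illustration.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def find_similar_pairs(records, max_dist):
    """Return (pairs, skipped): pairs within max_dist edits, and how many
    comparisons the length filter rejected before any distance computation.

    Hypothetical helper: the threshold and the pairwise scan are assumptions.
    """
    pairs = []
    skipped = 0
    for a, b in combinations(records, 2):
        # Length filter: |len(a) - len(b)| is a lower bound on the edit
        # distance, so a larger gap than max_dist cannot be a match.
        if abs(len(a) - len(b)) > max_dist:
            skipped += 1
            continue
        if edit_distance(a, b) <= max_dist:
            pairs.append((a, b))
    return pairs, skipped


if __name__ == "__main__":
    sample = ["Nanjing", "Nanjng", "Nanjing University", "Beijing"]
    matches, skipped = find_similar_pairs(sample, max_dist=1)
    print(matches, skipped)
```

The filter costs one subtraction per pair, while the edit distance costs time proportional to the product of the two lengths, so the saving grows with the number of record pairs whose lengths differ widely.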
This article is indexed in databases including CNKI, VIP, and Wanfang Data.