首页 | 本学科首页   官方微博 | 高级检索  
     

基于MapReduce和双层倒排网格索引的kNN算法
引用本文:赵敏超,杜震洪,张丰,刘仁义,李荣亚. 基于MapReduce和双层倒排网格索引的kNN算法[J]. 浙江大学学报(理学版), 2014, 41(6): 703-708. DOI: 10.3785/j.issn.1008-9497.2014.06.016
作者姓名:赵敏超  杜震洪  张丰  刘仁义  李荣亚
作者单位:浙江大学浙江省资源与环境信息系统重点实验室;浙江大学地球科学系
基金项目:国家自然科学基金资助项目(41471313,41101356);国家海洋公益性行业科研专项经费资助项目(2015418003,201305012);浙江省科技攻关计划项目(2013C33051);中央高校基本科研业务费专项(2013QNA3023);国家科技基础性工作专项(2012FY112300)
摘    要:随着卫星定位技术和移动互联网技术的飞速发展,地理空间数据来源变得更加多源异构.面对海量地理空间数据,如何快速有效地找到目标周围的兴趣点变得异常重要.依据空间k近邻(kNN)查询算法,提高效率的关键在数据索引和数据块存储结构设计,通过引入云计算的MapReduce编程模型,设计了一种面向MapReduce的地理空间数据双层倒排网格索引,利用CircularTrip算法实现了目标点近邻查询计算,最终获得距离目标点最邻近的数据点集.实验结果表明,该索引方法较单层倒排网格索引下的kNN查询效率有明显提高,且数据量越大效率提升越明显,此法适合大规模并行计算. 

关 键 词:双层倒排网格索引  k最邻近结点算法  云计算  MapReduce  CircularTrip
收稿时间:2013-12-18

Research of kNN algorithm based on MapReduce and double levels of inverted grid index
ZHAO Minchao;DU Zhenhong;ZHANG Feng;LIU Renyi;LI Rongya. Research of kNN algorithm based on MapReduce and double levels of inverted grid index[J]. Journal of Zhejiang University(Sciences Edition), 2014, 41(6): 703-708. DOI: 10.3785/j.issn.1008-9497.2014.06.016
Authors:ZHAO Minchao  DU Zhenhong  ZHANG Feng  LIU Renyi  LI Rongya
Affiliation:ZHAO Minchao;DU Zhenhong;ZHANG Feng;LIU Renyi;LI Rongya;Zhejiang Provincial Key Laboratory of Resources and Environmental Information System,Zhejiang University;Department of Earth Sciences,Zhejiang University;
Abstract:With the development of satellite positioning and mobile internet technology,the geospatial data become more multi-source and heterogeneous,which consequently makes the obtainment of the interesting points around a target remarkably important.Considering that the key to the efficiency of the kNN algorithm lies in the design of data index and the storage structure of data block,we propose a MapReduce-oriented double inverted grid index for geospatial data.The targets neighbor query calculation is implemented based on the CircularTrip algorithm,and finally the nearest point sets are achieved according to the requirements.The results of the following experiments show that the indexing method not only provides a significant improvement in kNN query efficiency,but also has a good performance under a great amount of data,which consequently fits large-scale parallel computing better.
Keywords:double levels of inverted grid index  kNN  cloud computing  MapReduce  CircularTrip
本文献已被 CNKI 维普 等数据库收录!
点击此处可从《浙江大学学报(理学版)》浏览原始摘要信息
点击此处可从《浙江大学学报(理学版)》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号