
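A Parallel MapReduce PLS Algorithm and Its Application in Spectral Analysis
Cite this article: YANG Hui-hua, DU Ling-ling, LI Ling-qiao, TANG Tian-biao, GUO Tuo, LIANG Qiong-lin, WANG Yi-ming, LUO Guo-an. A parallel MapReduce PLS algorithm and its application in spectral analysis[J]. Spectroscopy and Spectral Analysis, 2012, 32(9): 2399-2404.
Authors: YANG Hui-hua  DU Ling-ling  LI Ling-qiao  TANG Tian-biao  GUO Tuo  LIANG Qiong-lin  WANG Yi-ming  LUO Guo-an
Affiliations: 1. School of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin, Guangxi 541004
2. School of Computer Science and Engineering, Guilin University of Electronic Technology, Guilin, Guangxi 541004
3. Analysis Center, Tsinghua University, Beijing 100084
Funding: National Natural Science Foundation of China; Guangxi Natural Science Foundation; Program for Excellent Talents in Guangxi Higher Education Institutions; Open Fund of the Guangxi Key Laboratory of Trusted Software; Innovation Project of Guangxi Graduate Education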
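Abstract: Partial least squares (PLS) is a commonly used algorithm for spectral modeling, but when massive numbers of spectra must be processed, building and optimizing a model on a single computer is very time-consuming. Based on the MapReduce programming model, a parallel MapReduce PLS regression algorithm is proposed, comprising two procedures: parallel data standardization and parallel principal component extraction. A Hadoop cloud computing cluster was set up on several ordinary computers, and validation experiments were carried out using near-infrared spectral processing as an example. The experimental results show that, for regression modeling on massive near-infrared spectral data sets, the MapReduce-based parallel PLS algorithm effectively increases modeling speed, achieves a near-linear speedup as the number of computers grows, and scales well.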

Keywords: Parallel partial least squares  Near-infrared spectroscopy  MapReduce  Parallel computing  Hadoop  Cloud computing

Parallel PLS Algorithm Using MapReduce and Its Application in Spectral Modeling
YANG Hui-hua, DU Ling-ling, LI Ling-qiao, TANG Tian-biao, GUO Tuo, LIANG Qiong-lin, WANG Yi-ming, LUO Guo-an. Parallel PLS Algorithm Using MapReduce and Its Application in Spectral Modeling[J]. Spectroscopy and Spectral Analysis, 2012, 32(9): 2399-2404.
Authors: YANG Hui-hua  DU Ling-ling  LI Ling-qiao  TANG Tian-biao  GUO Tuo  LIANG Qiong-lin  WANG Yi-ming  LUO Guo-an
Institution: 1. School of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin 541004, China
2. School of Computer Science and Engineering, Guilin University of Electronic Technology, Guilin 541004, China
3. Analysis Center, Tsinghua University, Beijing 100084, China
Abstract: Partial least squares (PLS) is widely used in spectral analysis and modeling, but it is computationally intensive and time-consuming on massive data sets. To address this, a parallel PLS regression algorithm based on MapReduce is proposed. It consists of two procedures: parallel data standardization and parallel principal component extraction. Using NIR spectral modeling as an example, validation experiments were conducted on a Hadoop cluster built from ordinary computers. The results show that the proposed parallel PLS algorithm can handle massive spectral data sets, significantly reduces modeling time, achieves a near-linear speedup as computers are added, and scales well.
Keywords: Parallel partial least squares  Near infrared spectrum  MapReduce  Parallel computing  Hadoop  Cloud computing
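
The abstract names two parallel procedures (data standardization and principal component extraction) without giving their decomposition. As a rough illustration only, the sketch below shows one way such steps can be phrased as map and reduce functions in plain Python: each row block emits per-column partial sums, which are merged to obtain means and standard deviations, and the first PLS weight vector is accumulated as a sum of per-block contributions to X^T y. The function names, the toy data, and the use of `functools.reduce` in place of a Hadoop job are all assumptions for illustration, not the authors' implementation.

```python
from functools import reduce
from math import sqrt

def map_stats(block):
    """Map: per-column (row count, sum, sum of squares) for one row block."""
    n_cols = len(block[0])
    sums, sqs = [0.0] * n_cols, [0.0] * n_cols
    for row in block:
        for j, v in enumerate(row):
            sums[j] += v
            sqs[j] += v * v
    return len(block), sums, sqs

def reduce_stats(a, b):
    """Reduce: merge two partial statistics tuples by element-wise addition."""
    return (a[0] + b[0],
            [x + y for x, y in zip(a[1], b[1])],
            [x + y for x, y in zip(a[2], b[2])])

def finalize(stats):
    """Turn merged sums into column means and sample standard deviations."""
    n, sums, sqs = stats
    means = [s / n for s in sums]
    # Sum-of-squares form: simple, but less numerically stable than Welford.
    stds = [sqrt(max((q - n * m * m) / (n - 1), 0.0))
            for q, m in zip(sqs, means)]
    return means, stds

def standardize(block, means, stds):
    """Map: scale one block with the globally reduced statistics."""
    return [[(v - m) / s if s > 0 else 0.0
             for v, m, s in zip(row, means, stds)] for row in block]

def map_xty(block, y_block):
    """Map: this block's contribution to X^T y."""
    acc = [0.0] * len(block[0])
    for row, y in zip(block, y_block):
        for j, v in enumerate(row):
            acc[j] += v * y
    return acc

def reduce_vec(a, b):
    """Reduce: element-wise sum of two partial vectors."""
    return [x + y for x, y in zip(a, b)]

# Hypothetical toy data: 3 spectra with 2 wavelengths, split into 2 blocks.
x_blocks = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 9.0]]]
y_blocks = [[-1.0, 0.0], [1.0]]  # response values, already mean-centered

means, stds = finalize(reduce(reduce_stats, map(map_stats, x_blocks)))
z_blocks = [standardize(b, means, stds) for b in x_blocks]

# Unnormalized first PLS weight vector: w = X^T y, summed over blocks.
w = reduce(reduce_vec, map(map_xty, z_blocks, y_blocks))
```

Both reductions are associative and commutative sums, which is what allows the partial results to be combined in any order across cluster nodes. Later PLS components would need further passes over the deflated data, which this sketch does not cover.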
This article is indexed in CNKI, Wanfang Data, and other databases.