Near-Infrared Spectra Combining with CARS and SPA Algorithms to Screen the Variables and Samples for Quantitatively Determining the Soluble Solids Content in Strawberry



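Cite this article:	LI Jiang-bo, GUO Zhi-ming, HUANG Wen-qian, ZHANG Bao-hua, ZHAO Chun-jiang. Near-Infrared Spectra Combining with CARS and SPA Algorithms to Screen the Variables and Samples for Quantitatively Determining the Soluble Solids Content in Strawberry[J]. Spectroscopy and Spectral Analysis (光谱学与光谱分析), 2015, 35(2): 372-378.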

Authors:	LI Jiang-bo, GUO Zhi-ming, HUANG Wen-qian, ZHANG Bao-hua, ZHAO Chun-jiang

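Institution:	Beijing Research Center of Intelligent Equipment for Agriculture, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China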

Funding:	Beijing Postdoctoral Research Activity Funds; China Postdoctoral Science Foundation; National Natural Science Foundation of China

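Abstract:	When spectroscopy is used for quantitative or qualitative analysis of fruit, obtaining a simple and effective calibration model is critical for the later application and maintenance of that model. Taking near-infrared (NIR) prediction of the internal quality of strawberry as an example, this study addresses the selection of both key variables and characteristic samples. Competitive adaptive reweighted sampling (CARS) was used for the initial selection of spectral variables. The successive projections algorithm (SPA) was then used to select samples from the calibration set, yielding 98 characteristic samples. SPA was applied again to the resulting variable/sample subset for a second round of key-variable extraction, yielding 25 key variables. To verify the performance of CARS, Monte Carlo uninformative variable elimination (MC-UVE) and SPA were used for comparison. CARS removed collinear information while also eliminating uninformative variables. Likewise, to evaluate SPA for characteristic-sample selection, the classical Kennard-Stone algorithm was used for comparison. SPA proved suitable for selecting characteristic samples from the calibration set. PLS and MLR models were built on the final variable/sample subset (25/98) to quantitatively predict the soluble solids content (SSC) of strawberry. The results show that both models, using 0.59% of the original variables and 65.33% of the original samples, performed better than models built on all of the original variables and samples. The MLR model performed slightly better than the PLS model, with r2_pre, RMSEP and RPD of 0.9097, 0.3484 and 3.3278, respectively.
Keywords:	Variable selection; Sample selection; Near-infrared spectra; Strawberry; Soluble solids
Received:	2013-11-02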
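
The abstract names SPA for both variable and sample selection but does not give implementation details. The following is a minimal sketch of the core projection step of SPA, not the authors' code. The function name `spa_select`, the fixed starting column, and the idea of selecting samples by running the same routine on the transposed spectral matrix are assumptions made for illustration. A full SPA would also choose the starting column and the number of selected columns by validation error, which is omitted here.

```python
import numpy as np

def spa_select(X, n_select, start=0):
    """Return indices of n_select columns of X chosen by successive projections.

    Hypothetical helper for illustration only. Each step projects all columns
    onto the subspace orthogonal to the most recently selected column and then
    picks the column with the largest remaining norm, so the selected columns
    carry little collinear information.
    """
    R = np.asarray(X, dtype=float).copy()  # working copy; columns get deflated
    if not 1 <= n_select <= min(R.shape):
        raise ValueError("n_select must be between 1 and min(X.shape)")
    selected = [start]
    for _ in range(n_select - 1):
        ref = R[:, selected[-1]].copy()
        # Subtract from every column its component along the reference column.
        R -= np.outer(ref, ref @ R) / (ref @ ref)
        norms = np.linalg.norm(R, axis=0)
        norms[selected] = -1.0  # never pick an already selected column again
        selected.append(int(np.argmax(norms)))
    return selected

# Toy matrix: column 1 is nearly collinear with column 0, column 2 is not.
X = np.array([[1.0, 0.9, 0.0],
              [0.0, 0.1, 2.0]])
print(spa_select(X, 2))  # [0, 2]

# Assumption: sample selection can reuse the routine with samples as columns,
# e.g. spa_select(spectra.T, n_samples_to_keep) for a samples-by-wavelengths array.
```

In the toy matrix, the near-duplicate column is skipped because almost nothing of it remains after projecting out column 0.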
Near-Infrared Spectra Combining with CARS and SPA Algorithms to Screen the Variables and Samples for Quantitatively Determining the Soluble Solids Content in Strawberry

LI Jiang-bo, GUO Zhi-ming, HUANG Wen-qian, ZHANG Bao-hua, ZHAO Chun-jiang. Near-Infrared Spectra Combining with CARS and SPA Algorithms to Screen the Variables and Samples for Quantitatively Determining the Soluble Solids Content in Strawberry[J]. Spectroscopy and Spectral Analysis, 2015, 35(2): 372-378.

Authors:	LI Jiang-bo, GUO Zhi-ming, HUANG Wen-qian, ZHANG Bao-hua, ZHAO Chun-jiang

Institution:	Beijing Research Center of Intelligent Equipment for Agriculture, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China

Abstract:	When spectroscopy is used to analyze fruit quality quantitatively or qualitatively, obtaining a simple and effective calibration model is critical for the application and maintenance of the developed model. With strawberry as the research object, this study focused on selecting the key variables and characteristic samples for quantitatively determining the soluble solids content. The competitive adaptive reweighted sampling (CARS) algorithm was first used to select the spectral variables. Samples of the calibration set were then selected by the successive projections algorithm (SPA), and 98 characteristic samples were obtained. Next, based on the selected variables and characteristic samples, a second variable selection was performed with SPA, and 25 key variables were obtained. To verify the performance of CARS, Monte Carlo uninformative variable elimination (MC-UVE) and SPA were used as comparison algorithms for variable selection. The results showed that CARS could eliminate uninformative variables and remove collinear information at the same time. Similarly, to assess the performance of SPA for selecting characteristic samples, SPA was compared with the classical Kennard-Stone algorithm. The results showed that SPA could be used to select characteristic samples in the calibration set. Finally, PLS and MLR models for quantitatively predicting the soluble solids content (SSC) of strawberry were built on the variable/sample subset (25/98). The results show that models built with 0.59% of the original variables and 65.33% of the original samples performed better than models built with all of the original variables and samples. The MLR model was the best, with R2_pre=0.9097, RMSEP=0.3484 and RPD=3.3278.

Keywords:	Variable selection; Sample selection; Near-infrared spectra; Strawberry; Soluble solids content
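
The three figures of merit quoted in the abstract can be related to one another with a short calculation. The sketch below is illustrative only: the function name `prediction_metrics` is hypothetical, and it assumes that R2_pre is the coefficient of determination on the prediction set and that RPD is the standard deviation of the reference values (with denominator n) divided by RMSEP. The abstract does not state which conventions the authors used. Under these assumptions RPD equals 1/sqrt(1 - R2), and the reported values are at least consistent with that relation, since 1/sqrt(1 - 0.9097) is approximately 3.3278.

```python
import math

def prediction_metrics(y_true, y_pred):
    """Return (r2, rmsep, rpd) for reference values and model predictions.

    Assumed conventions (not confirmed by the source):
      r2    = 1 - SSE / SST
      rmsep = sqrt(SSE / n)
      rpd   = SD(y_true) / rmsep, with SD computed using denominator n
    """
    n = len(y_true)
    mean = sum(y_true) / n
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    sst = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    rmsep = math.sqrt(sse / n)
    sd = math.sqrt(sst / n)
    return 1.0 - sse / sst, rmsep, sd / rmsep

# Made-up numbers purely to exercise the function; not data from the paper.
r2, rmsep, rpd = prediction_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 3.0, 3.5])
print(r2, rmsep, rpd)

# Consistency check on the values quoted in the abstract.
print(1.0 / math.sqrt(1.0 - 0.9097))
```

If the sample standard deviation (denominator n - 1) is used instead, RPD differs from 1/sqrt(1 - R2) by a factor of sqrt(n / (n - 1)).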
This article is indexed in CNKI, Wanfang Data and other databases.