
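Research on a LIBS Feature Variable Selection Method Based on the CART Regression Tree
Author affiliations: Department of Electronics and Electrical Engineering, Changchun University of Technology, Changchun 130012, Jilin; Department of Mechanical and Electrical Engineering, Changchun University of Technology, Changchun 130012, Jilin; Jilin University of Architecture and Technology, Changchun 130012, Jilin
Funding: Supported by the National Major Scientific Instrument Development Project (2014YQ12035104), Jilin Provincial Department of Science and Technology projects (20180414017GH, 20200403008SF), and a Jilin Provincial Development and Reform Commission project (2018C034-3)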
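Abstract: When laser-induced breakdown spectroscopy (LIBS) is used for detection, the spectral lines are numerous and complex and carry a great deal of redundant information, all of which affects quantitative analysis. Extracting effective feature variables is therefore of great importance in LIBS quantitative analysis. Spectral feature selection methods are analyzed for the Ca element in CaCl₂ solution, and the accuracy and stability of a univariate model, partial least squares regression, and a CART regression tree calibration model are compared. Because the water surface fluctuates strongly, spectral stability is poor, and the spectra are affected by the matrix effect and the self-absorption effect, the univariate model yields a fitting coefficient (R²) of only 0.9332, with a training root mean square error (RMSEC), prediction root mean square error (RMSEP), and average relative error (ARE) of 0.0192 wt%, 0.0177 wt%, and 11.604%, respectively. With partial least squares regression, R² rises to 0.9753, and RMSEC, RMSEP, and ARE fall to 0.0108 wt%, 0.013 wt%, and 7.49%, respectively. To further improve the accuracy of quantitative analysis, a CART regression tree calibration model is established. When building the tree, the method applies the squared-error minimization criterion to select the optimal combination of feature variables from the complex spectral information for its split decisions, and thereby establishes the calibration curve for Ca. Through the variable selection of the CART regression tree, the number of feature variables is reduced from 100 to 6, a compression rate of 94%, which significantly reduces interference from irrelevant spectral lines. The R², RMSEC, RMSEP, and ARE of the regression tree model are 0.9975, 0.0035 wt%, 0.0061 wt%, and 2.500%, respectively. Compared with the traditional univariate model and partial least squares regression, the CART regression tree model has higher accuracy and smaller errors. By effectively screening the feature variables and removing interference from irrelevant signals, the approach significantly reduces the influence of the matrix effect and the self-absorption effect on LIBS quantitative analysis and improves its accuracy and stability.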
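The abstract reports its figures of merit without giving their formulas. The sketch below uses the common textbook definitions, which are an assumption here: RMSE for both RMSEC (training set) and RMSEP (prediction set), ARE as the mean of |predicted − true| / true, R² as the coefficient of determination, and the compression rate as the fraction of variables removed. The function names are illustrative and do not come from the paper.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error: RMSEC on training data, RMSEP on prediction data."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def average_relative_error(y_true, y_pred):
    """Mean of |pred - true| / |true| in percent (assumes no true value is zero)."""
    n = len(y_true)
    return 100.0 * sum(abs(p - t) / abs(t) for t, p in zip(y_true, y_pred)) / n

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition of R^2)."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def compression_rate(n_before, n_after):
    """Percentage of feature variables removed by the selection step."""
    return 100.0 * (n_before - n_after) / n_before

# The abstract's variable counts: 100 candidate variables reduced to 6.
print(compression_rate(100, 6))  # 94.0
```

Under these definitions, reducing 100 variables to 6 gives the 94% compression rate quoted in the abstract.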

Keywords: Laser-induced breakdown spectroscopy; Feature variable selection; CART regression tree; Quantitative analysis
Received: 2020-10-10

Research on Selection Method of LIBS Feature Variables Based on CART Regression Tree
Authors:YOU Wen  XIA Yang-peng  HUANG Yu-tao  LIN Jing-jun  LIN Xiao-mei
Institution: 1. Department of Electronics and Electrical Engineering, Changchun University of Technology, Changchun 130012, China; 2. Department of Mechanical and Electrical Engineering, Changchun University of Technology, Changchun 130012, China; 3. Jilin University of Architecture and Technology, Changchun 130012, China
Abstract: When laser-induced breakdown spectroscopy (LIBS) is used for detection, the spectral lines are numerous and complex and contain much redundant information, which affects quantitative analysis. Extracting effective feature variables is therefore of great significance in LIBS quantitative analysis. In this paper, spectral feature selection methods are analyzed for the Ca element in CaCl₂ solution, and the accuracy and stability of a univariate model, partial least squares regression, and a CART regression tree calibration model are compared. Because the water surface fluctuates strongly, spectral stability is poor, and the spectrum is affected by the matrix effect and the self-absorption effect, the fitting coefficient (R²) obtained with the univariate model is only 0.9332, and the training root mean square error (RMSEC), prediction root mean square error (RMSEP), and average relative error (ARE) are 0.0192 wt%, 0.0177 wt%, and 11.604%, respectively. With partial least squares regression, the model R² increases to 0.9753, and RMSEC, RMSEP, and ARE fall to 0.0108 wt%, 0.013 wt%, and 7.49%, respectively. Although the model's accuracy improves, it still struggles to meet the analysis requirements. To further improve the accuracy of quantitative analysis, a CART regression tree calibration model was established. When constructing the tree, this method uses the squared-error minimization criterion to select the optimal combination of feature variables from the complex spectral information for its split decisions, thereby establishing the calibration curve for Ca. Through the variable selection of the CART regression tree, the number of feature variables is reduced from 100 to 6, a compression rate of 94%, which significantly reduces the interference of irrelevant spectral lines. The R², RMSEC, RMSEP, and ARE of the regression tree model are 0.9975, 0.0035 wt%, 0.0061 wt%, and 2.500%, respectively. Compared with the traditional univariate model and partial least squares regression, the CART regression tree model has higher accuracy and lower error. Through effective screening of feature variables, this paper eliminates the interference of irrelevant signals, significantly reduces the influence of the matrix effect and the self-absorption effect on LIBS quantitative analysis, and improves the accuracy and stability of quantitative analysis.
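The selection mechanism described above (greedy splits chosen by squared-error minimization, with the split variables forming the selected feature set) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the tiny data set, the depth limit, and all function names are hypothetical and chosen only so that the example stays small and deterministic.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(X, y):
    """Return (feature, threshold, cost) minimizing the children's total SSE, or None."""
    best = None
    for j in range(len(X[0])):
        levels = sorted(set(row[j] for row in X))
        for lo, hi in zip(levels, levels[1:]):       # midpoints between distinct values
            thr = (lo + hi) / 2.0
            left = [t for row, t in zip(X, y) if row[j] <= thr]
            right = [t for row, t in zip(X, y) if row[j] > thr]
            cost = sse(left) + sse(right)
            if best is None or cost < best[2]:
                best = (j, thr, cost)
    return best

def grow(X, y, depth, max_depth, used):
    """Grow the tree recursively; `used` collects the features chosen for splits."""
    split = best_split(X, y) if depth < max_depth and len(y) >= 2 else None
    if split is None or split[2] >= sse(y):
        return sum(y) / len(y)                       # leaf: mean of the targets
    j, thr, _ = split
    used.add(j)
    left = [(r, t) for r, t in zip(X, y) if r[j] <= thr]
    right = [(r, t) for r, t in zip(X, y) if r[j] > thr]
    return (j, thr,
            grow([r for r, _ in left], [t for _, t in left], depth + 1, max_depth, used),
            grow([r for r, _ in right], [t for _, t in right], depth + 1, max_depth, used))

def predict(node, row):
    """Follow the splits down to a leaf and return its mean value."""
    while isinstance(node, tuple):
        j, thr, left, right = node
        node = left if row[j] <= thr else right
    return node

# Hypothetical toy data: 4 "spectral" variables per sample; only column 2
# tracks the concentration, the other columns are unrelated to it.
X = [[5, 2, 1.0, 7], [3, 1, 2.0, 7], [5, 2, 3.0, 7],
     [3, 2, 4.0, 7], [5, 1, 5.0, 7], [3, 2, 6.0, 7]]
y = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]                   # concentration, wt%

used = set()
tree = grow(X, y, depth=0, max_depth=2, used=used)
print(sorted(used))                                  # [2]
```

Only the variables that appear in a split survive, which is the sense in which the tree acts as a feature selector. The depth limit matters in this toy case because nodes holding only two samples can be separated perfectly by almost any column, so an unrestricted tree would start selecting unrelated variables.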
Keywords:Laser-induced breakdown spectroscopy  Feature variable selection  CART regression tree  Quantitative analysis  
Source: 《光谱学与光谱分析》 (Spectroscopy and Spectral Analysis); indexed in CNKI and Wanfang Data.