
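A Wavelength Selection Method Combining Direct Orthogonal Signal Correction and Monte Carlo
Cite this article:XIE Lin-jiang, HONG Ming-jian, YU Zhi-rong. A Wavelength Selection Method Combining Direct Orthogonal Signal Correction and Monte Carlo[J]. Spectroscopy and Spectral Analysis (光谱学与光谱分析), 2022, 42(2): 440-445. DOI: 10.3964/j.issn.1000-0593(2022)02-0440-06
Authors:XIE Lin-jiang (谢林江)  HONG Ming-jian (洪明坚)  YU Zhi-rong (余志荣)
Affiliation:School of Big Data & Software Engineering, Chongqing University, Chongqing 401331
Funding:Supported by the National Key Research and Development Program of China (2018YFF01011204)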
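Abstract:In near-infrared spectral data analysis, full-spectrum data contain many wavelength points, heavy redundancy, and severe collinearity, so some wavelength points contribute nothing to building a calibration model and can even reduce its predictive ability. Wavelength selection has been shown to be an important way to avoid these problems. Targeting the characteristics of near-infrared spectra, this work proposes, by combining direct orthogonal signal correction (DOSC) with the Monte Carlo (MC) method, a wavelength selection...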

Keywords:Near-infrared spectroscopy  Wavelength selection  Orthogonal signal correction  Monte Carlo  Partial least squares
Received:2021-01-07

A Wavelength Selection Method Combining Direct Orthogonal Signal Correction and Monte Carlo
XIE Lin-jiang,HONG Ming-jian,YU Zhi-rong. A Wavelength Selection Method Combining Direct Orthogonal Signal Correction and Monte Carlo[J]. Spectroscopy and Spectral Analysis, 2022, 42(2): 440-445. DOI: 10.3964/j.issn.1000-0593(2022)02-0440-06
Authors:XIE Lin-jiang  HONG Ming-jian  YU Zhi-rong
Affiliation:School of Big Data & Software Engineering, Chongqing University, Chongqing 401331, China
Abstract:In the analysis of near-infrared spectroscopy data, full-spectrum data contain many wavelength points with high redundancy and severe collinearity. As a result, some wavelength points contribute nothing to the calibration model and can even reduce its predictive ability. Wavelength selection has proven to be an effective way to avoid these problems. Targeting the characteristics of near-infrared spectra, a wavelength selection algorithm combining Direct Orthogonal Signal Correction (DOSC) with Monte Carlo (MC) sampling, MC-DOSC, is proposed. Unlike most methods, which select wavelengths according to their “importance”, MC-DOSC selects wavelengths according to their “unimportance”, measured by the DOSC weight W. Specifically, W is first normalized to give the probability of each wavelength being filtered out, which establishes a probability model for wavelength selection, and Monte Carlo random sampling is then used to obtain a set of N wavelength subsets. In each sampling run, the selected wavelength points are used to build a partial least squares (PLS) model, and the corresponding root mean square error of cross-validation (RMSECV) is calculated. After N random samplings, the wavelength subset whose PLS model has the minimum RMSECV is taken as the candidate subset. The spectral data in the candidate subset form a new spectral matrix, and the process is repeated until the RMSECV no longer decreases. After the iteration stops, the candidate subset with the smallest RMSECV is taken as the best wavelength subset. The method is compared with three algorithms: Monte Carlo Uninformative Variable Elimination (MCUVE), Genetic Algorithm (GA), and Competitive Adaptive Reweighted Sampling (CARS). Experimental results show that the algorithm greatly reduces the number of wavelength points while improving the prediction ability of the corresponding PLS model.
On the corn data set, the number of wavelength points is reduced from 700 in the full spectrum to 15, the correlation coefficient of the prediction set increases from 0.828 2 to 0.931 4, and the root mean square error of prediction (RMSEP) decreases from 0.109 8 to 0.071 3. On the gasoline data set, the number of wavelength points is reduced from 301 in the full spectrum to 31, the correlation coefficient of the prediction set increases from 0.987 5 to 0.993 9, and the RMSEP decreases from 0.255 to 0.178 8. On both data sets, the algorithm outperforms the three algorithms it is compared with.
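
To put the reported figures in relative terms, the short calculation below takes the wavelength counts and RMSEP values quoted in the abstract and derives the fraction of wavelengths kept and the fractional drop in RMSEP. The dictionary layout and the helper name `relative_reduction` are illustrative choices, not taken from the paper.

```python
# Values copied from the abstract: "full" is the full-spectrum PLS model,
# "sel" is the PLS model built on the MC-DOSC wavelength subset.
results = {
    "corn": {"n_full": 700, "n_sel": 15, "rmsep_full": 0.1098, "rmsep_sel": 0.0713},
    "gasoline": {"n_full": 301, "n_sel": 31, "rmsep_full": 0.255, "rmsep_sel": 0.1788},
}


def relative_reduction(before, after):
    """Fractional decrease when going from `before` to `after`."""
    return (before - after) / before


summary = {}
for name, r in results.items():
    summary[name] = {
        # Share of the original wavelength points that survive selection.
        "kept_fraction": r["n_sel"] / r["n_full"],
        # Fractional drop in prediction error relative to the full spectrum.
        "rmsep_reduction": relative_reduction(r["rmsep_full"], r["rmsep_sel"]),
    }
    print(
        f"{name}: kept {summary[name]['kept_fraction']:.1%} of wavelengths, "
        f"RMSEP down {summary[name]['rmsep_reduction']:.1%}"
    )
```

On these numbers the selected subsets keep roughly 2% (corn) and 10% (gasoline) of the wavelength points, while RMSEP falls by about 35% and 30% respectively.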
Keywords:Near-infrared spectroscopy  Wavelength selection  Direct orthogonal signal correction  Monte Carlo  Partial least squares
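
The iterative sampling loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `weight_fn` and `score_fn` are hypothetical placeholders for computing the DOSC weight W and the PLS RMSECV on a given set of wavelength columns, and dividing |W| by its largest value is only one possible normalization (the abstract says W is normalized but does not say how).

```python
import random


def removal_probabilities(weights):
    """Scale |w_j| into [0, 1]; a larger weight means more likely filtered out.

    ASSUMPTION: division by the largest magnitude. The abstract does not
    specify the normalization actually used.
    """
    mags = [abs(w) for w in weights]
    top = max(mags)
    if top == 0:
        return [0.0] * len(mags)
    return [m / top for m in mags]


def sample_subset(candidates, probs, rng):
    """One Monte Carlo draw: keep wavelength j with probability 1 - probs[j]."""
    return [j for j, p in zip(candidates, probs) if rng.random() >= p]


def mc_dosc_select(n_wavelengths, weight_fn, score_fn, n_samples=200, seed=0):
    """Shrink the wavelength set until the score (RMSECV stand-in) stops dropping."""
    rng = random.Random(seed)
    current = list(range(n_wavelengths))
    best_score = score_fn(current)
    while True:
        probs = removal_probabilities(weight_fn(current))
        draws = [sample_subset(current, probs, rng) for _ in range(n_samples)]
        draws = [d for d in draws if d]  # an empty subset cannot be modelled
        if not draws:
            break
        candidate = min(draws, key=score_fn)  # best of the N sampled subsets
        candidate_score = score_fn(candidate)
        if candidate_score >= best_score:  # no further improvement: stop
            break
        current, best_score = candidate, candidate_score
    return current, best_score


# Toy stand-ins (hypothetical): wavelengths 2 and 5 carry all the signal.
informative = {2, 5}


def toy_weights(cols):
    # Pretend "unimportance" weight: 0 for informative columns, 1 otherwise.
    return [0.0 if j in informative else 1.0 for j in cols]


def toy_score(cols):
    # Pretend error: heavy penalty for missing signal, light one for noise.
    missing = len(informative - set(cols))
    noise = sum(1 for j in cols if j not in informative)
    return 10.0 * missing + 0.1 * noise


selected, score = mc_dosc_select(10, toy_weights, toy_score, n_samples=20)
print(selected, score)
```

In a real setting the weights would be graded rather than 0 or 1, so each draw would differ and the number of samples N would matter; `weight_fn` would recompute DOSC on the reduced spectral matrix at every iteration, and `score_fn` would fit a cross-validated PLS model.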