
光谱预处理方法选择研究 (Study on the Selection of Spectral Preprocessing Methods)
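Cite this article: 第五鹏瑶, 卞希慧, 王姿方, 刘巍. 光谱预处理方法选择研究[J]. 光谱学与光谱分析, 2019, 39(9): 2800-2806. DOI: 10.3964/j.issn.1000-0593(2019)09-2800-07
Authors: 第五鹏瑶 (DIWU Peng-yao), 卞希慧 (BIAN Xi-hui), 王姿方 (WANG Zi-fang), 刘巍 (LIU Wei)
Affiliation (all authors): State Key Laboratory of Separation Membranes and Membrane Processes, School of Environmental and Chemical Engineering, Tianjin Polytechnic University, Tianjin 300387, China
Funding: Supported by the National Natural Science Foundation of China (21405110) and the Scientific Research Program of the Tianjin Municipal Education Commission (2018KJ200)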
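Abstract: Spectral signals of complex samples are often disturbed by stray light, noise, baseline drift and other factors, which affects the final qualitative and quantitative results, so the raw spectra usually need to be preprocessed before modeling. Many spectral preprocessing methods exist, and finding a suitable one is a thorny problem. One approach is to choose the preprocessing method by observing the characteristics of the spectral signal (visual inspection); the other is to choose it in reverse, according to modeling performance (trial-and-error strategy). The former requires no modeling and is more interpretable, but the subjectivity of the person choosing can sometimes lead to wrong results; the latter requires no inspection of spectral features, but a large number of preprocessing methods must be examined, which is time-consuming for large datasets. It is therefore necessary to explore which selection approach is more scientific and reasonable. This study used 9 datasets and 120 permutations and combinations of 10 preprocessing methods to investigate the necessity of preprocessing and the selection of preprocessing methods. First, the number of factors of partial least squares (PLS), the window parameters of the first derivative, second derivative and SG smoothing, and the wavelet function and decomposition scale of the continuous wavelet transform (CWT) were optimized. Then no preprocessing and 10 preprocessing methods, namely first derivative, second derivative, CWT, multiplicative scatter correction (MSC), standard normal variate (SNV), SG smoothing, centering, Pareto scaling, min-max normalization and standardization, were combined in the order of background correction, scatter correction, smoothing and scaling, giving 120 preprocessing methods and combinations. Finally, the 120 preprocessing methods were applied to different datasets and to different components of the same dataset, and the spectral signal characteristics and the root mean squared error of prediction (RMSEP) of the PLS models after preprocessing were analyzed. The results show that, compared with observing spectral signal characteristics, the best preprocessing method can be selected more accurately according to the modeling performance for the spectra and the predicted component. For most datasets, a suitable preprocessing method improves modeling performance; for different datasets, the best preprocessing method differs because their information and complexity differ; for the same dataset, even with identical spectra, the preprocessing methods for different components also differ. Therefore, no universally best preprocessing method exists; the best preprocessing method depends not only on the spectra but also on the predicted component. Classifying existing preprocessing methods by purpose and then permuting and combining them is an effective way to select the best preprocessing method.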

Keywords: Preprocessing method  Complex sample  Partial least squares  Parameter optimization  Method selection
Received: 2018-08-02

Study on the Selection of Spectral Preprocessing Methods
DIWU Peng-yao,BIAN Xi-hui,WANG Zi-fang,LIU Wei. Study on the Selection of Spectral Preprocessing Methods[J]. Spectroscopy and Spectral Analysis, 2019, 39(9): 2800-2806. DOI: 10.3964/j.issn.1000-0593(2019)09-2800-07
Authors:DIWU Peng-yao  BIAN Xi-hui  WANG Zi-fang  LIU Wei
Affiliation:State Key Laboratory of Separation Membranes and Membrane Processes, School of Environmental and Chemical Engineering, Tianjin Polytechnic University, Tianjin 300387, China
Abstract:Spectral signals of complex samples are usually disturbed by stray light, noise, baseline drift and other undesirable factors, which can affect the final qualitative and quantitative analysis results. Therefore, it is usually necessary to preprocess the raw spectra before modeling. Finding a proper preprocessing method among the many existing ones is a difficult problem. One strategy is to choose the preprocessing method by directly observing the characteristics of the spectral signal (visual inspection), which requires no modeling and is more interpretable. However, this can be difficult and subjective when interferences are subtle or multiple, and may lead to misleading results. Another strategy is to choose according to modeling performance (trial and error), which does not require observing the spectral characteristics, but numerous preprocessing methods must be investigated, which is time-consuming for large datasets. It is therefore necessary to explore which selection strategy is more scientific and reasonable. In this study, nine datasets were used to investigate the necessity of preprocessing and the choice of preprocessing methods through 120 combinations of 10 preprocessing methods. Firstly, the number of latent variables of partial least squares (PLS), the window sizes of the first derivative (1st Der), second derivative (2nd Der) and SG smoothing, and the wavelet function and decomposition scale of the continuous wavelet transform (CWT) were optimized. Then, no preprocessing and 10 preprocessing methods, namely 1st Der, 2nd Der, CWT, multiplicative scatter correction (MSC), standard normal variate (SNV), SG smoothing, mean centering, Pareto scaling, min-max normalization and auto scaling, were combined in the order of baseline correction, scatter correction, smoothing and scaling. A total of 120 preprocessing methods and combinations were obtained.
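The figure of 120 is consistent with picking at most one method per stage, in the fixed stage order the abstract gives: 4 × 3 × 2 × 5 = 120, where each stage also allows "none". The Python sketch below enumerates these combinations. The assignment of methods to stages is an assumption inferred from the stage names and the count; the abstract does not spell it out, and the names `STAGES` and `enumerate_pipelines` are illustrative only.

```python
from itertools import product

# Assumed grouping of the 10 methods into the four ordered stages.
# "none" means the stage is skipped; this grouping is an inference,
# not something stated explicitly in the abstract.
STAGES = {
    "baseline correction": ["none", "1st Der", "2nd Der", "CWT"],
    "scatter correction": ["none", "MSC", "SNV"],
    "smoothing": ["none", "SG smoothing"],
    "scaling": ["none", "mean centering", "Pareto scaling",
                "min-max normalization", "auto scaling"],
}


def enumerate_pipelines(stages=STAGES):
    """Return every pipeline as a tuple of method names in stage order."""
    pipelines = []
    for choice in product(*stages.values()):
        # Drop skipped stages so each tuple lists only the applied steps.
        pipelines.append(tuple(m for m in choice if m != "none"))
    return pipelines


pipelines = enumerate_pipelines()
print(len(pipelines))  # 120
```

The empty tuple in the result stands for the no-preprocessing case, so it is counted among the 120 under this reading.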
Finally, the characteristics of the spectral signals and the root mean squared error of prediction (RMSEP) of the PLS models built after each of the 120 preprocessing methods were analyzed, both across the nine datasets and for different components of the same dataset. The results show that, compared with observing the characteristics of the spectral signals, selecting according to the modeling performance for the spectra and the predicted component identifies the optimal preprocessing method more accurately. For most datasets, an appropriate preprocessing method improves the modeling performance. For different datasets, the optimal preprocessing method differs because the information and complexity of the datasets differ. For the same dataset, the optimal preprocessing methods for different components also differ, even though the spectra are the same. Thus, no universal preprocessing method exists; the optimal preprocessing method depends on both the spectra and the predicted component. Furthermore, classifying the existing preprocessing methods by purpose and then combining them is an effective way to select the optimal preprocessing method.
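The selection by modeling performance described above amounts to computing RMSEP for each candidate pipeline and keeping the lowest. The sketch below shows only that scoring and selection step in plain Python. It assumes the PLS predictions for each pipeline are already available; the helper names (`rmsep`, `select_best`) and the numbers in the usage example are made up for illustration and are not results from the paper.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean squared error of prediction over a prediction set."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def select_best(y_true, predictions):
    """Pick the pipeline with the lowest RMSEP.

    predictions maps a pipeline label to its predicted values.
    """
    scores = {name: rmsep(y_true, y_pred) for name, y_pred in predictions.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


# Toy values for illustration only (not data from the study).
y_true = [1.0, 2.0, 3.0]
predictions = {
    "none": [1.5, 2.5, 3.5],
    "SNV + SG smoothing": [1.1, 1.9, 3.0],
}
best_name, best_score = select_best(y_true, predictions)
print(best_name)
```

Because the score depends on the reference values of the predicted component, the same spectra can yield a different best pipeline for a different component, which matches the abstract's conclusion.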
Keywords:Preprocessing method  Complex sample  Partial least squares  Parameter optimization  Method selection  
This article is indexed in CNKI, 万方数据 (Wanfang Data) and other databases.