Similar literature
 20 similar articles found
1.
Two novel algorithms which employ the idea of stacked generalization or stacked regression, stacked partial least squares (SPLS) and stacked moving‐window partial least squares (SMWPLS), are reported in the present paper. The new algorithms establish parallel, conventional PLS models based on all intervals of a set of spectra, and take advantage of the information in the whole spectrum by combining the parallel models in a way that emphasizes intervals highly related to the target property. It is theoretically and experimentally illustrated that the predictive ability of these two stacked methods, which combine all subsets or intervals of the whole spectrum, is never poorer than that of a PLS model based only on the best interval. These two stacking algorithms generate more parsimonious regression models with better predictive power than conventional PLS, and perform best when the spectral information is neither isolated to a single, small region, nor spread uniformly over the response. A simulated data set is employed in this work not only to demonstrate this improvement, but also to demonstrate that stacked regressions have the potential to predict property information from an outlier spectrum in the prediction set. Moisture, oil, protein and starch in Cargill corn samples have been successfully predicted by these new algorithms, as well as the hydroxyl number of terpolymer samples measured on different instruments, both including and excluding an outlier spectrum. Copyright © 2009 John Wiley & Sons, Ltd.
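The stacking step in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the weighting rule (weights proportional to the inverse squared cross-validation error of each interval model) is an assumption, and the function names and numbers are hypothetical.

```python
def stacking_weights(rmsecv):
    """Return one weight per interval model, normalised to sum to 1.

    Assumption for illustration: each weight is proportional to
    1 / RMSECV**2, so intervals with low cross-validation error dominate.
    """
    inv_sq = [1.0 / (e * e) for e in rmsecv]
    total = sum(inv_sq)
    return [v / total for v in inv_sq]


def stacked_prediction(interval_predictions, weights):
    """Combine the per-interval predictions for one sample."""
    return sum(w * p for w, p in zip(weights, interval_predictions))


# Hypothetical cross-validation errors of three interval models
rmsecv = [0.5, 1.0, 2.0]
weights = stacking_weights(rmsecv)

# Hypothetical predictions of the same three models for one new sample
y_hat = stacked_prediction([10.0, 12.0, 20.0], weights)
print(weights, y_hat)
```

Under this rule an uninformative interval is down-weighted rather than discarded, which matches the abstract's point about using the whole spectrum while emphasizing the most relevant intervals.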

2.
Partial Least Squares (PLS) is by far the most popular regression method for building multivariate calibration models for spectroscopic data. However, the success of the conventional PLS approach depends on the availability of a ‘representative data set’, as the model needs to be trained for all variation expected at the prediction stage. When the concentrations of the known interferents and their correlation with the analyte of interest change in a fashion which is not covered in the calibration set, the predictive performance of inverse calibration approaches such as conventional PLS can deteriorate. This underscores the need for calibration methods that can be made robust against unexpected variation in the concentrations and the correlations of the known interferents in the test set. Several methods incorporating ‘a priori’ information, such as pure component spectra of the analyte of interest and/or the known interferents, have been proposed to build more robust calibration models. In the present study, four such calibration techniques have been benchmarked on two data sets with respect to their predictive ability and robustness: Net Analyte Preprocessing (NAP), Improved Direct Calibration (IDC), Science Based Calibration (SBC) and Augmented Classical Least Squares (ACLS) Calibration. For both data sets, the alternative calibration techniques were found to give good prediction performance even when the interferent structure in the test set was different from the one in the calibration set. The best results were obtained by the ACLS model incorporating both the pure component spectra of the analyte of interest and the interferents, which reduced the RMSEP by a factor of 3 compared to conventional PLS when the test set had a different interferent structure from the calibration set.

3.
In recent years the number of spectroscopic studies that use multivariate techniques and involve different laboratories has increased dramatically. In this paper a protocol was established for calibration transfer of a partial least squares regression model between high‐resolution nuclear magnetic resonance (NMR) spectrometers of different frequencies and equipped with different probes. A previously published quantitative model for predicting the concentration of blended soy species in sunflower lecithin was used as the test system. Piecewise direct standardization (PDS), direct standardization, and hybrid calibration were employed for multivariate modelling. PDS showed the best performance for estimating lecithin falsification regarding its vegetable origin, giving a significant decrease in root mean square error of prediction from 5.0–7.3% without standardization to 2.9–3.2% with PDS. An acceptable calibration transfer model was obtained by direct standardization, but this approach introduces unfavourable noise into the spectral data. Hybrid calibration is the least recommended for high‐resolution NMR data. The sensitivity of the instrument transfer methods to the type of spectrometer, the number of samples and the subset selection is also discussed. The study shows that a proper standardization procedure is necessary when a multivariate model has to be applied to spectra recorded on a secondary NMR spectrometer, even one with the same magnetic field strength. Copyright © 2016 John Wiley & Sons, Ltd.

4.
Near-infrared (NIR) spectroscopy models built on a particular instrument are often invalid on other instruments due to spectral inconsistencies between the instruments. In the present work, global and robust NIR calibration models were constructed by partial least squares (PLS) regression based on hybrid calibration sets, which are composed of both primary and secondary spectra. Three datasets were used as case studies. The first consisted of 72 Radix Scutellariae samples with known baicalin content, measured on two NIR spectrometers. The second was composed of 80 corn samples with known moisture, oil, and protein concentrations, measured on two instruments. The third dataset included 279 primary samples and 78 secondary samples of tobacco with known nicotine content. The effects of the number of secondary spectra in the hybrid calibration sets, and of the method used to select them, on PLS model performance were investigated by comparing the results obtained from different calibration sets. This study shows that the global and robust calibration models accurately predicted both primary and secondary samples as long as the ratio of the number of primary spectra to the number of secondary spectra was less than 22. Model performance was not influenced by the selection method for the secondary spectra. The hybrid calibration sets included both the primary and the secondary spectral information, rendering the constructed global and robust models applicable to both primary and secondary instruments.

5.
Zhu D  Ji B  Meng C  Shi B  Tu Z  Qing Z 《Analytica chimica acta》2007,598(2):227-234
ν-support vector regression (ν-SVR) was used to construct the calibration model between the soluble solids content (SSC) of apples and acousto-optic tunable filter near-infrared (AOTF-NIR) spectra. The performance of ν-SVR was compared with that of partial least squares regression (PLSR) and back-propagation artificial neural networks (BP-ANN). The influence of the SVR parameters on the predictive ability of the model was investigated. The results indicated that the parameter ν had a rather wide optimal range (between 0.35 and 1 for the apple data). Therefore, the value of ν can be fixed beforehand and the selection can focus on the other SVR parameters. For analyzing the SSC of apples, ν-SVR was superior to PLSR and BP-ANN, especially with fewer samples and with noise-polluted spectra. Proper spectral pretreatment methods, such as scaling, mean centering and standard normal variate (SNV), and wavelength selection methods (stepwise multiple linear regression and a genetic algorithm with PLS as its objective function), could greatly improve the quality of the ν-SVR model.
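Of the pretreatments named above, standard normal variate (SNV) is the simplest to show. The sketch below uses the conventional definition (each spectrum is centred on its own mean and divided by its own sample standard deviation); the function name and the toy data are hypothetical and not taken from the paper.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own
    mean and sample standard deviation (a constant spectrum would divide
    by zero and is not handled here)."""
    mean = statistics.mean(spectrum)
    std = statistics.stdev(spectrum)
    return [(x - mean) / std for x in spectrum]


# Toy spectrum and a copy with a multiplicative and an additive offset,
# mimicking scatter effects between two measurements of the same sample
raw = [1.0, 2.0, 3.0, 4.0, 5.0]
shifted_scaled = [2.0 * x + 10.0 for x in raw]

print(snv(raw))
print(snv(shifted_scaled))
```

Both calls print the same values up to rounding, because SNV removes a per-spectrum offset and a positive per-spectrum scale factor.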

6.
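Building on the idea of multi-model (model fusion) modelling, two new stacked multivariate calibration algorithms were developed: stacked PCR (PLS) multivariate calibration and stacked moving-window PCR (PLS) multivariate calibration. Unlike conventional multi-model approaches, they stack the sub-models by assigning different weights to different parts of the spectral data, so that variables can be adjusted or selected through the weights. This removes the redundant information commonly found in spectral data while avoiding the loss of information, and ultimately improves the robustness of the model and simplifies it. Although the two new algorithms differ in their specific steps, they give similar prediction results. The superiority of the two new methods is verified by calculations on two near-infrared spectral data sets from the literature.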

7.
Most of the wide array of calibration transfer methods found in the literature are dedicated to two-way data arrangements (m×n matrices). Less work has been done on calibration transfer for three-way data structures (m×n×l tensors), such as calibrations made for excitation-emission-matrix (EEM) fluorescence spectra. There are two possible ways to attack the problem of EEM transfer. Either the tensors are unfolded to two-way data, whereby the existing methods can be applied, or new methods dedicated to three-way calibration transfer have to be developed. This paper presents and compares both. It was possible to make a local linear pixel-based model that could be used for transfer of EEMs. This new method has a similar performance to the classical methods found in the literature, direct and piecewise direct standardization. The three-way advantages made it possible to use as few as four samples to build usable transfer models. Care has to be taken, though, when choosing the samples. When subset recalibration of the systems is compared to calibration transfer, better performance is seen for the transferred calibrations. Overall the three-way calibration transfer methods have a slightly better performance than the two-way methods.

8.
To transfer a calibration model in cases where the standardization samples are rare or unstable, a method based on orthogonal space regression (OSR) is proposed. It uses virtual standardization spectra to account for response changes between instruments or batches. A comparative study of the proposed OSR, piecewise direct standardization, finite impulse response, orthogonal signal correction, and model updating (MU) was conducted on both pharmaceutical tablet data and chlorogenic acid data. The results of these studies suggest that both the OSR and the MU are superior to the other transfer techniques in terms of root‐mean‐squared error of prediction and ratio of performance to interquartile distance. Moreover, OSR requires no identical standard samples, and it avoids re‐optimizing the transfer models. In conclusion, both the differences among spectra measured on different spectrometers and the differences between different batches can be corrected successfully using the OSR method. Copyright © 2013 John Wiley & Sons, Ltd.

9.
The UV spectrophotometric analysis of a multicomponent mixture containing paracetamol, caffeine, tripelennamine and salicylamide by multivariate calibration methods, namely principal component regression (PCR) and partial least-squares regression (PLS), is described. The calibration set was based on 47 reference samples, consisting of quaternary, ternary, binary and single-component mixtures, with the aim of developing models able to predict the concentrations of unknown samples containing from one to four components. The calibration models were optimized by an appropriate selection of the number of factors as well as of the wavelength ranges used to build the data matrix, excluding any information about the interfering excipients present in pharmaceutical formulations. The PCR and PLS models were compared, and their predictive performance was demonstrated by successful application to the assay of synthetic mixtures and pharmaceutical formulations.

10.
Preprocessing of raw near-infrared (NIR) spectral data is indispensable in multivariate calibration when the measured spectra are subject to significant noise, baseline effects and other undesirable factors. However, due to the lack of sufficient prior information and incomplete knowledge of the raw data, NIR spectral preprocessing in multivariate calibration is still a matter of trial and error. The selection of a proper method depends largely on both the nature of the data and the expertise and experience of the practitioner. This might limit the application of multivariate calibration in many fields where researchers are not very familiar with the characteristics of the many preprocessing methods unique to chemometrics and have difficulty selecting the most suitable ones. Another problem is that many preprocessing methods, when used alone, might degrade the data in certain respects or lose some useful information while improving certain qualities of the data. In order to tackle these problems, this paper proposes a new concept of data preprocessing, the ensemble preprocessing method, in which partial least squares (PLS) models built on differently preprocessed data are combined by Monte Carlo cross validation (MCCV) stacked regression. Little or no prior information about the data, and little expertise, is required. Moreover, fusion of the complementary information obtained by different preprocessing methods often leads to a more stable and accurate calibration model. The investigation of two real data sets has demonstrated the advantages of the proposed method.

11.
A new ensemble learning algorithm is presented for quantitative analysis of near-infrared spectra. The algorithm, termed Dual Stacked Partial Least Squares (DSPLS), combines two steps of stacked regression with partial least squares (PLS). First, several sub-models were generated from the whole calibration set. The inner-stack step was implemented on sub-intervals of the spectrum. Then the outer-stack step was used to combine these sub-models. Several combination rules for the outer-stack step were analyzed for the proposed DSPLS algorithm. In addition, a novel selective weighting rule was introduced to select a subset of all available sub-models. Experiments on two public near-infrared datasets demonstrate that the proposed DSPLS with the selective weighting rule provided superior prediction performance and outperformed the conventional PLS algorithm. Compared with a single model, the new ensemble model can provide more robust prediction results and can be considered an alternative choice for quantitative analytical applications.

12.
This paper proposes a novel calibration technique based on combining support vector regression with a digital band pass (DBP) filter for the quantitative analysis of near‐infrared spectra. The efficacy of the proposed method is investigated and validated in the determination of glucose from near‐infrared spectra of a mixture composed of urea, triacetin and glucose. In this paper, the DBP filtering was implemented as a pre‐processing technique in the frequency domain as a Gaussian band pass filter and in the time domain as a Chebyshev filter. The grid‐search optimization method was used to optimize the filter parameters. The results demonstrate that utilization of the optimized DBP filters as a pre‐processing technique improved the performance of the predictive models. Copyright © 2013 John Wiley & Sons, Ltd.

13.
Electronic nose systems, when deployed in a network mesh, can effectively provide a low-budget, on-site solution for the measurement of obnoxious industrial gases. For all the electronic nose systems to give accurate and identical predictions, a reliable calibration transfer model needs to be implemented in order to overcome the inherent variability of the sensor arrays. In this work, robust regression (RR) is used for calibration transfer between two electronic nose systems using a Box–Behnken (BB) design. One of the two electronic nose systems was trained on industrial gas samples with four artificial neural network (ANN) models for the measurement of obnoxious odours emitted from the pulp and paper industries. The emissions consist mainly of hydrogen sulphide (H2S), methyl mercaptan (MM), dimethyl sulphide (DMS) and dimethyl disulphide (DMDS) in different proportions. A Box–Behnken design consisting of 27 experiment sets based on synthetic gas combinations of H2S, MM, DMS and DMDS was conducted for calibration transfer between the two identical electronic nose systems. Identical sensors on both systems were mapped, and the prediction models developed using ANN were then transferred to the second system using the BB–RR methodology. The results showed successful transfer of the prediction models developed for one system to the other, with the mean absolute error between the actual and predicted concentrations of the analytes in mg L−1 after calibration transfer (on the second system) being 0.076, 0.1801, 0.0329 and 0.427 for DMS, DMDS, MM and H2S respectively.

14.
This paper presents a Bayesian approach to the development of spectroscopic calibration models. By formulating the linear regression in a probabilistic framework, a Bayesian linear regression model is derived, and a specific optimization method, i.e. Bayesian evidence approximation, is utilized to estimate the model “hyper-parameters”. The relation of the proposed approach to the calibration models in the literature is discussed, including ridge regression and Gaussian process model. The Bayesian model may be modified for the calibration of multivariate response variables. Furthermore, a variable selection strategy is implemented within the Bayesian framework, the motivation being that the predictive performance may be improved by selecting a subset of the most informative spectral variables. The Bayesian calibration models are applied to two spectroscopic data sets, and they demonstrate improved prediction results in comparison with the benchmark method of partial least squares.
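The link to ridge regression mentioned above can be made explicit with the standard textbook form of Bayesian linear regression. The notation below is an assumption, since the abstract defines none: $X$ is the matrix of spectra, $y$ the reference values, $w$ the regression coefficients, and $\alpha$, $\beta$ the prior and noise precisions (the "hyper-parameters").

```latex
% Prior and likelihood (assumed zero-mean isotropic Gaussian prior)
p(w \mid \alpha) = \mathcal{N}\!\left(w \mid 0,\ \alpha^{-1} I\right), \qquad
p(y \mid X, w, \beta) = \mathcal{N}\!\left(y \mid X w,\ \beta^{-1} I\right)

% Gaussian posterior over the coefficients
p(w \mid y, X, \alpha, \beta) = \mathcal{N}\!\left(w \mid m,\ S\right), \qquad
S^{-1} = \alpha I + \beta X^{\top} X, \qquad
m = \beta\, S\, X^{\top} y

% The posterior mean is the ridge estimate with \lambda = \alpha / \beta
m = \left(X^{\top} X + \tfrac{\alpha}{\beta} I\right)^{-1} X^{\top} y
```

In this form, the evidence approximation chooses $\alpha$ and $\beta$ by maximizing the marginal likelihood $p(y \mid X, \alpha, \beta)$, so the ridge penalty $\lambda = \alpha/\beta$ is estimated from the calibration data rather than tuned by cross-validation.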

15.
A calibration transfer method for near-infrared (NIR) spectra based on spectral regression is proposed. The spectral regression method can reveal low-dimensional manifold structure in high-dimensional spectroscopic data and is suitable for transferring NIR spectra between different instruments. A comparative study of the proposed method and piecewise direct standardization (PDS) for standardization on two benchmark NIR data sets is presented. Experimental results show that the spectral regression method outperforms PDS and is quite competitive with PDS with background correction. When the standardization subset has sufficient samples, the spectral regression method exhibits excellent performance.

16.
Partial least squares (PLS) models of 10 important jet and diesel fuel properties were built using spectra from a master near‐IR dispersive instrument and then subsequently transferred to a secondary dispersive instrument via a novel calibration transfer method using virtual standards and a slope‐bias correction. Implementation of the transfer requires that only seven spectra of neat solvents be acquired on the master and secondary instruments. The spectra of the neat solvents are then used to digitally replicate spectra from the calibration set to generate virtual standards. Comparison of PLS predictions for the master and secondary instrument virtual standards provides a simple but effective slope‐bias correction for transfer. The transferred fuel properties include American Petroleum Institute gravity, % aromatics, cetane index, flashpoint, hydrogen content, % saturates, and distillation temperatures at 10%, 20%, 50%, and 90% volume recovered. Transfer error was lower than that obtained either with the pure solvents and a slope‐bias correction or with a piecewise direct standardization calibration transfer using fuel spectra. Transfer error was higher than when actual fuels were used to transfer the calibration. The use of virtual standards eliminates the need to maintain either complex fuel standards or the master instrument for future instrument calibration transfers. Copyright © 2011 John Wiley & Sons, Ltd.
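The slope‐bias correction described above can be sketched as an ordinary least-squares line that maps secondary-instrument predictions onto master-instrument predictions for the same standards. This is a generic illustration under that assumption, not the paper's code; the function names and the numbers are hypothetical.

```python
def fit_slope_bias(secondary_pred, master_pred):
    """Least-squares fit of master_pred ~ slope * secondary_pred + bias."""
    n = len(secondary_pred)
    mean_x = sum(secondary_pred) / n
    mean_y = sum(master_pred) / n
    sxx = sum((x - mean_x) ** 2 for x in secondary_pred)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(secondary_pred, master_pred))
    slope = sxy / sxx
    bias = mean_y - slope * mean_x
    return slope, bias


def correct(prediction, slope, bias):
    """Apply the fitted correction to a new secondary-instrument prediction."""
    return slope * prediction + bias


# Hypothetical PLS predictions for the same four standards on both instruments;
# the secondary instrument is made to read 10% high with an offset of 1.0
master = [10.0, 20.0, 30.0, 40.0]
secondary = [1.1 * m + 1.0 for m in master]

slope, bias = fit_slope_bias(secondary, master)
print(slope, bias, correct(secondary[2], slope, bias))
```

The correction adjusts only the predicted values and leaves the spectra untouched, which is why it needs no more than paired predictions for a small set of standards.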

17.
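Twenty-six plant fibre raw materials were used as experimental materials, 20 of them as calibration samples. A support vector machine (SVM) regression model with a radial basis function kernel was built between the methoxyl content of the fibre raw materials and their near-infrared spectra. The SVM regression model was then used to predict the methoxyl content of 6 fibre raw material samples. The correlation coefficient between the model predictions and the methoxyl content determined by the modified Vieböck method was 0.977, and the standard deviation for the prediction set was 0.43. The predictive performance of the SVM regression model was compared with that of a PLS regression model. The SVM regression model built for the near-infrared determination of methoxyl content in plant fibre raw materials can be used for the quantitative analysis of real samples and gives good analytical results.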

18.
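This study uses a one-dimensional scale-invariant feature transform (SIFT) algorithm to find stable characteristic wavelengths in near-infrared spectroscopy (NIRS) spectra of tobacco leaves. The repeat rate and the repeat degree were calculated from the wavelengths selected on spectra from sample precision tests, and an L9(3^3) orthogonal array was used to optimize the relevant SIFT parameters so that both were as high as possible. Based on the optimized parameters and the spectra of 10 representative samples on the master instrument, 10 sets of stable characteristic wavelengths were selected. With the spectral responses at the union of these wavelength sets as independent variables, a NIRS model for total alkaloids in tobacco leaves was built by partial least squares (PLS), referred to as SIFT-PLS. After the model was transferred directly to 3 slave instruments, the mean relative prediction error (MRE) for total alkaloids on all 3 slave instruments met the in-house control requirement of less than 6%. By contrast, after direct transfer of the full-spectrum model (WW-PLS) only 1 slave instrument met the requirement, and after the slave spectra were corrected by piecewise direct standardization (PDS) the WW-PLS model still gave an MRE below 6% on only 1 slave instrument. The NIRS model built on stable characteristic wavelengths selected by SIFT can be shared directly across the 3 slave instruments, with no transfer set and no correction of the slave spectra or of the model, achieving genuinely standard-free direct transfer of NIRS models.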

19.
Different calibration techniques are available for spectroscopic applications that show nonlinear behavior. This comprehensive comparative study presents a comparison of different nonlinear calibration techniques: kernel PLS (KPLS), support vector machines (SVM), least-squares SVM (LS-SVM), relevance vector machines (RVM), Gaussian process regression (GPR), artificial neural network (ANN), and Bayesian ANN (BANN). In this comparison, partial least squares (PLS) regression is used as a linear benchmark, while the relationship of the methods is considered in terms of traditional calibration by ridge regression (RR). The performance of the different methods is demonstrated by their practical applications using three real-life near infrared (NIR) data sets. Different aspects of the various approaches including computational time, model interpretability, potential over-fitting using the non-linear models on linear problems, robustness to small or medium sample sets, and robustness to pre-processing, are discussed. The results suggest that GPR and BANN are powerful and promising methods for handling linear as well as nonlinear systems, even when the data sets are moderately small. The LS-SVM is also attractive due to its good predictive performance for both linear and nonlinear calibrations.

20.
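A support vector regression modelling method for predicting effective mobility in capillary zone electrophoresis   Total citations: 3, self-citations: 0, citations by others: 3
Kang Yufei  Qu Haibin  Shen Peng  Cheng Yiyu 《分析化学》2004,32(9):1151-1155
A support vector regression modelling method is proposed for predicting migration behaviour in capillary electrophoresis. With nucleosides as the study object, data obtained from an orthogonal experimental design were combined with the two-marker technique, and support vector regression was used to model the relationship between the column temperature, voltage, buffer concentration and pH in capillary zone electrophoresis and the effective mobilities of three nucleosides. Compared with partial least squares regression and artificial neural networks, the resulting model predicts more accurately than both and is well suited to predicting migration behaviour in capillary electrophoresis.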
