Similar literature
 Found 20 similar articles; search took 62 ms
1.
The non-linear regression technique known as the alternating conditional expectations (ACE) method is only applicable when the number of objects available for calibration is considerably greater than the number of predictors considered. Alternating conditional expectations regression with selection of significant predictors by genetic algorithms (GA-ACE), the non-linear regression technique presented here, is based on the ACE algorithm but introduces several modifications that resolve the applicability limitations of the original method, thus facilitating the practical implementation of a very interesting calibration tool. To overcome the lack of reliability displayed by the original ACE algorithm on data sets with too many variables, GA-ACE applies genetic algorithms as a variable selection technique before the non-linear regression model is developed, selecting a reduced subset of significant predictors able to accurately model and predict the response variable considered. Furthermore, GA-ACE provides two alternative application approaches: it allows either prior data compression, computing a number of principal components that are subsequently subjected to GA selection, or working directly on the original variables. In this study, GA-ACE was applied to two real calibration problems with a very low observation/variable ratio (NIR data), and the results were compared with those obtained by several commonly used linear regression techniques. With the GA-ACE non-linear method, notably improved regression models were developed for the two response variables modeled, with root mean square errors of the residuals in external prediction (RMSEP) of 11.51 and 6.03% for the moisture and lipid contents of roasted coffee samples, respectively. The improvement achieved by the new non-linear method is even more remarkable in view of the results obtained with the best-performing linear method (IPW-PLS) applied to predict the studied responses (14.61 and 7.74% RMSEP, respectively).
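
For readers comparing the RMSEP figures quoted above, the following is a minimal sketch of how such a figure of merit can be computed on an external test set. The abstract does not say how its percentage values were normalized, so `relative_rmsep` (a hypothetical helper name) assumes division by the mean reference value; other conventions exist.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction on an external test set."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def relative_rmsep(y_true, y_pred):
    """RMSEP as a percentage of the mean reference value.

    Assumption: normalization by the mean of y_true. The source abstract
    does not state which normalization it used.
    """
    mean_ref = sum(y_true) / len(y_true)
    return 100.0 * rmsep(y_true, y_pred) / mean_ref
```

Under this assumed convention, two models are compared by evaluating both on the same external samples and reporting the smaller percentage as the better one.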

2.
In this study, different approaches to the multivariate calibration of the vapors of two refrigerants are reported. Because the relationships between the time-resolved sensor signals and the analyte concentrations are nonlinear, the widely used partial least-squares (PLS) regression fails. Therefore, several methods known to handle nonlinearities in data are used. First, the Box–Cox transformation, which transforms the dependent variables nonlinearly, was applied. The second approach, implicit nonlinear PLS regression, accounts for nonlinearities by adding squared terms of the independent variables to the original independent variables. The third approach, quadratic PLS (QPLS), uses a quadratic inner relationship for the model instead of the linear inner relationship of PLS. Tree algorithms are also used, which split a nonlinear problem into smaller subproblems that are modeled using linear methods or discrete values. Finally, neural networks are applied, which are able to model any relationship. Special implementations, such as genetic algorithms with neural networks and growing neural networks, are also used to prevent overfitting. Among the fast and simpler algorithms, QPLS shows good results. Different implementations of neural networks show excellent results; among them, the most sophisticated and computing-intensive algorithms (growing neural networks) show the best results. Thus, the optimal method for the data set presented is a compromise between quality of calibration and complexity of the algorithm.
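
The first two approaches named in this abstract are simple enough to sketch. The code below is an illustration under assumptions, not the authors' implementation: `box_cox` uses the standard one-parameter Box–Cox form, and `add_squared_terms` (a hypothetical helper name) shows the kind of predictor augmentation that implicit nonlinear PLS relies on before an ordinary linear PLS fit.

```python
import math


def box_cox(y, lam):
    """One-parameter Box-Cox transform of a strictly positive response y."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if abs(lam) < 1e-12:
        return math.log(y)          # limiting case lam -> 0
    return (y ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Map a transformed prediction back to the original response scale."""
    if abs(lam) < 1e-12:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)


def add_squared_terms(row):
    """Append x_j**2 to the original predictors x_j of one sample."""
    return list(row) + [x * x for x in row]
```

In use, a linear model would be calibrated on the transformed concentrations (or on the augmented predictor rows), and predictions would be mapped back with `inverse_box_cox`.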

3.
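成忠, 诸爱士. 《分析化学》 (Chinese Journal of Analytical Chemistry), 2008, 36(6): 788-792
Spectral data have broad peaks, pronounced local effects, noise, a large number of variables, and often severe multicollinearity among those variables. To address these problems, a local correction method for spectral data is improved and designed: piecewise orthogonal signal correction based on window smoothing. It is combined with partial least squares regression for the preprocessing and quantitative analysis of spectral data. The orthogonal components to be filtered out are initialized with the NIPALS algorithm, and orthogonal signal correction is carried out wavelength point by wavelength point over neighboring segments. The denoised spectral matrix then serves as the new predictor matrix, and a calibration model between it and the property variable is built by partial least squares regression. Application to near-infrared diffuse reflectance spectra of wheat shows that the method estimates the orthogonal components stably and denoises effectively, that the predictive performance of the model is better than that of other methods, and that the model is more parsimonious, with fewer PLS components.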

4.
A new NIR method based on multivariate calibration for the determination of ethanol in industrially packed wholemeal bread was developed and validated. GC-FID was used as the reference method to determine the actual ethanol concentration of different samples of wholemeal bread with added ethanol ranging from 0 to 3.5% (w/w). Stepwise discriminant analysis was carried out on the NIR dataset to reduce the number of original variables by selecting those able to discriminate between samples of different ethanol concentrations. With the variables thus selected, a multivariate calibration model was then obtained by multiple linear regression. The predictive power of the linear model was optimized by a new “leave one out” method, which further reduced the number of original variables.

5.
Laser-Induced Breakdown Spectroscopy (LIBS) has been successfully applied to the multi-elemental analysis of solidified mineral melt samples containing several oxides present in various concentrations. The plasma was generated using a Nd:YAG laser and the spectra were acquired using an Echelle spectrometer, coupled to an ICCD detector, which covers a spectral range from 200 to 780 nm. Using a set of 19 calibration samples, we first established univariate calibration curves for the major elements (Al, Fe, Mg, Ca, Ti and Si). We found that matrix effects make such a model, traditionally used in LIBS, unsatisfactory for quantitative analysis of such samples. Indeed, no sufficiently linear trends can be extracted from the calibration curves for the elements of interest when all the samples are considered. Instead, a much more robust calibration approach was obtained by considering a multivariate model. The matrix effects are then taken into account by correcting the spectroscopic signals emitted by a given species for the presence of the other ones. More specifically, we established a calibration model using a 2nd-order polynomial linear multivariate inverse regression. The capability of this approach was then checked using a second set of samples with an unknown composition. Good agreement was observed between the analysis provided by X-ray fluorescence (XRF) and the LIBS measurements coupled to the multivariate model for the unknown samples.

6.
An ensemble, a model-independent technique based on combining several models for classification/regression tasks, allows us to achieve a high accuracy that is often not achievable with single models. Such combinations have gained increasing attention in many fields. This paper proposes the use of a random subspace (RS)-based regression ensemble as an alternative method for near-infrared (NIR) spectroscopic calibration of tobacco samples. Because of the considerable reduction of variables in a random subspace, multiple linear regression (MLR) is used as the base algorithm and the method is therefore also referred to as RS-MLR. The overall performance of the proposed RS-MLR method is compared to those of partial least squares regression (PLSR), kernel principal component regression (KPCR) and kernel partial least squares regression (KPLSR). The results reveal that the RS-MLR method not only has a simple concept but also can produce a more parsimonious and more accurate calibration model than PLSR, KPCR and KPLSR, at a lower computational cost. In addition, we found that the RS-MLR method is very appropriate for so-called small sample problems and that the calibration models built by RS-MLR are less sensitive to overfitting.
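
The random subspace idea described above can be sketched in a few lines of NumPy. This is an illustration under assumptions rather than the paper's procedure: the function names, the default ensemble size and subspace size, and the plain averaging of member predictions are all choices made here for the example.

```python
import numpy as np


def fit_rs_mlr(X, y, n_models=50, subspace_size=5, seed=0):
    """Fit an ensemble of MLR models, each on a random subset of variables.

    Assumption: every member sees all calibration samples and a random
    subset of `subspace_size` columns, drawn without replacement.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_vars = X.shape
    members = []
    for _ in range(n_models):
        idx = rng.choice(n_vars, size=subspace_size, replace=False)
        # Design matrix with an intercept column followed by the subspace.
        A = np.column_stack([np.ones(n_samples), X[:, idx]])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        members.append((idx, coef))
    return members


def predict_rs_mlr(members, X):
    """Average the predictions of all member models (assumed combination rule)."""
    preds = [coef[0] + X[:, idx] @ coef[1:] for idx, coef in members]
    return np.mean(preds, axis=0)
```

Keeping `subspace_size` well below the number of calibration samples is what lets plain MLR be used as the base learner on spectra with many more wavelengths than samples.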

7.
Near-infrared (NIR) spectroscopy, in combination with chemometrics, enables the analysis of raw materials without time-consuming sample preparation. The aim of our work was to estimate critical parameters in the analytical specification of oxytetracycline and, consequently, to develop a method for the quantification and qualification of these parameters by NIR spectroscopy. Karl Fischer (K.F.) titration to determine the water content, a colorimetric assay method, and Fourier transform-infrared (FT-IR) spectroscopy to identify the oxytetracycline base were used as the respective reference methods. Multivariate calibration was performed on NIR spectral data using principal component analysis (PCA), partial least-squares (PLS 1) and principal component regression (PCR) chemometric methods. Multivariate calibration models for NIR spectroscopy have been developed. Using PCA and the Soft Independent Modelling of Class Analogy (SIMCA) approach, we established the cluster model for the determination of sample identity. PLS 1 and PCR regression methods were applied to develop the calibration models for the determination of water content and the assay of the oxytetracycline base. Comparing the PLS and PCR regression methods, we found that PLS is better suited to NIR data, especially as the spectroscopic data (NIR spectra) are highly collinear and contain many non-selective wavelengths. The calibration models for NIR spectroscopy are convenient alternatives to the colorimetric method and to the K.F. method, as well as to FT-IR spectroscopy, in the routine control of incoming material.

8.
Different calibration techniques are available for spectroscopic applications that show nonlinear behavior. This comparative study examines different nonlinear calibration techniques: kernel PLS (KPLS), support vector machines (SVM), least-squares SVM (LS-SVM), relevance vector machines (RVM), Gaussian process regression (GPR), artificial neural network (ANN), and Bayesian ANN (BANN). In this comparison, partial least squares (PLS) regression is used as a linear benchmark, while the relationship of the methods is considered in terms of traditional calibration by ridge regression (RR). The performance of the different methods is demonstrated by their practical applications using three real-life near infrared (NIR) data sets. Different aspects of the various approaches are discussed, including computational time, model interpretability, potential over-fitting when non-linear models are used on linear problems, robustness to small or medium sample sets, and robustness to pre-processing. The results suggest that GPR and BANN are powerful and promising methods for handling linear as well as nonlinear systems, even when the data sets are moderately small. The LS-SVM is also attractive due to its good predictive performance for both linear and nonlinear calibrations.

9.
In this paper, we report the combination of a near-infrared (NIR) spectroscopic method with multivariate analysis in order to develop a calibration model of the saccharification ratio of chemically pretreated Erianthus. The regression models clearly depend on the NIR spectral regions, and information from CH and aromatic framework vibrations contributed most effectively to the alkaline dataset. From interpretations of the regression coefficient, lignin and cellulose were negatively and positively correlated with the saccharification ratio, respectively, and this result was supported by the data from wet chemical analysis. A more complex dataset was obtained from varied chemical pretreatments; here, the saccharification ratio was either small or had no linear correlation with each structural monocomponent. These results enabled the successful construction of the PLS regression model. NIR spectroscopy can be a rapid screening method for the saccharification ratio and, furthermore, can provide information on the key factors influencing the realization of more efficient enzymatic accessibility.

10.
Multivariate calibration problems often involve the identification of a meaningful subset of variables from a vast number of variables for better prediction of output variables. A new graph theoretic method based on partial correlations (variable interaction network, VIN) is proposed. Many well-studied representative calibration datasets spanning different application domains are selected for investigating the performance. Partial least squares (PLS) regression models combined with variable selection techniques are employed for benchmarking the performance. Subsets containing different numbers of variables are retained for the final analysis after VIN selection, and progressive prediction accuracies are used for comparison. VIN-PLS results show significant improvement in prediction efficiencies and variable subset optimization. Improvement of up to 45% over existing methods with significantly fewer variables is achieved using the new method. Advantages of VIN-based variable selection are highlighted.

11.
Bayesian latent variable regression (BLVR) aims to utilize all available information for empirical modeling via a Bayesian framework. Such information includes prior knowledge about the underlying variables, model parameters and measurement error distributions. This paper improves upon the existing optimization‐based BLVR (BLVR‐OPT) method [1] by developing a sampling‐based Bayesian latent variable regression (BLVR‐S) method that relies on Gibbs sampling. The sampling‐based framework not only provides point estimates but, because it generates samples that represent the posterior distribution of the unknown variables, also readily provides error bounds. Features and advantages of this method are demonstrated via examples based on simulated data and real Near‐Infrared (NIR) spectroscopy data. Practical aspects of Bayesian modeling, such as determining when the extra computation may be worth the effort, are addressed by an empirical study of the effects of the amount of training data and signal to noise ratio (SNR). The benefits of BLVR seem to be most significant when the number of measurements is limited and when noise in output variables is relatively large. Copyright © 2007 John Wiley & Sons, Ltd.

12.
The successive projections algorithm (SPA) is widely used to select variables for multiple linear regression (MLR) modeling. However, SPA used only once may not obtain all the useful information of the full spectra, because the number of selected variables cannot exceed the number of calibration samples in the SPA algorithm. Therefore, the SPA-MLR method risks the loss of useful information. To make full use of the useful information in the spectra, a new method named “consensus SPA-MLR” (C-SPA-MLR) is proposed herein. This method combines the consensus strategy with the SPA-MLR method. In the C-SPA-MLR method, SPA-MLR is used to construct member models with different subsets of variables, which are selected from the remaining variables iteratively. A consensus prediction is obtained by combining the predictions of the member models. The proposed method is evaluated by analyzing the near infrared (NIR) spectra of corn and diesel. The C-SPA-MLR method showed better prediction performance than the SPA-MLR and full-spectra PLS methods. Moreover, these results could serve as a reference for combining the consensus strategy with other variable selection methods when analyzing NIR spectra and other spectroscopic data.
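
The projection step of SPA, and the idea of drawing successive subsets from the remaining variables, can be sketched as follows. This is a simplified illustration under assumptions: the starting column is fixed rather than optimized by validation, the columns are assumed to be preprocessed by the caller, and `consensus_subsets` (a hypothetical helper name) only produces the variable subsets, leaving the fitting and combination of the member MLR models aside.

```python
import numpy as np


def spa_select(X, n_select, start=0):
    """Select up to n_select minimally collinear columns of X by successive projections."""
    Xp = np.array(X, dtype=float)            # working copy that gets projected
    if n_select > min(Xp.shape):
        raise ValueError("cannot select more variables than samples or columns")
    selected = [start]
    for _ in range(n_select - 1):
        v = Xp[:, selected[-1]].copy()
        denom = v @ v
        if denom == 0.0:                      # nothing left to project on
            break
        # Project every column onto the orthogonal complement of v.
        Xp = Xp - np.outer(v, (v @ Xp) / denom)
        norms = np.linalg.norm(Xp, axis=0)
        norms[selected] = -1.0                # never pick a column twice
        selected.append(int(np.argmax(norms)))
    return selected


def consensus_subsets(X, n_select, n_members):
    """Run SPA repeatedly on the variables not yet chosen (assumed scheme)."""
    remaining = list(range(X.shape[1]))
    subsets = []
    for _ in range(n_members):
        if len(remaining) < n_select:
            break
        local = spa_select(X[:, remaining], n_select)
        chosen = [remaining[i] for i in local]
        subsets.append(chosen)
        remaining = [j for j in remaining if j not in chosen]
    return subsets
```

Each returned subset would then feed one member MLR model, so the ensemble as a whole can use more variables than a single SPA run is able to select.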

13.
A new variable selection algorithm is described, based on ant colony optimization (ACO). The aim of the algorithm is to choose, from a large number of available spectral wavelengths, those relevant to the estimation of analyte concentrations or sample properties when spectroscopic analysis is combined with multivariate calibration techniques such as partial least-squares (PLS) regression. The new algorithm employs the concept of cooperative pheromone accumulation, which is typical of ACO selection methods, and optimizes PLS models using a pre-defined number of variables, employing a Monte Carlo approach to discard irrelevant sensors. The performance has been tested on a simulated system, where it shows a significant superiority over other commonly employed selection methods, such as genetic algorithms. Several near infrared spectroscopic experimental data sets have been subjected to the present ACO algorithm, with PLS leading to improved analytical figures of merit upon wavelength selection. The method could be helpful in other chemometric activities such as classification or quantitative structure-activity relationship (QSAR) problems.

14.
In the past decade, there has been an increase in the use of sparse multivariate calibration methods in chemometrics. Sparsity describes a parsimonious state of model complexity and can be defined in terms of a subset of samples or covariates (e.g., wavelengths) that are used to define the calibration model. With respect to their classical counterparts such as principal component regression or partial least squares, sparse models are more easily interpretable and have been shown to exhibit non‐inferior prediction performance. However, sparse methods are still not as fast as the classical methods in spite of recent numerical advances. In addition, for many chemometricians, sparse methods are still “black‐box” algorithms whose internal workings are not well understood. In this paper, we describe a simple framework whereby classical multivariate calibration methods can be iteratively used to generate sparse models. Moreover, this approach allows for either wavelength or sample sparsity. We demonstrate the effectiveness of this approach on two spectroscopic data sets. Copyright © 2013 John Wiley & Sons, Ltd.

15.
Consensus methods have provided promising tools for improving the reliability of quantitative models in near-infrared (NIR) spectroscopic analysis. A strategy for improving the performance of consensus methods in multivariate calibration of NIR spectra is proposed. In the approach, a subset of non-collinear variables is generated using the successive projections algorithm (SPA) for each variable in the spectra reduced by uninformative variables elimination (UVE). Then sub-models are built using the variable subsets and the calibration subsets determined by Monte Carlo (MC) re-sampling, and the sub-model that produces minimal error in cross validation is selected as a member model. With repetition of the MC re-sampling, a series of member models are built and a consensus model is achieved by averaging all the member models. Since member models are built with the best variable subset and the randomly selected calibration subset, both the quality and the diversity of the member models are ensured for the consensus model. Two NIR spectral datasets of tobacco lamina are used to investigate the proposed method. The superiority of the method in both accuracy and reliability is demonstrated.

16.
The calibration performance of partial least squares regression for one response (PLS1) can be improved by eliminating uninformative variables. Many variable-reduction methods are based on so-called predictor-variable properties or predictive properties, which are functions of various PLS-model parameters, and which may change during the steps of the variable-reduction process. Recently, a new predictive-property-ranked variable reduction method with final complexity adapted models, denoted as PPRVR-FCAM or simply FCAM, was introduced. It is a backward variable elimination method applied on the predictive-property-ranked variables. The variable number is first reduced, with constant PLS1 model complexity A, until A variables remain, followed by a further decrease in PLS complexity, allowing the final selection of small numbers of variables.

17.
With projection-based calibration approaches, such as partial least squares (PLS) and principal component regression (PCR), the calibration space is spanned by respective basis vectors (latent vectors). Up to rank k basis vectors are formed, where k ≤ min(m,n) with m and n denoting the number of calibration samples and measured variables. The user needs to decide how many and which respective basis vectors to use (tuning parameters). To avoid the second issue, basis vectors are selected top‐down, starting with the first and sequentially adding more until model criteria are satisfied. Ridge regression (RR) avoids both issues by using the full set of basis vectors. Another approach is to select a subset from the total available. The presented work develops a process based on the L1 vector norm to select basis vectors. Specifically, the L1 norm is used to select singular value decomposition (SVD) basis set vectors for PCR (LPCR). Because PCR, PLS, RR, and others can be expressed as linear combinations of the SVD basis vectors, the focus is on selection and comparison using the SVD basis set. Results based on respective tuning parameter selections and weights applied to the SVD basis vectors for LPCR, top‐down PCR, correlation PCR (CPCR), PLS, and RR are compared for calibration and calibration updating using spectroscopic data sets. The methods are found to predict equivalently. In particular, the L1 norm produces similar results to those obtained by the well‐studied CPCR process. Thus, the new method provides a different theoretical framework than CPCR for selecting basis vectors. Copyright © 2016 John Wiley & Sons, Ltd.
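
The statement that PCR and RR can both be written as weighted combinations of the SVD basis vectors can be made concrete with a short NumPy sketch. The function names are hypothetical and the code assumes X has full column rank (no zero singular values); the L1-based selection of the abstract is not reproduced here, but any 0/1 weight vector would represent a selected subset of basis vectors in the same way.

```python
import numpy as np


def svd_basis_coefficients(U, s, Vt, y, weights):
    """Regression vector b = sum_i w_i * (u_i^T y / s_i) * v_i."""
    w = np.asarray(weights, dtype=float)
    return Vt.T @ (w * (U.T @ y) / s)


def pcr_weights(s, k):
    """Top-down PCR: weight 1 for the first k basis vectors, 0 for the rest."""
    w = np.zeros_like(s)
    w[:k] = 1.0
    return w


def ridge_weights(s, lam):
    """Ridge regression: every basis vector is kept, shrunk by s^2 / (s^2 + lam)."""
    return s ** 2 / (s ** 2 + lam)


# Usage on synthetic data (assumed example, not from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))
y = rng.normal(size=8)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
b_pcr = svd_basis_coefficients(U, s, Vt, y, pcr_weights(s, 3))
b_rr = svd_basis_coefficients(U, s, Vt, y, ridge_weights(s, 0.7))
```

Viewed this way, the methods differ only in the weight vector, which is why the abstract can compare them on a common SVD basis set.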

18.
This paper reports the results of a rapid method to determine sucrose in chocolate mass using near infrared spectroscopy (NIRS). We applied a broad-based calibration approach, which consists of putting samples of various types of chocolate mass together in one single calibration. This approach increases the concentration range for one or more compositional parameters, improves the model performance and requires just one calibration model for several recipes. The data were modelled using partial least squares (PLS) and multiple linear regression (MLR). The MLR models were developed using a variable selection based on the regression coefficients of PLS and a genetic algorithm (GA). High correlation coefficients (0.998, 0.997, 0.998 for PLS, MLR and GA-MLR, respectively) and low prediction errors confirm the good predictability of the models. The results show that NIR can be used as a rapid method to determine sucrose in chocolate mass in chocolate factories.

19.
Linear and non-linear calibration methods (principal component regression (PCR), partial least squares regression (PLS), and neural networks (NN)) were applied to a slightly non-linear Raman data set. Because of the large size of this data set, recently introduced linear calibration methods, specifically optimised for speed, were also used. These fast methods achieve speed improvement by using the Lanczos decomposition for the singular value decomposition steps of the calibration procedures and, for some of their variants, by optimising the models without cross-validation (CV). Linear methods could deal with the slight non-linearity present in the data by including extra components and therefore performed comparably to NNs. The fast methods performed as well as their classical equivalents in terms of precision in prediction, but the results were obtained considerably faster. It appeared, however, that CV remains the most appropriate method for model complexity estimation.

20.
We have studied rapid calibration models to predict the composition of a variety of biomass feedstocks by correlating near-infrared (NIR) spectroscopic data to compositional data produced using traditional wet chemical analysis techniques. The rapid calibration models are developed using multivariate statistical analysis of the spectroscopic and wet chemical data. This work discusses the latest versions of the NIR calibration models for corn stover feedstock and dilute-acid pretreated corn stover. Measures of the calibration precision and uncertainty are presented. No statistically significant differences (p = 0.05) are seen between NIR calibration models built using different mathematical pretreatments. Finally, two common algorithms for building NIR calibration models are compared; no statistically significant differences (p = 0.05) are seen for the major constituents glucan, xylan, and lignin, but the algorithms did produce different predictions for total extractives. A single calibration model combining the corn stover feedstock and dilute-acid pretreated corn stover samples gave less satisfactory predictions than the separate models.
