Similar literature
1.
Most multivariate calibration methods, such as partial least squares (PLS) or the Tikhonov regularization variant ridge regression (RR), require selection of tuning parameters. Tuning parameter values determine the direction and magnitude of the respective model vectors, thereby setting the resultant prediction abilities of the model vectors. Simultaneously, tuning parameter values establish the corresponding bias/variance and the underlying selectivity/sensitivity tradeoffs. Selection of the final tuning parameter is often accomplished through some form of cross-validation in which the resultant root mean square error of cross-validation (RMSECV) values are evaluated. However, selection of a “good” tuning parameter with this single model evaluation merit is almost impossible. Including additional model merits assists tuning parameter selection, providing better balanced models and allowing a reasonable comparison between calibration methods. Using multiple merits requires decisions on how to combine and weight the merits into an information criterion, and an abundance of options is possible. Presented in this paper is the sum of ranking differences (SRD) to ensemble a collection of model evaluation merits varying across tuning parameters. It is shown that the SRD consensus ranking of model tuning parameters allows automatic selection of the final model, or a collection of models if so desired. Essentially, the user’s preference for the degree of balance between bias and variance ultimately decides the merits used in SRD and, hence, the tuning parameter values ranked lowest by SRD for automatic selection. The SRD process is also shown to allow simultaneous comparison of different calibration methods for a particular data set in conjunction with tuning parameter selection. Because SRD evaluates consistency across multiple merits, decisions on how to combine and weight merits are avoided.
To demonstrate the utility of SRD, a near infrared spectral data set and a quantitative structure activity relationship (QSAR) data set are evaluated using PLS and RR.
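The ranking step described in this abstract can be illustrated with a minimal sketch. It assumes the usual SRD layout (rows are merits, columns are candidate models or tuning parameter values), that the merits have already been scaled to a common range, and that the reference ranking comes from the row means; the abstract does not specify any of these choices, and the function names `ranks` and `srd` are hypothetical.

```python
def ranks(values):
    """Return 1-based ascending ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result


def srd(matrix, reference):
    """Sum of ranking differences for each column of `matrix`.

    matrix[i][j] is merit i evaluated for model j (assumed pre-scaled);
    reference[i] is the reference value of merit i (an assumption here).
    """
    reference_ranks = ranks(reference)
    scores = []
    for j in range(len(matrix[0])):
        column_ranks = ranks([row[j] for row in matrix])
        scores.append(sum(abs(a - b) for a, b in zip(column_ranks, reference_ranks)))
    return scores


# Hypothetical scaled merits for three candidate models.
merits = [
    [1.0, 2.0, 3.0],
    [2.0, 1.0, 1.0],
    [3.0, 3.0, 2.0],
]
row_means = [sum(row) / len(row) for row in merits]
scores = srd(merits, row_means)
best_model = min(range(len(scores)), key=scores.__getitem__)
```

The model with the smallest SRD value is the one whose ranking of the merits agrees most closely with the reference ranking, which is how a lowest-ranked tuning parameter value could be picked automatically.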

2.
Modeling quantitative structure–activity relationships (QSAR) is considered with an emphasis on prediction. An abundance of methods is available to develop such models. Using a harmonious approach that balances the bias and variance of predictions, the best calibration models are identified relative to the bias and variance criteria used. Criteria utilized to determine the adequacy of models are the root mean square error of calibration (RMSEC) and validation (RMSEV), the respective $R^2$ values, and the norm of the regression vector. QSAR data from the literature are used to demonstrate concepts. For these data sets and the criteria used, it is suggested that models obtained by ridge regression (RR) are more harmonious and parsimonious than models obtained by partial least squares (PLS) and principal component regression (PCR) when the data are mean-centered. The most harmonious RR models have the best bias/variance tradeoff, reflected by the smallest RMSEC, RMSEV, and regression vector norms and the largest calibration and validation $R^2$ values. The most parsimonious RR models have the smallest effective rank.

3.
Diagnostics are fundamental to multivariate calibration (MC). Two common diagnostics are leverages and spectral F‐ratios, which have been formulated for many MC methods such as partial least squares (PLS), principal component regression (PCR) and classical least squares (CLS). While these are some of the most common calibration methods in analytical chemistry, ridge regression is also commonplace, and yet spectral F‐ratios have not been developed for it. Noting that ridge regression is a form of Tikhonov regularization (TR) and using the unifying filter factor representation for MC, this paper develops the filter factor form of leverages and spectral F‐ratios. The approach is applied to a spectral data set to demonstrate computational speed‐up advantages and ease of implementation for the filter factor representation. Copyright © 2010 John Wiley & Sons, Ltd.
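For orientation, the filter factor representation mentioned in this abstract is commonly written as below. This is a sketch under stated assumptions: $X = U \Sigma V^{T}$ is the singular value decomposition of the calibration matrix with singular values $\sigma_i$, the TR penalty is taken as $\lambda^2 \lVert b \rVert^2$, and the symbols $f_i$, $k$ and $h_j$ are notation chosen here. The paper's own expressions, in particular for spectral F‐ratios, are not given in the abstract and may differ.

```latex
% Regression vector in filter factor form
\hat{b} = \sum_{i} f_i \, \frac{u_i^{T} y}{\sigma_i} \, v_i

% Filter factors for ridge regression / Tikhonov regularization and for PCR with k components
f_i^{\mathrm{RR}} = \frac{\sigma_i^{2}}{\sigma_i^{2} + \lambda^{2}},
\qquad
f_i^{\mathrm{PCR}} =
\begin{cases}
1, & i \le k \\
0, & i > k
\end{cases}

% Fitted values and the leverage of calibration sample j
\hat{y} = U F U^{T} y, \quad F = \operatorname{diag}(f_i),
\qquad
h_j = \sum_{i} f_i \, u_{ji}^{2}
```

With a single decomposition of $X$, changing the method or the tuning parameter only changes the $f_i$, which is consistent with the computational speed-up the abstract reports.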

4.
Partial least-squares (PLS) calibration models have been generated from a series of near-infrared (near-IR) and Raman spectra acquired separately from sixty different mixed solutions of glucose, lactate, and urea in aqueous phosphate buffer. Independent PLS models were prepared and compared for glucose, lactate, and urea. Near-IR and Raman spectral features differed substantially for these solutes, with Raman spectra enabling greater distinction with less spectral overlap than features in the near-IR spectra. Despite this, PLS models derived from near-IR spectra outperformed those from Raman spectra. Standard errors of prediction were 0.24, 0.11, and 0.14 mmol L−1 for glucose, lactate, and urea, respectively, from near-IR spectra and 0.40, 0.42, and 0.36 mmol L−1 for glucose, lactate, and urea, respectively, from Raman spectra. Differences between instrumental signal-to-noise ratios were responsible for the better performance of the near-IR models. The chemical basis of model selectivity was examined for each model by using a pure component selectivity analysis combined with analysis of the net analyte signal for each solute. This selectivity analysis showed that models based on either near-IR or Raman spectra had excellent selectivity for the targeted analyte. The net analyte signal analysis also revealed that analytical sensitivity was higher for the models generated from near-IR spectra. This is consistent with the lower standard errors of prediction.

5.
《Analytical letters》2012,45(6):1227-1251
Abstract

In order to reduce data nonlinearity and overfitting with the multivariate calibration model y=Xb, a modified Tikhonov regularization (TR) algorithm is evaluated for selecting key variables from an X augmented with extra columns that contain the original measured variables ($x_{ij}$) as squared terms ($x_{ij}^2$) and other orders. The TR approach simultaneously develops the multivariate calibration model. The new generalized pair‐correlation method (GPCM) is also studied for variable selection followed by partial least squares (PLS) for multivariate calibration. Results from synthetic spectral data are compared when using the modified TR approach, GPCM, and PLS without variable selection. The GPCM usually performs slightly better than the TR approach for the tabulated bias and variance measures, in some cases at a sacrifice to parsimony. PLS without variable selection performs the worst. Using synthetic spectral data sets made it possible to study how the methods work. Thus, results from this study will aid investigators of real spectral data sets exhibiting nonlinear behavior.

6.
Interlaboratory comparisons are fundamental to providing measurements with traceability. In the simplest possible scenario, a single traveling standard of a quantity is measured at various laboratories. A more complex scenario arises when the laboratories measure a large set of standard values pertaining to a given physical quantity or when the traveling standard is not a realization of the quantity of interest but a measuring instrument. In the last case, it might be convenient to globally compare the calibration curves provided by the laboratories. We introduce a distance between two generic analytical curves based on the least power ($L_p$) norm of their difference. The properties of this distance are presented, with particular attention to its dependence on the parameter p.
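A distance of this kind can be sketched numerically as follows. This is an illustration only: it assumes the plain definition d_p(f, g) = (∫ |f(x) − g(x)|^p dx)^(1/p) over a shared measurement range [a, b], evaluated with a midpoint rule. The abstract does not say whether the authors normalize by the interval length or weight the integrand, and the name `lp_distance` and the two example curves are hypothetical.

```python
def lp_distance(f, g, a, b, p=2.0, n=1000):
    """Approximate the L_p distance between curves f and g on [a, b].

    Uses the midpoint rule with n equal sub-intervals (an illustrative choice).
    """
    if p < 1:
        raise ValueError("p must be at least 1 for a norm")
    step = (b - a) / n
    total = 0.0
    for k in range(n):
        x = a + (k + 0.5) * step
        total += abs(f(x) - g(x)) ** p
    return (total * step) ** (1.0 / p)


# Two hypothetical calibration curves from different laboratories.
def lab_a(x):
    return 1.00 * x + 0.10


def lab_b(x):
    return 1.00 * x


# How the distance depends on p for a constant offset between the curves.
distances = {p: lp_distance(lab_a, lab_b, 0.0, 1.0, p=p) for p in (1.0, 2.0, 4.0)}
```

For a constant offset on a unit-length interval the distance is the same for every p; for curves that differ only locally, larger p weights the largest deviations more heavily, which is the kind of dependence on p the abstract refers to.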

7.
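Based on the idea of multi-model (model fusion) modeling, two new stacked multivariate calibration algorithms were developed: stacked PCR (PLS) multivariate calibration and stacked moving-window PCR (PLS) multivariate calibration. Unlike conventional multi-model approaches, they stack the sub-calibration models by assigning different weights to different parts of the spectral data, so that variables can be adjusted or selected through the weights. This eliminates the redundant information commonly found in spectral data while avoiding the loss of information, and ultimately improves model robustness and simplifies the model. Although the specific steps of the two new algorithms differ, they gave similar prediction results. The superiority of the two new methods is verified by calculations on two near-infrared spectral data sets from the literature.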

9.
A new calibration method was developed and applied to inductively coupled plasma atomic emission spectrometry. External calibration was performed as follows. A tank was filled with a given volume ($V_p$) of deionized water. Then a concentrated standard was introduced into the tank at a controlled rate ($Q_e$) by means of a peristaltic pump. The resulting solution was stirred throughout the experiment. Simultaneously, the solution inside the tank was pumped to the plasma at a given rate ($Q_s$). The signal was continuously recorded. The variation with time of the concentration of the solution leaving the tank was determined by applying a basic equation of stirred tanks. Plotting the emission intensity versus time and then converting the time scale into a concentration scale gave the calibration line. The best results in terms of linearity were achieved for $V_p$ = 15 cm³, $Q_e$ = 0.6–0.75 ml min−1 and $Q_s$ = 1–1.2 ml min−1. Graphs with more than 40 standards were obtained within about 10 min. The results found were not statistically different from those afforded by the conventional calibration method. In addition, the new method was faster and supplied better linearity and precision than the conventional one. Another advantage of the stirred tank was that procedures such as dynamic calibration and standard additions could be easily and quickly applied, thus shortening the analysis time. A complete analysis following these procedures based on the measurement of 30 standards took about 5 min.
Several synthetic as well as certified samples (i.e., bovine liver, mussel tissue and powdered milk) were analyzed with the stirred tank by applying four different calibration methodologies (i.e., external calibration, internal calibration, standard additions and a combination of internal standardization and standard additions), with the combination of internal standardization and standard additions being the method that provided the best results. The element concentrations obtained were not significantly different from the actual or certified values.
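The conversion from time to concentration can be sketched with an ideal stirred-tank mass balance. This is an assumption, since the abstract only refers to "a basic equation of stirred tanks" without stating it. For a perfectly mixed tank that initially holds pure water, V(t) = V_p + (Q_e − Q_s)·t and V·dC/dt = Q_e·(C_std − C), which has the closed-form solution used below. The function name `tank_concentration` and the parameter names are hypothetical: `v0` stands for V_p, `q_in` for Q_e and `q_out` for Q_s.

```python
import math


def tank_concentration(t, c_std, v0, q_in, q_out):
    """Concentration leaving an ideally mixed tank at time t.

    Assumes the tank starts with pure solvent of volume v0 (cm3), the standard
    of concentration c_std enters at q_in (ml/min) and the mixture leaves at
    q_out (ml/min). Ideal mixing is an assumption of this sketch.
    """
    if q_in == q_out:
        # Constant volume: classic first-order approach to c_std.
        return c_std * (1.0 - math.exp(-q_in * t / v0))
    volume = v0 + (q_in - q_out) * t
    if volume <= 0:
        raise ValueError("tank would be empty at this time")
    return c_std * (1.0 - (volume / v0) ** (-q_in / (q_in - q_out)))


# Conditions within the ranges reported in the abstract, with a hypothetical
# standard concentration of 100 (arbitrary units), sampled every 15 s for 10 min.
times = [0.25 * k for k in range(41)]
curve = [tank_concentration(t, 100.0, 15.0, 0.6, 1.0) for t in times]
```

Each recorded time point then maps to one concentration, so a continuous emission signal yields a calibration graph with many standards from a single run.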

10.
Two spectrophotometric methods for the determination of ethinylestradiol (ETE) and levonorgestrel (LEV) using the multivariate calibration techniques of partial least squares (PLS) and principal component regression (PCR) are presented. In this study, PLS and PCR are successfully applied to quantify both hormones using the information contained in the absorption spectra of appropriate solutions. To do this, a calibration set of standard samples composed of different mixtures of both compounds was designed. The results found by applying the PLS and PCR methods to the simultaneous determination of mixtures containing 4–11 μg ml−1 of ETE and 2–23 μg ml−1 of LEV are reported. Five different oral contraceptives were analyzed, and the results were very similar to those obtained by a reference liquid chromatographic method.

11.
New multivariate calibration methods and other processes are being developed that require selection of multiple tuning parameter (penalty) values to form the final model. With one or more tuning parameters, using only one measure of model quality to select final tuning parameter values is not sufficient. Optimization of several model quality measures is challenging. Thus, three fusion ranking methods are investigated for simultaneous assessment of multiple measures of model quality for selecting tuning parameter values. One is a supervised learning fusion rule named sum of ranking differences (SRD). The other two are non-supervised learning processes based on the sum and median operations. The effect of the number of models evaluated on the three fusion rules is also evaluated using three procedures. One procedure uses all models from all possible combinations of the tuning parameters. To reduce the number of models evaluated, an iterative process (only applicable to SRD) is applied, and thresholding a model quality measure before applying the fusion rules is also used. A near infrared pharmaceutical data set requiring model updating is used to evaluate the three fusion rules. In this case, calibration of the primary conditions is for the active pharmaceutical ingredient (API) of tablets produced in a laboratory. The secondary conditions for calibration updating are for tablets produced in the full batch setting. Two model updating processes requiring selection of two unique tuning parameter values are studied. One is based on Tikhonov regularization (TR) and the other is a variation of partial least squares (PLS). The three fusion methods are shown to provide equivalent and acceptable results, allowing automatic selection of the tuning parameter values. Best tuning parameter values are selected when the model quality measures used with the fusion rules are for the small secondary sample set used to form the updated models.
In this model updating situation, evaluation of all possible models, thresholding, and iterative SRD performed equivalently for the three fusion rules with TR, whereas PLS performed worse. While the application is model updating, the fusion processes are applicable to other situations requiring selection of multiple tuning parameter values.

12.
A novel alternative for the simultaneous determination of compounds with similar structures is described, using whole chemiluminescence-time profiles, acquired by the stopped-flow technique, in combination with multivariate calibration. The proposed method is based on the chemiluminescent oxidation of morphine and naloxone by potassium permanganate in an acidic medium, using formaldehyde as co-factor. The whole chemiluminescence-time profiles, acquired using the stopped-flow technique in a continuous-flow system, allowed the time-resolved chemiluminescence (CL) data to be used with multivariate calibration techniques, such as partial least squares (PLS), for the quantitative determination of both opiate narcotics in binary mixtures. To ensure additivity of the CL profiles and to obtain CL profiles for the two drugs that are as well separated in time as possible, the optimum chemical conditions for the CL emission were investigated. The effect of common emission enhancers on the CL emission obtained in the oxidation reaction of these compounds in different acidic media was studied. The conditions selected were sulphuric acid 1.0 mol L−1, permanganate 0.2 mmol L−1 and formaldehyde 0.8 mol L−1. A calibration set of standard samples was designed by combining a factorial design, with three levels for each factor, and a central composite design. Finally, to validate the proposed chemometric method, a prediction set of binary samples was prepared. Using the proposed multivariate calibration method, the analytes were determined in synthetic samples, with recoveries of 97-109%.

13.
The study demonstrates an application of the front-face fluorescence spectroscopy combined with multivariate regression methods to the analysis of fluorescent beer components. Partial least-squares regressions (PLS1, PLS2, and N-way PLS) were utilized to develop calibration models between synchronous fluorescence spectra and excitation-emission matrices of beers, on one hand, and analytical concentrations of riboflavin and aromatic amino acids, on the other hand. The best results were obtained in the analysis of excitation-emission matrices using the N-way PLS2 method. The respective correlation coefficients, and the values of the root mean-square error of cross-validation (RMSECV), expressed as percentages of the respective mean analytic concentrations, were: 0.963 and 14% for riboflavin, 0.974 and 4% for tryptophan, 0.980 and 4% for tyrosine, and 0.982 and 19% for phenylalanine.

14.
We present a novel algorithm for linear multivariate calibration that can generate good prediction results. It rests on the idea that test samples can be regarded as mixtures of the calibration samples in proper proportions. Because the algorithm is based on this mixed model of samples, it is called the MMS algorithm. With both theoretical support and analysis of two data sets, it is demonstrated that the MMS algorithm produces lower prediction errors than the partial least squares PLS2 model and has prediction performance similar to that of PLS1. In the background anti-interference test, the MMS algorithm performs better than PLS2. When information on some components is lacking, the MMS algorithm shows better robustness than PLS2.

15.
Ni Xin  Qinghua Meng  Yizhen Li  Yuzhu Hu 《中国化学》2011,29(11):2533-2540
This paper demonstrates the possibility of using near infrared (NIR) spectral similarity as a rapid method to estimate the quality of Flos Lonicerae. Variable selection together with modelling techniques is utilized to select representative variables that are used to calculate the similarity. NIR is used to build calibration models to predict the bacteriostatic activity of Flos Lonicerae. The bacteriostatic activity is determined by an in vitro experiment. Models are built for Gram‐positive and for Gram‐negative bacteria. A genetic algorithm combined with partial least squares regression (GA‐PLS) is used to perform the calibration. The results of the GA‐PLS models are compared to interval partial least squares (iPLS) models, full‐spectrum PLS and full‐spectrum principal component regression (PCR) models. The variables in the two GA‐PLS models are then combined and used to calculate the NIR spectral similarity of samples. The similarity based on the characteristic variables and that based on the full spectrum are each used for evaluating the fingerprints of Flos Lonicerae. The results show that the combination of variable selection, modelling techniques and similarity analysis might be a powerful tool for quality control of traditional Chinese medicine (TCM).

16.
In this work, multivariate calibration models based on mid- and near-infrared spectroscopy were developed in order to determine the content of biodiesel in diesel fuel blends, considering the presence of raw vegetable oil. Soybean, castor and used frying oils and their corresponding esters were used to prepare the blends with conventional diesel. Results indicated that partial least squares (PLS) models based on MID or NIR spectra proved suitable as practical analytical methods for predicting biodiesel content in conventional diesel blends in the volume fraction range from 0% to 5%. The PLS models were validated with an independent prediction set, and the RMSEPs were estimated as 0.25 and 0.18 (%, v/v). Linear correlations were observed in plots of predicted vs. observed values, with correlation coefficients (R) of 0.986 and 0.994 for the MID and NIR models, respectively. Additionally, principal component analysis (PCA) in the MID region from 1700 to 1800 cm−1 was suitable for identifying raw vegetable oil contamination and illegal blends of petrodiesel containing raw vegetable oil instead of ester.

17.
A method for calibration and validation subset partitioning   Total citations: 13, self-citations: 0, citations by others: 13
This paper proposes a new method to divide a pool of samples into calibration and validation subsets for multivariate modelling. The proposed method is of value for analytical applications involving complex matrices, in which the composition variability of real samples cannot be easily reproduced by optimized experimental designs. A stepwise procedure is employed to select samples according to their differences in both x (instrumental responses) and y (predicted parameter) spaces. The proposed technique is illustrated in a case study involving the prediction of three quality parameters (specific mass and distillation temperatures at which 10 and 90% of the sample has evaporated) of diesel by NIR spectrometry and PLS modelling. For comparison, PLS models are also constructed by full cross-validation, as well as by using the Kennard-Stone and random sampling methods for calibration and validation subset partitioning. The obtained models are compared in terms of prediction performance by employing an independent set of samples not used for calibration or validation. The results of F-tests at 95% confidence level reveal that the proposed technique may be an advantageous alternative to the other three strategies.
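A stepwise selection of this kind can be sketched as a Kennard-Stone-style max-min procedure applied to a joint x/y distance. The abstract does not give the distance formula. The sketch below assumes that the joint distance is the sum of the x-space and y-space Euclidean distances, each divided by its maximum over the data set, and that selection starts from the two most distant samples. The function names `pairwise` and `select_calibration` are hypothetical.

```python
import math


def pairwise(points):
    """Matrix of Euclidean distances between all pairs of points."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def select_calibration(x, y, n_select):
    """Pick n_select (>= 2) sample indices spread out in both x and y space."""
    dx = pairwise(x)
    dy = pairwise([[value] for value in y])
    max_dx = max(max(row) for row in dx) or 1.0
    max_dy = max(max(row) for row in dy) or 1.0
    n = len(x)
    # Joint distance: each space contributes on a normalized 0..1 scale (assumed form).
    d = [[dx[i][j] / max_dx + dy[i][j] / max_dy for j in range(n)] for i in range(n)]
    # Start with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: d[pair[0]][pair[1]],
    )
    selected = [first, second]
    while len(selected) < n_select:
        remaining = [i for i in range(n) if i not in selected]
        # Add the sample whose nearest already-selected neighbour is farthest away.
        selected.append(max(remaining, key=lambda i: min(d[i][s] for s in selected)))
    return selected


# Hypothetical one-variable "spectra" and reference values.
spectra = [[0.0], [1.0], [2.0], [10.0]]
reference_values = [0.0, 1.0, 2.0, 10.0]
calibration_indices = select_calibration(spectra, reference_values, 3)
validation_indices = [i for i in range(len(spectra)) if i not in calibration_indices]
```

Dropping the y term from the joint distance reduces the sketch to plain Kennard-Stone selection, one of the comparison methods named in the abstract.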

18.
19.
A differential spectrophotometric method has been developed for the simultaneous quantitative determination of glucose (GLU), fructose (FRU) and lactose (LAC) in food samples. It relies on the different kinetic rates of the analytes in their oxidative reaction with potassium ferricyanide (K3Fe(CN)6) as the oxidant. The reaction data were recorded at the analytical wavelength (420 nm) of the K3Fe(CN)6 spectrum. Since the kinetic runs of glucose, fructose and lactose overlap seriously, the condition number was calculated for the data matrix to assist with the optimisation of the experimental conditions. Values of 80 °C and 1.5 mol l−1 were selected for the temperature and the concentration of sodium hydroxide (NaOH), respectively. Linear calibration graphs were obtained in the concentration ranges of 2.96-66.7, 3.21-67.1 and 4.66-101 mg l−1 for glucose, fructose and lactose, respectively. Synthetic mixtures of the three reducing sugars were analysed, and the data obtained were processed by chemometric methods, namely partial least squares (PLS), principal component regression (PCR), classical least squares (CLS), back propagation-artificial neural network (BP-ANN) and radial basis function-artificial neural network (RBF-ANN), using the normal and the first-derivative kinetic data. The results show that calibrations based on first-derivative data have advantages for the prediction of the analytes and that RBF-ANN gives the lowest prediction errors of the five chemometric methods. Following validation, the proposed method was applied to the determination of the three reducing sugars in several commercial food samples, and the standard addition method yielded satisfactory recoveries in all instances.

20.
《Analytical letters》2012,45(9):1967-1977
Abstract

Organophosphorus pesticides, such as parathion methyl (PTM), fenitrothion (FT), parathion (PT), and isocarbophos (ICP), have sensitive but overlapping voltammetric peaks, with peak potentials of −309, −364, −317, and −480 mV, respectively, in Britton‐Robinson buffer of pH 4.8 when measured by linear sweep stripping voltammetry (LSSV). In this work, two multivariate calibration methods, partial least squares (both PLS‐1 and PLS‐2) and principal component regression (PCR), were applied to quantitatively resolve the overlapping voltammograms of mixtures of these four pesticides. The prediction results obtained from a set of independent test samples showed that the PLS‐1 method had better prediction ability than the PLS‐2 and PCR methods. The proposed method was successfully applied to the determination of these four pesticides in grain samples after a pre‐extraction step with acetone as solvent.
