Similar literature
 20 similar articles found (search time: 843 ms)
1.
Optimized sample-weighted partial least squares   Cited by: 2 (self-citations: 0; citations by others: 2)
Lu Xu 《Talanta》2007,71(2):561-566
In ordinary multivariate calibration methods, once the calibration set has been chosen to build the model relating the dependent variables to the predictor variables, each sample in the calibration set contributes equally to the model, and differences in representativeness between samples are ignored. In this paper, by introducing the concept of weighted sampling into partial least squares (PLS), a new multivariate regression method, optimized sample-weighted PLS (OSWPLS), is proposed. OSWPLS differs from PLS in that it builds a new calibration set in which each sample of the original calibration set is weighted differently to account for its representativeness, with the aim of improving the prediction ability of the algorithm. A recently suggested global optimization algorithm, particle swarm optimization (PSO), is used to search for the best sample weights to optimize the calibration of the original training set and the prediction of an independent validation set. The proposed method is applied to two real data sets and compared with PLS. The most significant improvement is obtained for the meat data, where the root mean squared error of prediction (RMSEP) is reduced from 3.03 to 2.35. For the fuel data, OSWPLS performs slightly better than, or no worse than, PLS in predicting the four analytes. The stability and efficiency of OSWPLS are also studied; the results demonstrate that the proposed method can obtain desirable results within a moderate number of PSO cycles.
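
The idea of a sample-weighted calibration set can be illustrated with a minimal PLS1 (NIPALS) sketch. This is not the OSWPLS algorithm itself: the abstract does not state how the weights enter the model, so the weighting scheme below (weighted centering, then scaling each row by the square root of its weight) is an assumption, and the weights are simply supplied by the caller rather than searched by PSO. The function names `fit_weighted_pls1` and `predict_pls1` are hypothetical.

```python
from math import sqrt


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fit_weighted_pls1(X, y, weights, n_components):
    """Fit PLS1 (NIPALS) on a calibration set whose rows carry sample weights.

    Assumption: weights enter through weighted centering and a sqrt(weight)
    row scaling; the published method may weight samples differently.
    """
    n, p = len(X), len(X[0])
    total = sum(weights)
    x_mean = [sum(w * row[j] for w, row in zip(weights, X)) / total
              for j in range(p)]
    y_mean = sum(w * v for w, v in zip(weights, y)) / total
    # Weighted, centered copies of X and y.
    E = [[sqrt(w) * (row[j] - x_mean[j]) for j in range(p)]
         for w, row in zip(weights, X)]
    f = [sqrt(w) * (v - y_mean) for w, v in zip(weights, y)]
    components = []  # (weight vector, X-loading vector, y-loading) per component
    for _ in range(n_components):
        w_vec = [_dot([E[i][j] for i in range(n)], f) for j in range(p)]
        norm = sqrt(_dot(w_vec, w_vec))
        if norm < 1e-12:  # nothing left in y that X can explain
            break
        w_vec = [v / norm for v in w_vec]
        t = [_dot(row, w_vec) for row in E]  # scores
        tt = _dot(t, t)
        p_vec = [_dot([E[i][j] for i in range(n)], t) / tt for j in range(p)]
        q = _dot(f, t) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * p_vec[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w_vec, p_vec, q))
    return x_mean, y_mean, components


def predict_pls1(model, x):
    """Predict y for one new sample by repeating the deflation steps."""
    x_mean, y_mean, components = model
    r = [a - b for a, b in zip(x, x_mean)]
    y_hat = y_mean
    for w_vec, p_vec, q in components:
        t = _dot(r, w_vec)
        y_hat += q * t
        r = [a - t * b for a, b in zip(r, p_vec)]
    return y_hat
```

In a weight search such as the one described above, a function like this would be called once per candidate weight vector and scored on the validation set; setting all weights to 1 gives ordinary PLS1.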

2.
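Using ordinary maize kernels as the test material, characteristic wavelengths were first selected from near-infrared spectral data by a genetic algorithm combined with partial least squares (PLS) regression, and PLS regression was then used to build a calibration model for determining the starch content of maize kernels from the selected wavelengths. The results showed that the calibration model built on 11 characteristic wavelengths gave a calibration error (RMSEC), cross-validation error (RMSECV) and prediction error (RMSEP) of 0.30%, 0.35% and 0.27%, respectively, and that the correlation coefficients between predicted and measured values reached 0.9279 for the calibration set and 0.9390 for the independent test set. Compared with the prediction model built on the full spectrum, prediction accuracy improved in every respect, indicating that spectral feature selection with a genetic algorithm and PLS yields a simpler and better model and provides a new method and approach for the near-infrared determination of starch content in maize kernels and for the processing of infrared spectral data.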
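
RMSEC, RMSECV and RMSEP are usually computed with the same root-mean-square formula and differ only in which predictions are compared with the reference values (calibration fit, cross-validation, or an independent test set). The sketch below assumes the plain definition with division by the number of samples; some texts apply a degrees-of-freedom correction to RMSEC, which is not shown. The function names are illustrative.

```python
from math import sqrt


def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    n = len(reference)
    return sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)


def pearson_r(reference, predicted):
    """Pearson correlation coefficient between reference and predicted values."""
    n = len(reference)
    mr = sum(reference) / n
    mp = sum(predicted) / n
    cov = sum((r - mr) * (p - mp) for r, p in zip(reference, predicted))
    var_r = sum((r - mr) ** 2 for r in reference)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / sqrt(var_r * var_p)
```

Feeding `rmse` the predictions for the calibration samples gives RMSEC, the cross-validated predictions give RMSECV, and the predictions for an independent set give RMSEP.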

3.
Based on a so-called ensemble strategy, an algorithm is proposed for near-infrared (NIR) spectral calibration of complex beverage samples. This algorithm is a combination of a novel training set/test set sample-selection procedure based on a Kohonen self-organizing map (SOM) with a simple procedure to calculate an average partial least-squares (PLS) calibration model, which is therefore named SOMEPLS. In order to verify the proposed SOMEPLS, two NIR beverage datasets involving the determination of sugar content are considered, and three reference algorithms, i.e., conventional PLS (CPLS), the Kennard-Stone (KS) algorithm in combination with PLS (KSPLS), and sample set partitioning based on the joint x-y distance (SPXY) algorithm in combination with PLS (SPXYPLS), are used. Of these, both KS and SPXY are well-known representative sample-selection algorithms. By comparison, it was found that when there is a training set of appropriate size, SOMEPLS can achieve better prediction accuracy than the three reference algorithms, but without increasing the complexity of the corresponding calibration model for future application, indicating that SOMEPLS can serve as a promising tool for NIR spectral calibration.

4.
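Determination of available phosphorus in soil by Fourier transform infrared photoacoustic spectroscopy   Cited by: 3 (self-citations: 0; citations by others: 3)
Du Changwen, Zhou Jianmin 《分析化学》 (Chinese Journal of Analytical Chemistry) 2007,35(1):119-122
Soil samples (68 samples) from the long-term field experiment plots of the Fengqiu Ecological Experimental Station, Chinese Academy of Sciences, were used to determine soil available phosphorus by Fourier transform infrared photoacoustic spectroscopy. With Olsen-P as the dependent variable, partial least squares (PLS) and artificial neural network (ANN) models were built from the Fourier transform infrared photoacoustic spectra and used for prediction. The results showed that the PLS model had a correlation coefficient (R2) of 0.96, a calibration standard deviation of 1.79 mg/kg and a validation standard deviation of 5.25 mg/kg; the ANN model had a calibration coefficient of 0.84, a calibration standard deviation of 2.40 mg/kg and a validation standard deviation of 5.43 mg/kg. Both models can be used to predict soil available phosphorus, and the PLS model outperforms the ANN model. The method requires no sample pretreatment and is non-destructive, providing a new means for the rapid determination of soil available phosphorus.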

5.
In order to increase the predictive ability of the PLS (Partial Least Squares) model, we have developed a new algorithm by which uninformative samples, which contribute little to the model, are eliminated from a calibration data set. In the proposed algorithm, uninformative wavelength (or independent) variables are eliminated at the first stage by using the modified UVE (Uninformative Variable Elimination)-PLS method that we reported previously. Then, if the prediction error of the ith (1 ≤ i ≤ n) sample is larger than 3σ, the corresponding sample is eliminated as uninformative, where n is the total number of calibration samples and σ is the standard deviation calculated from the other n − 1 samples. Calculating σ in a leave-one-out manner enhances the ability to identify the uninformative samples. The final PLS model is constructed precisely, because both uninformative wavelength variables and uninformative samples have been eliminated. In order to demonstrate the usefulness of the algorithm, we have applied it to two kinds of mid-infrared spectral data sets.
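
The leave-one-out 3σ rule for samples can be sketched as follows. The sketch assumes that the prediction errors are already available as a list, that σ is the sample standard deviation of the other n − 1 errors, and that the absolute error is compared with the threshold; the abstract does not spell out these details, and the function name is hypothetical.

```python
from statistics import stdev


def uninformative_samples(errors, k=3.0):
    """Return indices of samples whose |prediction error| exceeds k * sigma,
    where sigma is computed from the errors of all the other samples."""
    flagged = []
    for i, e in enumerate(errors):
        others = errors[:i] + errors[i + 1:]  # leave sample i out
        sigma = stdev(others)
        if abs(e) > k * sigma:
            flagged.append(i)
    return flagged
```

Because the sample under test is excluded from σ, a single large error cannot inflate its own threshold, which is the stated reason for the leave-one-out calculation.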

6.
The application of a hand scanner to the multivariate quantification of povidone-iodine (PVI), a popular antiseptic agent, in some pharmaceutical products is presented. Brightness, contrast, and mixed gamma were the adjustable scanner parameters. For selection of the optimum values of the scanner parameters, partial least squares (PLS) and multiple linear regression (MLR), coupled with a genetic algorithm (GA), were performed. For the selected variables, the MLR and PLS performances were similar and appropriate. From the results obtained, it was concluded that the simpler MLR method could be successfully applied instead of PLS, which requires more statistical experience. The concentration range considered for PVI in the calibration and prediction samples was 0.0-10.0% (w/v). For the analysis of pharmaceutical samples, the generalized standard addition method (GSAM) was applied (on the variables selected by GA) and desirable results were obtained. A relative standard error (RSE) of less than 8% was obtained for the majority of samples analyzed.

7.
Glycerol monolaurate (GML) products contain many impurities, such as lauric acid and glycerol. The GML content is an important quality indicator for GML production. A hybrid variable selection algorithm, which combines wavelet transform (WT) technology with a modified uninformative variable elimination (MUVE) method, was proposed to extract useful information from Fourier transform infrared (FT-IR) transmission spectroscopy for the determination of GML content. The FT-IR spectral data were first compressed by WT; the irrelevant variables in the compressed wavelet coefficients were then eliminated by MUVE. In the MUVE process, a simulated annealing (SA) algorithm was employed to search for the optimal cutoff threshold. After the WT-MUVE process, the variables for the calibration model were reduced from 7366 to 163. Finally, the retained variables were employed as inputs of a partial least squares (PLS) model to build the calibration model. For the prediction set, a correlation coefficient (r) of 0.9910 and a root mean square error of prediction (RMSEP) of 4.8617 were obtained. The prediction result was better than that of the PLS model with full-spectrum data. This indicated that the proposed WT-MUVE method could not only make the prediction more accurate, but also make the calibration model more parsimonious. Furthermore, the reconstructed spectra represented the projection of the selected wavelet coefficients into the original domain, affording a chemical interpretation of the predicted results. It is concluded that the FT-IR transmission spectroscopy technique with the proposed method is promising for the fast detection of GML content.

8.
Two-dimensional correlation spectroscopy (2DCOS) and near-infrared spectroscopy (NIRS) were used to determine the polyphenol content in oat grain. A partial least squares (PLS) algorithm was used to perform the calibration. A total of 116 representative oat samples from four locations in China were prepared and the corresponding near-infrared spectra were measured. Two-dimensional correlation spectroscopy was employed to select wavelength bands for the PLS regression model for the polyphenol determination. The number of PLS components and intervals was optimized according to the coefficient of determination (R2) and the root mean square error of cross validation (RMSECV) in the calibration set. The performance of the final model was evaluated using the correlation coefficient (R) and the root mean square error of validation (RMSEV) in the prediction set. The results showed that the band corresponding to the optimal calibration model was between 1350 and 1848 nm and that the optimal spectral preprocessing combination was second derivative with second smoothing. The optimal regression model was obtained with an R2 of 0.8954 and an RMSECV of 0.06651 in the calibration set and an R of 0.9614 and an RMSEV of 0.04573 in the prediction set. These measurements reveal that the calibration model had qualified predictive accuracy. The results demonstrated that 2DCOS with PLS was a simple and rapid method for the quantitative determination of polyphenols in oats.

9.
Coscione AR  de A  Poppi RJ 《The Analyst》2002,127(1):135-139
Real samples were used for the PLS model calibration and validation steps, showing that this approach can be of value in preventing deviations in the results caused by matrix effects in the simultaneous spectrophotometric determination of aluminum and iron in plant extracts. One hundred UV-vis spectra, obtained from samples of the 1997 to 2000 International Plant-Analytical Exchange (IPE) program (The Netherlands), were used for model development, with ICP-AES aluminum and iron determinations as reference values for model calculation. The plant extracts were analyzed both by ICP-AES and by the PLS models developed in this work, using calibrations with both aqueous standard solutions and real sample extracts. In addition, since the use of smaller calibration sets could be of value in reducing both the cost and the time of analysis, sets with fewer calibration samples were also investigated, with the help of the Kennard and Stone algorithm for sample selection. The predictability of the best model obtained with each calibration set was compared using the ratio of their relative root mean square errors (%RMSEV) for samples in the validation set, for aluminum or iron determinations, and these ratios were compared against F-test tabulated values. For all the models developed with real samples, the differences in the %RMSEV values for the aluminum or iron determinations were found not to be statistically significant at a confidence level of 95%. Although it was observed that the aluminum, but not the iron, determinations with the PLS 2 model prepared with aqueous standards tend to be slightly lower than the ICP-AES determinations, this model has a good global prediction ability, as observed through the correlation curves presented, and can be used for screening determinations or for other agricultural purposes.
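
The Kennard and Stone selection mentioned above is commonly described as follows: start with the two samples that are farthest apart, then repeatedly add the sample whose distance to its nearest already-selected sample is largest. The sketch below follows that common description with Euclidean distance on the raw variables; whether the cited work used raw spectra, scores, or another distance is not stated, so those choices are assumptions.

```python
from math import dist


def kennard_stone(samples, n_select):
    """Return the indices of n_select samples chosen by Kennard-Stone."""
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")
    # Seed with the pair of samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: dist(samples[ij[0]], samples[ij[1]]),
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    while len(selected) < n_select:
        # Pick the sample farthest from its nearest selected neighbour.
        nxt = max(
            remaining,
            key=lambda i: min(dist(samples[i], samples[s]) for s in selected),
        )
        selected.append(nxt)
        remaining.remove(nxt)
    return selected
```

The order of the returned indices reflects the order of selection, so truncating the list gives the smaller calibration sets directly.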

10.
Urea biosensors based on urease immobilized by crosslinking with BSA and glutaraldehyde coupled to ammonium ion-selective electrodes were included in arrays together with potassium, sodium and ammonium PVC membrane ion-selective electrodes. Multivariate calibration models based on PCR and PLS2 were built and tested for the simultaneous determination of urea and potassium. The results show that it is possible to obtain PCR and PLS2 calibration models for simultaneous determination of these two species, based on a very small set of calibration samples (nine samples). Coupling of biosensors with ion-selective electrodes in arrays of sensors raises a few problems related to the limited stability of response and unidirectional cross-talk of the biosensors, and this matter was also investigated in this work. Up to three identical urea biosensors were included in the arrays, and the data analysis procedure allowed the assessment of the relative performance of the sensors. The results show that at least two urea biosensors should be included in the array to improve urea determination. The prediction errors of the concentration of urea and potassium in the blood serum samples analyzed with this array and a PLS2 calibration model, based on nine calibration samples, were lower than 10 and 5%, respectively.

11.
《Analytical letters》2012,45(9):1967-1977
Organophosphorus pesticides, such as parathion methyl (PTM), fenitrothion (FT), parathion (PT), and isocarbophos (ICP), have sensitive but overlapping voltammetric peaks with peak potentials of −309, −364, −317, and −480 mV, respectively, in Britton-Robinson buffer of pH 4.8 when linear sweep stripping voltammetry (LSSV) is applied. In this work, two multivariate calibration methods, partial least squares (both PLS-1 and PLS-2) and principal component regression (PCR), were applied to quantitatively resolve the overlapping voltammograms of mixtures of these four pesticides. The prediction results obtained from a set of independent test samples showed that the PLS-1 method had better prediction ability than the PLS-2 and PCR methods. The proposed method was successfully applied to the determination of these four pesticides in grain samples after a pre-extraction step with acetone as the solvent.

12.
This study compares the performance of partial least squares (PLS) regression analysis and artificial neural networks (ANN) for the prediction of total anthocyanin concentration in red-grape homogenates from their visible-near-infrared (Vis-NIR) spectra. The PLS prediction of anthocyanin concentrations for new-season samples from Vis-NIR spectra was characterised by regression non-linearity and prediction bias. In practice, this usually requires the inclusion of some samples from the new vintage to improve the prediction. The use of WinISI LOCAL partly alleviated these problems but still resulted in increased error at high and low extremes of the anthocyanin concentration range. Artificial neural networks regression was investigated as an alternative method to PLS, due to the inherent advantages of ANN for modelling non-linear systems. The method proposed here combines the advantages of the data reduction capabilities of PLS regression with the non-linear modelling capabilities of ANN. With the use of PLS scores as inputs for ANN regression, the model was shown to be quicker and easier to train than using raw full-spectrum data. The ANN calibration for prediction of new vintage grape data, using PLS scores as inputs, was more linear and accurate than global and LOCAL PLS models and appears to reduce the need for refreshing the calibration with new-season samples. ANN with PLS scores required fewer inputs and was less prone to overfitting than using PCA scores. A variation of the ANN method, using carefully selected spectral frequencies as inputs, resulted in prediction accuracy comparable to those using PLS scores but, as for PCA inputs, was also prone to overfitting with redundant wavelengths.

13.
This paper reports the results of a rapid method to determine sucrose in chocolate mass using near infrared spectroscopy (NIRS). We applied a broad-based calibration approach, which consists of combining samples of various types of chocolate mass in one single calibration. This approach increases the concentration range for one or more compositional parameters, improves the model performance and requires just one calibration model for several recipes. The data were modelled using partial least squares (PLS) and multiple linear regression (MLR). The MLR models were developed using a variable selection based on the regression coefficients of PLS and on a genetic algorithm (GA). High correlation coefficients (0.998, 0.997, 0.998 for PLS, MLR and GA-MLR, respectively) and low prediction errors confirm the good predictability of the models. The results show that NIR can be used as a rapid method to determine sucrose in chocolate mass in chocolate factories.

14.
Chung H  Cho S  Toyoda Y  Nakano K  Maeda M 《The Analyst》2006,131(5):684-691
A new quantitative calibration algorithm, called "Moment Combined Partial Least Squares (MC-PLS)", which combines the moment of a spectrum with conventional PLS, was proposed. Its calibration performance was evaluated for the analyses of three important petroleum and petrochemical products: gasoline, naphtha and polyol samples. The selected properties for these products included the research octane number (RON) and Reid vapor pressure (RVP) for gasoline, the distillation temperature at 10% (D 10%) for naphtha and the hydroxyl (OH) number for polyol. The major concept presented here used the moment to find the closest spectrum of a sample in a given dataset, and to generate the difference spectrum and the corresponding difference in the property. These difference spectra and property differences were then used for PLS calibration. The moment has been employed in spectroscopic fields as a simple and effective "spectral feature characteristic" using just a few scalar values (moments). MC-PLS showed improved prediction performance over PLS in each case. In MC-PLS, the difference spectra generated using the moments were used as explained; therefore, additional detail in spectral variations can be utilized for calibrations. Additionally, the difference in the property was employed as reference data, so that its variation range was smaller than that of the original property. Consequently, the MC-PLS performance could be better since the feature-enhanced spectra were used to model a narrower range of property variations. In the case of the D 10% prediction for naphtha, a non-linear prediction pattern that occurred in conventional PLS was effectively corrected using the MC-PLS method.
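
The pairing step described above (use moments to find the closest spectrum, then form the difference spectrum and the property difference) can be sketched as below. The abstract does not define the moments, so this sketch assumes the first moment (intensity-weighted centroid) and the second central moment of each spectrum, and Euclidean distance between the moment pairs; both are illustrative assumptions, as are the function names. In practice the moments would likely need scaling before distances are compared.

```python
from math import dist


def spectral_moments(axis, spectrum):
    """First moment (centroid) and second central moment of a spectrum,
    treating the intensities as weights over the spectral axis."""
    area = sum(spectrum)
    centroid = sum(x * a for x, a in zip(axis, spectrum)) / area
    spread = sum((x - centroid) ** 2 * a for x, a in zip(axis, spectrum)) / area
    return (centroid, spread)


def difference_pairs(axis, spectra, properties):
    """For each sample, find the closest other sample in moment space and
    return (index, neighbour index, difference spectrum, property difference)."""
    moments = [spectral_moments(axis, s) for s in spectra]
    pairs = []
    for i, m in enumerate(moments):
        j = min(
            (k for k in range(len(spectra)) if k != i),
            key=lambda k: dist(m, moments[k]),
        )
        diff_spectrum = [a - b for a, b in zip(spectra[i], spectra[j])]
        pairs.append((i, j, diff_spectrum, properties[i] - properties[j]))
    return pairs
```

The difference spectra and property differences returned here would then serve as the X and y of an ordinary PLS calibration.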

15.
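Trace elements in petroleum coke play a decisive role in its performance as a prebaked anode. First, PLS calibration models for the quantitative analysis of iron (Fe) and copper (Cu) in petroleum coke were built from LIBS spectra. Then, the effects of different spectral preprocessing methods (normalization, multiplicative scatter correction, standard normal variate transformation, first derivative and second derivative) and variable selection algorithms (particle swarm optimization and variable importance in projection) on the predictive performance of the PLS calibration models were examined. A method for the quantitative analysis of trace elements in petroleum coke based on laser-induced breakdown spectroscopy (LIBS) combined with partial least squares (PLS) was thus established. The results showed that, compared with the other PLS calibration models, the PLS model based on the second derivative and variable importance in projection gave the best prediction performance for Fe, with an R-squared of cross-validation (R2cv) of 0.9667, a root mean squared error of cross-validation (RMSEcv) of 10.2821 mg/kg, and an R-squared of prediction (R2p) of 0.86...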
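
Two of the preprocessing steps listed above have simple textbook forms, sketched here under stated assumptions: the standard normal variate (SNV) transform centers and scales each spectrum by its own mean and sample standard deviation, and the derivative is approximated by a plain finite difference between adjacent points. Published work often uses smoothed (for example Savitzky-Golay) derivatives instead; which variant the cited study used is not stated.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own
    mean and sample standard deviation."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(v - m) / s for v in spectrum]


def first_derivative(spectrum):
    """Finite-difference first derivative (one point shorter than the input)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def second_derivative(spectrum):
    """Second derivative obtained by differencing twice."""
    return first_derivative(first_derivative(spectrum))
```

Each function works on a single spectrum, so a data set is preprocessed by applying it row by row before the PLS model is fitted.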

16.
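Shao Xueguang, Chen Da, Xu Heng, Liu Zhichao, Cai Wensheng 《中国化学》 (Chinese Journal of Chemistry) 2009,27(7):1328-1332
Partial least squares (PLS) plays an important role in quantitative near-infrared (NIR) spectral analysis, but its predictions are easily affected by factors such as sample partitioning and outlying samples, so its robustness is limited. Ensemble PLS (EPLS) improves model robustness, but it cannot identify outliers among the samples. To improve both the prediction accuracy and the robustness of the model, this paper proposes an ensemble PLS method that resamples according to sampling probabilities, termed robust consensus PLS (RE-PLS). The method obtains a sampling probability for each calibration sample from the regression residuals computed by iteratively reweighted PLS (IRPLS), selects training subsets according to these probabilities to build multiple PLS models, and takes the average of the predictions of all PLS models as the final prediction. The method was applied to NIR spectral modeling of two different kinds of plant samples and compared with conventional PLS and EPLS. The results show that the method effectively avoids the influence of outliers in the calibration set on the model while improving prediction accuracy and robustness. For real tobacco samples with complex NIR spectra and many outliers, modeling with simple PLS or EPLS does not predict well, whereas RE-PLS, with its particular advantages, is expected to find wide application in the quantitative analysis of such complex spectra.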
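
The resampling scheme described above (residuals give sampling probabilities, subsets drawn with those probabilities train several models, predictions are averaged) can be sketched as below. Several parts are assumptions rather than the published procedure: the residuals are taken as given instead of being computed by IRPLS, the probabilities come from a bisquare weight on residuals scaled by the median absolute deviation, subsets are drawn with replacement, and `fit` is a hypothetical callable that returns a predictor (in practice a PLS fit).

```python
import random
from statistics import median


def sampling_probabilities(residuals, c=4.685):
    """Turn regression residuals into sampling probabilities using a
    bisquare weight (assumed here); large residuals get probability 0."""
    med = median(residuals)
    mad = median([abs(r - med) for r in residuals]) or 1e-12
    scale = 1.4826 * mad  # robust estimate of the residual spread
    weights = []
    for r in residuals:
        u = (r - med) / (c * scale)
        weights.append((1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0)
    total = sum(weights)
    return [w / total for w in weights]


def ensemble_predict(X, y, probs, fit, x_new, n_models=50, subset_size=None, seed=0):
    """Average the predictions of models trained on probability-weighted subsets."""
    rng = random.Random(seed)
    k = subset_size or max(1, len(X) // 2)
    predictions = []
    for _ in range(n_models):
        chosen = rng.choices(range(len(X)), weights=probs, k=k)
        model = fit([X[i] for i in chosen], [y[i] for i in chosen])
        predictions.append(model(x_new))
    return sum(predictions) / n_models
```

Samples with zero probability never enter a training subset, which is how outlying calibration samples are kept out of the member models in this sketch.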

17.
We developed a method for determination of ascorbic acid in pharmaceutical preparations containing various excipients by using near infrared diffuse reflectance spectroscopy and two different calibration methods, viz. stepwise multiple linear regression (SMLR) and partial least-squares (PLS) regression, which provided comparable results and resulted in prediction errors of 1-2%. However, the PLS method provided somewhat better results with the more complex samples.

18.
Piecewise direct standardization (PDS) is applied to multivariate standardization of fluorescence signals using partial least squares (PLS) and principal component regression (PCR) as the calibration models. The multivariate standardization was used to transfer spectra obtained after a solid phase extraction (SPE) step to spectra registered in pure solvent in the determination of carbendazim, fuberidazole and thiabendazole in water samples. The influential parameters, such as the tolerance, the window size and the number of samples in the standardization subset, were optimized by means of the root mean squared error of prediction (RMSEP). Similar RMSEP values were obtained by PLS and PCR using the optimized influential parameters in the standardization. However, better predictions of the compounds in the test set were obtained by the PLS model.

19.
Sample selection is often used to improve the cost-effectiveness of near-infrared (NIR) spectral analysis. When raw NIR spectra are used, however, it is not easy to select appropriate samples, because of background interference and noise. In this paper, a novel adaptive strategy based on selection of representative NIR spectra in the continuous wavelet transform (CWT) domain is described. After pretreatment with the CWT, an extension of the Kennard–Stone (EKS) algorithm was used to adaptively select the most representative NIR spectra, which were then submitted to expensive chemical measurement and multivariate calibration. With the samples selected, a PLS model was finally built for prediction. It is of great interest to find that selection of representative samples in the CWT domain, rather than raw spectra, not only effectively eliminates background interference and noise but also further reduces the number of samples required for a good calibration, resulting in a high-quality regression model that is similar to the model obtained by use of all the samples. The results indicate that the proposed method can effectively enhance the cost-effectiveness of NIR spectral analysis. The strategy proposed here can also be applied to different analytical data for multivariate calibration.

20.
This work describes a novel experimental design aimed at building a calibration set made up of samples containing different numbers of components. The algorithm performs an iterative process to keep the number of samples as low as possible and to ensure a homogeneous presence of all the concentration levels. The mixture design was applied to a drug system composed of one to four components in different combinations. The resolution of the system was performed by three multivariate UV spectrophotometric methods utilizing principal component regression (PCR) and partial least squares (PLS1 and PLS2) algorithms. The calibration set was composed of 61 references on four concentration levels, including 15 samples for each of the quaternary, ternary and binary compositions and 16 one-component samples. The calibration models were optimized through a careful selection of the number of factors and the wavelength zones, in such a way as to remove interferences from instrumental noise and from excipients present in the pharmaceutical formulations. The prediction power of the regression models was verified and compared by analysis of an external prediction set. The models were finally used to assay pharmaceutical specialities containing the studied drugs in one-to-four-component formulations.
