Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
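Application of an ensemble partial least squares regression method in quantitative near-infrared spectral analysis   Cited by: 3 (self-citations: 1; cited by others: 3)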
成忠  诸爱士  陈德钊 《分析化学》2007,35(7):978-982
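Near-infrared spectral data show pronounced local effects, contain many variables that are often severely multicollinear, and are frequently related nonlinearly to the component contents of the samples. To address this, an ensemble nonlinear partial least squares regression (E-S-QPLSR) method is constructed. It uses sampling without replacement (subagging) to generate several subsamples from the training set. A submodel is built on each subsample by quadratic polynomial partial least squares regression (QPLSR) and used to predict the dependent variable of the training samples. These predictions are then passed to a linear PLS algorithm, which computes the combination weights of the submodels. The method was applied to model the quantitative relationship between the water content of 80 corn samples and their near-infrared spectra. It performed well and showed strong learning ability, and the prediction performance of the resulting model was better than that of other methods.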

2.
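Methyl parathion and isocarbophos were selected as the study objects. The traditional QuEChERS pretreatment procedure was modified and, with a home-made gold nanoparticle sol as the enhancing substrate, surface-enhanced Raman spectroscopy (SERS) was used to detect pesticide residues in tea infusions. Qualitative analysis was carried out by comparing the characteristic Raman peaks of the two organophosphorus pesticides. In addition, spectral data of the characteristic peaks near the Raman shifts 570, 1034, 1107 and 1202 cm^-1 were selected and processed with mathematical treatments such as differentiation, and regression equations were established by partial least squares regression (PLSR) to predict the pesticide residue content of the samples. The predicted values were compared with values measured by gas chromatography-mass spectrometry (GC-MS) to verify the feasibility and reliability of the method. The results show that the SERS-based detection limit for both organophosphorus pesticides reaches 0.05 mg/L. The regression equations had linear correlation coefficients of 0.9077 to 0.9824 and root mean square errors of prediction (RMSEP) of 0.77% to 2.68%. The values predicted by the regression equations were close to the GC-MS results, with relative errors of -5.16% to 9.03% and recoveries of 81.4% to 115.1%, indicating that SERS can be used for qualitative and preliminary quantitative analysis of organophosphorus pesticide residues in tea infusions.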

3.
In this research, UV–Vis spectroscopy combined with chemometrics is used to quantify the level of corn adulteration in peaberry ground roasted coffee. Peaberry coffee from two bean processing methods (wet and dry) was intentionally adulterated with corn at levels of 10–50%. UV–Vis spectral data were obtained for aqueous samples in the range of 250–400 nm at 1 nm intervals. Three multivariate regression methods, partial least squares regression (PLSR), multiple linear regression (MLR), and principal component regression (PCR), were used to predict the level of corn adulteration. The results show that the individual regression models built on wet or dry samples alone perform better than the global regression models built on combined wet and dry samples. The best calibration model for the individual wet, individual dry, and combined samples was obtained with PLSR, with a coefficient of determination in the range of 0.83–0.93 and RMSE below 6% (w/w) for calibration and validation. However, the prediction error, in terms of RMSEP and bias, increased sharply when an individual regression model was used to predict the level of corn adulteration in samples from the other bean processing method. The results demonstrate that the global PLSR model is better at predicting the level of corn adulteration. The prediction error of this global model is acceptable, with low RMSEP and bias for both individual and combined prediction samples. The obtained RPDp and RERp in prediction for the global PLSR model are more than two and five for individual and combined samples, respectively. The proposed method using UV–Vis spectroscopy with a global PLSR model can be applied to quantify the level of corn adulteration in peaberry ground roasted coffee with different bean processing methods.
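
The abstract reports RPDp and RERp without defining them. The following is a minimal sketch, assuming the definitions commonly used in chemometrics: RPD is the standard deviation of the reference values divided by RMSEP, and RER is the range of the reference values divided by RMSEP. The function names and the sample numbers are illustrative and are not taken from the paper.

```python
import math
import statistics


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a prediction set."""
    squared = [(p - t) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def rpd(y_true, y_pred):
    """Ratio of performance to deviation (assumed: sample SD / RMSEP)."""
    return statistics.stdev(y_true) / rmsep(y_true, y_pred)


def rer(y_true, y_pred):
    """Range error ratio (assumed: range of reference values / RMSEP)."""
    return (max(y_true) - min(y_true)) / rmsep(y_true, y_pred)


if __name__ == "__main__":
    # Hypothetical adulteration levels in % (w/w) and model predictions.
    reference = [10, 20, 30, 40, 50]
    predicted = [15, 15, 35, 35, 55]
    print(rmsep(reference, predicted), rpd(reference, predicted), rer(reference, predicted))
```

Some authors compute RPD with the population rather than the sample standard deviation, so values from different papers are only comparable when the convention is stated.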

4.
The paper describes linear and nonlinear modeling of wastewater data for the performance evaluation of a wastewater treatment plant (WWTP) based on an up-flow anaerobic sludge blanket (UASB) reactor. Partial least squares regression (PLSR), multivariate polynomial regression (MPR) and artificial neural networks (ANNs) were applied to predict the levels of biochemical oxygen demand (BOD) and chemical oxygen demand (COD) in the UASB reactor effluents using four input variables measured weekly in the influent wastewater during the peak (morning and evening) and non-peak (noon) hours over a period of 48 weeks. The performance of the models was assessed through the root mean squared error (RMSE), relative error of prediction in percentage (REP), the bias, the standard error of prediction (SEP), the coefficient of determination (R2), the Nash-Sutcliffe coefficient of efficiency (Ef), and the accuracy factor (Af), computed from the measured and model-predicted values of the dependent variables (BOD, COD) in the WWTP effluents. Goodness of fit was also evaluated through the relationship between the residuals and the model-predicted values of BOD and COD. Although the BOD and COD values predicted by all three modeling approaches (PLSR, MPR, ANNs) were in good agreement with the respective measured values in the WWTP effluents, the nonlinear models (MPR, ANNs) performed relatively better than the linear ones. These models can be used as a tool for the performance evaluation of WWTPs.
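
Several of the performance measures listed in this abstract can be written down compactly. The sketch below assumes textbook definitions, since the abstract does not give formulas: bias as the mean of predicted minus measured, SEP as the bias-corrected standard deviation of the errors with n - 1 in the denominator, REP as RMSE relative to the mean measured value, and the Nash-Sutcliffe efficiency as one minus the ratio of the error sum of squares to the total sum of squares. R2 and Af are left out because their conventions vary more between papers. The function name and the numbers are illustrative.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, bias, SEP, REP (%) and Nash-Sutcliffe efficiency.

    All definitions are assumed conventions, not taken from the paper.
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]  # predicted minus measured
    mean_obs = sum(y_true) / n

    sse = sum(e * e for e in errors)                   # error sum of squares
    rmse = math.sqrt(sse / n)
    bias = sum(errors) / n
    sep = math.sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))
    rep = 100.0 * rmse / mean_obs                      # RMSE as % of mean measured value
    sst = sum((t - mean_obs) ** 2 for t in y_true)     # total sum of squares
    nse = 1.0 - sse / sst                              # 1.0 means a perfect fit

    return {"rmse": rmse, "bias": bias, "sep": sep, "rep": rep, "nse": nse}


if __name__ == "__main__":
    # Hypothetical measured and predicted effluent values.
    measured = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.1, 1.9, 3.2, 3.8]
    for name, value in regression_metrics(measured, predicted).items():
        print(f"{name}: {value:.4f}")
```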

5.
Raman spectroscopy has been evaluated for characterisation of the degree of fatty acid unsaturation (iodine value) of salmon (Salmo salar). The Norwegian Quality Cuts from 50 salmon samples were obtained, and the samples provided an iodine value range of 147.8-170.0 g I2/100 g fat, reflecting a normal variation of farmed salmon. Raman measurements were performed on different spots of the intact salmon muscle, on ground salmon samples, and on oil extracts, and partial least squares regression (PLSR) was utilised for calibration. The oil spectra provided better iodine value predictions than the other data sets, and a correlation coefficient of 0.87 with a root mean square error of cross-validation of 2.5 g I2/100 g fat was achieved using only one PLSR component. The ground samples provided comparable results, but at least two PLSR components were needed. Higher prediction errors were obtained from Raman spectra of intact salmon muscle, and this may partly be explained by sampling uncertainties in the relation between Raman measurements and reference analysis. All PLSR models obtained were based on chemically sound regression coefficients, and thus information regarding fatty acid unsaturation is readily available from Raman spectra even in systems with high contents of protein and water. The accuracy, the robustness and the low complexity of the PLSR models obtained suggest Raman spectroscopy as a promising method for rapid in-process control of the degree of unsaturation in salmon samples.

6.
The use of some unconventional non-linear modeling techniques, i.e. methods based on classification and regression trees and on multivariate adaptive regression splines, was explored to model the blood-brain barrier (BBB) passage of drugs and drug-like molecules. The data set contains BBB passage values for 299 structurally and pharmacologically diverse drugs, originating from a structured knowledge-based database. Models were built using boosted regression trees (BRT) and multivariate adaptive regression splines (MARS), as well as their respective combinations with stepwise multiple linear regression (MLR) and partial least squares (PLS) regression in two-step approaches. The best models were obtained using combinations of MARS with either stepwise MLR or PLS. It could be concluded that combining a linear with a non-linear modeling technique results in some improved properties compared to the individual linear and non-linear models and that, when the use of such a combination is appropriate, combinations using MARS as the non-linear technique should be preferred over those with BRT, due to some serious drawbacks of the BRT approaches.

7.
The correct recognition of sweet orange (Citrus sinensis L. Osbeck) variety accessions at the nursery stage of growth is a challenge for the productive sector as they do not show any difference in phenotype traits. Furthermore, there is no DNA marker able to distinguish orange accessions within a variety due to their narrow genetic trace. As different combinations of canopy and rootstock affect the uptake of elements from soil, each accession features a typical elemental concentration in the leaves. Thus, the main aim of this work was to analyze two sets of ten different accessions of very close genetic characters of three varieties of fresh citrus leaves at the nursery stage of growth by measuring the differences in elemental concentration by laser-induced breakdown spectroscopy (LIBS). The accessions were discriminated by both principal component analysis (PCA) and a classifier based on the combination of classification via regression (CVR) and partial least squares regression (PLSR) models, which used the elemental concentrations measured by LIBS as input data. Correct classification rates of 95.1% and 80.96% were achieved for set 1 and set 2, respectively. These results showed that LIBS is a valuable technique to discriminate among citrus accessions, which can be applied in the productive sector as an excellent cost–benefit tool in citrus breeding programs.

8.
Zhu D  Ji B  Meng C  Shi B  Tu Z  Qing Z 《Analytica chimica acta》2007,598(2):227-234
The ν-support vector regression (ν-SVR) was used to construct the calibration model between soluble solids content (SSC) of apples and acousto-optic tunable filter near-infrared (AOTF-NIR) spectra. The performance of ν-SVR was compared with partial least squares regression (PLSR) and back-propagation artificial neural networks (BP-ANN). The influence of the SVR parameters on the predictive ability of the model was investigated. The results indicated that the parameter ν had a rather wide optimal range (between 0.35 and 1 for the apple data). Therefore, the value of ν can be fixed beforehand so that the selection can focus on the other SVR parameters. For analyzing the SSC of apples, ν-SVR was superior to PLSR and BP-ANN, especially with fewer samples and with noise-polluted spectra. Proper spectral pretreatment methods, such as scaling, mean centering and standard normal variate (SNV), and wavelength selection methods (stepwise multiple linear regression and a genetic algorithm with PLS as its objective function) could greatly improve the quality of the ν-SVR model.

9.
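A support vector regression modeling method for predicting effective mobility in capillary zone electrophoresis   Cited by: 3 (self-citations: 0; cited by others: 3)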
康宇飞  瞿海斌  沈朋  程翼宇 《分析化学》2004,32(9):1151-1155
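A support vector regression modeling method is proposed for predicting migration behavior in capillary electrophoresis. With nucleosides as the practical study objects, data obtained from orthogonal experiments were combined with the two-marker technique, and the support vector regression algorithm was used to build a model relating the column temperature, voltage, buffer concentration and pH in capillary zone electrophoresis to the effective mobilities of three nucleosides. The method was compared with partial least squares regression and artificial neural networks. The results show that the model predicts more accurately than both and is well suited to predicting migration behavior in capillary electrophoresis.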

10.
A fast, non-destructive and eco-friendly method was developed to simultaneously determine the oil and water contents of soybean based on low field nuclear magnetic resonance (LF-NMR) relaxometry combined with chemometrics, such as partial least squares regression (PLSR). The Carr-Purcell-Meiboom-Gill (CPMG) magnetization decay data of ten soybean samples were acquired by LF-NMR and directly applied to the PLSR analysis. Calibration models were established via PLSR with full cross-validation based on the reference values obtained by the Soxhlet extraction method for measuring oil and the oven-drying method for measuring water. The results indicate that the calibration models are satisfactory for both oil and water determinations; the root mean squared errors of cross-validation (RMSECV) for oil and water are 0.2285% and 0.0178%, respectively. Furthermore, the oil and water contents in unknown soybean samples were predicted by the PLSR models and the results were compared with the reference values. The relative errors of the predicted oil and water contents were in the ranges of 1.25%-4.96% and 0.44%-2.49%, respectively. These results demonstrate that the combination of LF-NMR relaxometry with chemometrics shows great potential for the simultaneous determination of oil and water contents in soybean with high accuracy.

11.
In principal component regression (PCR) and partial least‐squares regression (PLSR), the use of unlabeled data, in addition to labeled data, helps stabilize the latent subspaces in the calibration step, typically leading to a lower prediction error. For using unlabeled data in PLSR, a non‐sequential approach based on optimal filtering (OF) has been proposed in the literature. In this work, a sequential version of the OF‐based PLSR and a PCA‐based PLSR (PLSR applied to PCA‐preprocessed data) are proposed. It is shown analytically that the sequential version of the OF‐based PLSR is equivalent to that of PCA‐based PLSR, which leads to a new interpretation of OF. Simulated and experimental data sets are used to point out the usefulness and pitfalls of using unlabeled data. Unlabeled data can replace labeled data to some extent, thereby leading to an economic benefit. However, in the presence of drift, the use of unlabeled data can result in an increase in prediction error compared to that obtained with a model based on labeled data alone. Copyright © 2011 John Wiley & Sons, Ltd.

12.
Future food supply will become increasingly dependent on edible material extracted from insects. The growing popularity of artisanal food products enhanced by insect proteins creates a particular need for effective quality control methods. This study focuses on developing rapid and efficient on-site quantitative analysis of protein content in handcrafted insect bars by miniaturized near-infrared (NIR) spectrometers. A benchtop spectrometer (Büchi NIRFlex N-500) and three miniaturized spectrometers (MicroNIR 1700 ES, Tellspec Enterprise Sensor and SCiO Sensor), combined with partial least squares regression (PLSR) and Gaussian process regression (GPR) calibration methods and a data fusion concept, were evaluated via test-set validation for their performance in protein content analysis. These NIR spectrometers differ markedly in technical principles, operational characteristics and cost-effectiveness. In the non-destructive analysis of intact bars, the root mean square error of prediction (RMSEP) values were 0.611% (benchtop) and 0.545–0.659% (miniaturized) with PLSR, and 0.506% (benchtop) and 0.482–0.580% (miniaturized) with GPR calibration, while the analyzed total protein content was 19.3–23.0%. For milled samples, with PLSR the RMSEP values improved to 0.210% for the benchtop spectrometer but remained in the inferior range of 0.525–0.571% for the miniaturized ones. GPR calibration improved the predictive performance of the miniaturized spectrometers, with RMSEP values of 0.230% (MicroNIR 1700 ES), 0.326% (Tellspec) and 0.338% (SCiO). Furthermore, the Tellspec and SCiO sensors are consumer-oriented devices, and their combined use for enhanced performance remains a viable economical choice. With GPR calibration and test-set validation performed on fused (Tellspec + SCiO) data, the RMSEP values were improved to 0.517% (in the analysis of intact samples) and 0.295% (for milled samples).

13.
In this paper, a genetic algorithm‐support vector regression (GA‐SVR) coupled approach was proposed for investigating the relationship between fingerprints and properties of herbal medicines. GA was used to select variables so as to improve the predictive ability of the models. Two other widely used approaches, Random Forests (RF) and partial least squares regression (PLSR) combined with GA (namely GA‐RF and GA‐PLSR, respectively), were also employed and compared with the GA‐SVR method. The models were evaluated in terms of the correlation coefficient between the measured and predicted values (Rp), root mean square error of prediction, and root mean square error of leave‐one‐out cross‐validation. The performance has been tested on a simulated system, a chromatographic data set, and a near‐infrared spectroscopic data set. The obtained results indicate that the GA‐SVR model provides a more accurate answer, with higher Rp and lower root mean square error. The proposed method is suitable for the quantitative analysis and quality control of herbal medicines. Copyright © 2012 John Wiley & Sons, Ltd.

14.
The intake of tomato glycoalkaloids can exert beneficial effects on human health. For this reason, methods for rapid quantification of these compounds are required. Most of the methods for α-tomatine and dehydrotomatine quantification are based on chromatographic techniques. However, these techniques require complex and time-consuming sample pre-treatments. In this work, HPLC-ESI-QqQ-MS/MS was used as the reference method. Subsequently, multiple linear regression (MLR) and partial least squares regression (PLSR) were employed to create two calibration models for the prediction of the tomatine content from thermogravimetric (TGA) and attenuated total reflectance (ATR) infrared spectroscopy (IR) analyses. These two fast techniques were proven to be suitable and effective in alkaloid quantification (R2 = 0.998 and 0.840, respectively), achieving low errors (0.11 and 0.27%, respectively) relative to the reference technique.

15.
16.
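Laser-induced breakdown spectroscopy (LIBS) is a plasma emission spectroscopic technique that uses a laser as the excitation source. It has already been studied for the quantitative analysis of rare earth elements, but because rare earth ores differ widely in matrix and have low element contents, the sensitivity and accuracy of the quantitative analysis still need improvement. Here a double-pulse LIBS system was constructed by splitting a single laser beam and was combined with the partial least squares regression (PLSR) algorithm to quantify the rare earth elements La, Dy, Yb and Y in rare earth ore samples. The results show that double-pulse LIBS combined with PLSR yields a more stable calibration model. Compared with the conventional basic calibration method, the root mean square error of prediction (RMSEP) for La, Dy, Yb and Y decreased from 0.0061%, 0.0037%, 0.0045% and 0.0280% to 0.0044%, 0.0016%, 0.0029% and 0.0134%, and the average relative error of prediction (AREP) decreased from 10.88%, 15.27%, 6.42% and 17.20% to 6.67%, 3.62%, 4.10% and 7.98%. Double-pulse LIBS combined with PLSR can therefore effectively improve the ability of LIBS to quantify rare earth elements in rare earth ores.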

17.
An ensemble, a model-independent technique based on combining several models for classification/regression tasks, can achieve a high accuracy that is often not achievable with single models. Such combinations have gained increasing attention in many fields. This paper proposes the use of a random subspace (RS)-based regression ensemble as an alternative method for near-infrared (NIR) spectroscopic calibration of tobacco samples. Because of the considerable reduction of variables in a random subspace, multiple linear regression (MLR) is used as the base algorithm and the method is therefore also referred to as RS-MLR. The overall performance of the proposed RS-MLR method is compared to those of partial least squares regression (PLSR), kernel principal component regression (KPCR) and kernel partial least squares regression (KPLSR). The results reveal that the RS-MLR method not only has a simple concept but also can produce a more parsimonious and more accurate calibration model than PLSR, KPCR and KPLSR, at a lower computational cost. In addition, the RS-MLR method was found to be very appropriate for so-called small sample problems, and the calibration models built by RS-MLR are less sensitive to overfitting.
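
The random subspace idea described here can be sketched in a few lines. This is a hedged illustration, not the authors' implementation: it assumes each ensemble member is an ordinary least squares fit (with intercept) on a random subset of variables, and that member predictions are combined by a plain average. The names `fit_rs_mlr` and `predict_rs_mlr`, the ensemble size, and the subspace size are hypothetical choices, and the data in the demo are synthetic.

```python
import numpy as np


def fit_rs_mlr(X, y, n_models=50, subspace_size=5, seed=0):
    """Fit one MLR model per randomly drawn subset of columns of X."""
    rng = np.random.default_rng(seed)
    n_samples, n_vars = X.shape
    members = []
    for _ in range(n_models):
        # Draw a random subspace: variable indices without repetition.
        cols = rng.choice(n_vars, size=subspace_size, replace=False)
        # Design matrix with an intercept column, solved by least squares.
        A = np.column_stack([np.ones(n_samples), X[:, cols]])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        members.append((cols, coef))
    return members


def predict_rs_mlr(members, X):
    """Average the predictions of all ensemble members (assumed combiner)."""
    preds = [coef[0] + X[:, cols] @ coef[1:] for cols, coef in members]
    return np.mean(preds, axis=0)


if __name__ == "__main__":
    # Synthetic, highly collinear "spectra" driven by one latent variable.
    rng = np.random.default_rng(1)
    t = rng.normal(size=40)
    X = np.outer(t, rng.uniform(0.5, 1.5, size=20)) + 0.01 * rng.normal(size=(40, 20))
    y = 2.0 * t + 1.0
    model = fit_rs_mlr(X, y)
    rmse = np.sqrt(np.mean((predict_rs_mlr(model, X) - y) ** 2))
    print(f"training RMSE: {rmse:.4f}")
```

Keeping `subspace_size` well below the number of calibration samples is what makes plain MLR usable as the base learner, which matches the motivation given in the abstract.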

18.
O. Divya 《Talanta》2007,72(1):43-48
Synchronous fluorescence spectroscopy (SFS) is a rapid, sensitive and nondestructive method suitable for the analysis of multifluorophoric mixtures. The present study demonstrates the use of SFS and multivariate methods for the analysis of petroleum products, which are complex mixtures of multiple fluorophores. Two multivariate techniques, principal component regression (PCR) and partial least squares regression (PLSR), have been successfully applied for the classification of petrol-kerosene mixtures. Calibration models were constructed using 35 samples and validated with samples of varying petrol and kerosene composition within the calibration range. The results showed that the method could be used for the estimation of kerosene in kerosene-mixed petrol. The model was found to be sensitive, detecting even 1% contamination of kerosene in petrol.

19.
The non-linear regression technique known as the alternating conditional expectations (ACE) method is only applicable when the number of objects available for calibration is considerably greater than the number of predictors considered. Alternating conditional expectations regression with selection of significant predictors by genetic algorithms (GA-ACE), the non-linear regression technique presented here, is based on the ACE algorithm but introduces several modifications to resolve the applicability limitations of the original ACE method, thus facilitating the practical implementation of a very interesting calibration tool. To overcome the lack of reliability displayed by the original ACE algorithm on data sets with too many variables, GA-ACE applies genetic algorithms as a variable selection technique before the non-linear regression model is developed, selecting a reduced subset of significant predictors able to accurately model and predict the response variable considered. Furthermore, GA-ACE provides two alternative application approaches, since it allows either prior data compression, computing a number of principal components that are subsequently subjected to GA selection, or working directly on the original variables. In this study, GA-ACE was applied to two real calibration problems with a very low observation/variable ratio (NIR data), and the results were compared with those obtained by several commonly used linear regression techniques. With the GA-ACE non-linear method, notably improved regression models were developed for the two response variables modeled, with root mean square errors of the residuals in external prediction (RMSEP) equal to 11.51 and 6.03% for moisture and lipid contents of roasted coffee samples, respectively. The improvement achieved by the new non-linear method is even more remarkable taking into account the results obtained with the best-performing linear method (IPW-PLS) applied to predict the studied responses (14.61 and 7.74% RMSEP, respectively).

20.
This paper proposes the use of the least-squares support vector machine (LS-SVM) as an alternative multivariate calibration method for the simultaneous quantification of some common adulterants (starch, whey or sucrose) found in powdered milk samples, using near-infrared spectroscopy with direct measurements by diffuse reflectance. Because of the spectral differences among the three adulterants, nonlinear behavior appears when all groups of adulterants are in the same data set, which makes linear methods such as partial least squares regression (PLSR) difficult to use. Excellent models were built using LS-SVM, with low prediction errors and superior performance relative to PLSR. These results show that it is possible to build robust models to quantify some common adulterants in powdered milk using near-infrared spectroscopy and LS-SVM as a nonlinear multivariate calibration procedure.
