Similar articles
 Found 20 similar articles (search time: 140 ms)
1.
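Taking as examples the transfer of partial least squares near-infrared spectroscopy (PLS-NIRs) models for the contents of four major components of corn (moisture, protein, fat, and starch) and for total alkaloids in tobacco leaves, the effect of the number of latent variables (nLVs) on the model transfer error was investigated. The nLVs determined by the criterion of a cumulative contribution rate greater than 99.9% were 1 for the corn model and 13 for the tobacco model. With the corn model built at nLVs = 1, the reproducibility between the predictions for samples on two slave instruments and the predictions on the master instrument met the national standard for all four components. With the tobacco total-alkaloid model built at nLVs = 13, piecewise direct standardization (PDS) brought the mean relative prediction error (MRE) for samples on four slave instruments below 6%. The nLVs determined by leave-one-out or four-fold cross-validation were 5 to 10 for the corn models and 16 and 19 for the tobacco models. Corn PLS-NIRs models built with these nLVs gave markedly larger prediction errors for slave-instrument samples, exceeding the permitted error range, and even after PDS correction the reproducibility between slave-instrument and master-instrument predictions mostly failed to meet the national standard. For tobacco total-alkaloid PLS-NIRs models built with nLVs > 13, the transfer error increased with nLVs, and PDS correction could not guarantee an MRE below 6% for all slave-instrument samples. Selecting nLVs by the criterion of a cumulative contribution rate greater than or close to 99.9% effectively avoids overfitting and improves the transfer performance of NIRs models.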
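
The selection rule described in this abstract can be illustrated with a minimal sketch. It assumes the per-component contribution fractions are already available from a fitted model; the function names and the input values in the usage note are hypothetical and are not taken from the paper.

```python
def n_latent_variables(contributions, threshold=0.999):
    """Return the smallest number of leading latent variables whose
    cumulative contribution exceeds `threshold`.

    `contributions` is a hypothetical input: the fraction of variance
    explained by each latent variable, in extraction order.
    """
    cumulative = 0.0
    for n, fraction in enumerate(contributions, start=1):
        cumulative += fraction
        if cumulative > threshold:
            return n
    # Threshold never exceeded: fall back to all available components.
    return len(contributions)


def mean_relative_error(predicted, reference):
    """Mean relative prediction error (MRE), as a fraction of the reference."""
    total = sum(abs(p - r) / abs(r) for p, r in zip(predicted, reference))
    return total / len(reference)
```

For instance, `n_latent_variables([0.9995, 0.0004, 0.0001])` stops at the first component, while a slower-converging list needs more; an MRE of 0.05 corresponds to 5%.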

2.
This work concerns the validation of a previously described multivariate method for determining chlorophylls and their corresponding pheopigments. The meaning of the term validation is discussed, and the work is divided into two parts, concerning model validation and method validation. The model validation showed that 40 standards are sufficient to ensure that the Y-domain is adequately spanned, and that differentiation of the data improves the models. The wavelength range was restricted to 510–770 nm, thus, eliminating interfering signals from carotenoids that had not been included in the calibration solutions. This restriction does not affect the predictive ability towards any analytes except pheophytin a. For accurate predictions of pheophytin a the wavelength region between 350 and 415 nm was included in the model. All model evaluations were based on partial least squares regression for one y-variable (PLS1). A criterion used to quantify the performance of the model was the deviation, which is an estimate of variance calculated for predictions of samples, taking into account the model’s predictive ability, the leverage and the x-residuals. In the method validation section, predictions of samples by the proposed method are compared with results obtained using an HPLC reference method. It was found for chlorophyll a that the root mean square error of cross validation (RMSECV) calculated from the model was several times higher than the corresponding root mean square error of prediction (RMSEP) calculated from the HPLC analysis. A likely explanation for this is that the RMSECV is determined in the presence of severely interfering compounds, a desired consequence of spanning the Y-space. Samples were extracted (then measured and predicted) from algal cultures, representing six different taxonomic divisions of phytoplankton. The pigment composition of these species is known, so the analyst knows in advance which chlorophylls are present. 
Predictions by the models are consistent with a priori knowledge of the pigment composition. To evaluate the potential of these models to deal with data recorded by different instruments, the absorption spectra for a set of samples were registered with two instruments. The results show that there is a minor and negligible bias between the predictions obtained using these instruments, probably due to a slight shift in the wavelengths recorded by them.

3.
The selection abilities of the two well‐known techniques of variable selection, synergy interval‐partial least‐squares (SiPLS) and genetic algorithm‐partial least‐squares (GA‐PLS), have been examined and compared. By using different simulated and real (corn and metabolite) datasets, keeping in view the spectral overlapping of the components, the influence of the selection of either intervals of variables or individual variables on the prediction performances was examined. In the simulated datasets, with decrease in the overlapping of the spectra of components and cases with components of narrow bands, GA‐PLS results were better. In contrast, the performance of SiPLS was higher for data of intermediate overlapping. For mixtures of high overlapping analytes, GA‐PLS showed slightly better performance. However, significant differences between the results of the two selection methods were not observed in most of the cases. Although SiPLS resulted in slightly better performance of prediction in the case of corn dataset except for the prediction of the moisture content, the improvement obtained by SiPLS compared with that by GA‐PLS was not significant. For real data of less overlapped components (metabolite dataset), GA‐PLS that tends to select far fewer variables did not give significantly better root mean square error of cross‐validation (RMSECV), cross‐validated R2 (Q2), and root mean square error of prediction (RMSEP) compared with SiPLS. Irrespective of the type of dataset, GA‐PLS resulted in models with fewer latent variables (LVs). When comparing the computational time of the methods, GA‐PLS is considered superior to SiPLS. Copyright © 2010 John Wiley & Sons, Ltd.

4.
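Using ordinary corn kernels as the test material, characteristic wavelengths were selected from near-infrared spectral data by a genetic algorithm combined with partial least squares (PLS) regression, and a PLS calibration model based on the selected wavelengths was then built to determine the starch content of corn kernels. For the calibration model built on 11 characteristic wavelengths, the root mean square error of calibration (RMSEC), of cross-validation (RMSECV), and of prediction (RMSEP) were 0.30%, 0.35%, and 0.27%, respectively. The correlation coefficients between predicted and measured values reached 0.9279 for the calibration set and 0.9390 for the independent test set. Compared with the prediction model built on full-spectrum data, prediction accuracy improved in every respect, indicating that spectral feature selection with a genetic algorithm and PLS yields simpler and better models. This provides a new method and approach for the near-infrared determination of starch content in corn kernels and for the processing of infrared spectral data.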

5.
Low‐field 1H NMR was used in this work for the analysis of mixtures involving crude oils and water. CPMG experiments were performed to determine the transverse relaxation time (T2) distribution curves, which were computed by the inverse Laplace transform of the echo decay data. The instrument's ability to quantify water and petroleum in biphasic mixtures following different methodologies was tested. For mixtures of deionized water and petroleum, excellent results were achieved, with a root mean squared error of cross‐validation (RMSECV) of 0.8% for a regression between the water content (wt %) and the relative area of the water peak in the T2 distribution curve, or a standard deviation of 0.9% for the relationship between the water content and the relative water peak area, corrected by the relative hydrogen index of the crude. In the case of biphasic mixtures of Mn2+‐doped water and crude oils, the best result of RMSECV = 1.6% was achieved by using the raw magnetization decay data for a partial least squares regression. Copyright © 2012 John Wiley & Sons, Ltd.
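
The RMSECV quoted for a single-predictor regression (water content against relative peak area) can be computed as in the following sketch. It assumes leave-one-out cross-validation and an ordinary least-squares line; the abstract does not state which cross-validation scheme was used, and the function names are illustrative.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmsecv_leave_one_out(x, y):
    """Root mean square error of leave-one-out cross-validation.

    `x` and `y` are plain Python lists of equal length (assumption).
    """
    squared_errors = []
    for i in range(len(x)):
        # Refit the line without sample i, then predict sample i.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        prediction = slope * x[i] + intercept
        squared_errors.append((prediction - y[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Because each sample is predicted by a model that never saw it, RMSECV is usually somewhat larger than the calibration error computed on the same data.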

6.
A combination of kinetic spectroscopic monitoring and multivariate curve resolution-alternating least squares (MCR-ALS) was proposed for the enzymatic determination of levodopa (LVD) and carbidopa (CBD) in pharmaceuticals. The enzymatic reaction process was carried out in a reverse stopped-flow injection system and monitored by UV-vis spectroscopy. The spectra (292-600 nm) were recorded throughout the reaction and were analyzed by multivariate curve resolution-alternating least squares. A small calibration matrix containing nine mixtures was used in the model construction. Additionally, to evaluate the prediction ability of the model, a set with six validation mixtures was used. The lack of fit obtained was 4.3%, the explained variance 99.8% and the overall prediction error 5.5%. Tablets of commercial samples were analyzed and the results were validated by pharmacopeia method (high performance liquid chromatography). No significant differences were found (α = 0.05) between the reference values and the ones obtained with the proposed method. It is important to note that a unique chemometric model made it possible to determine both analytes simultaneously.

7.
Multi-way partial least squares modeling of water quality data   Total citations: 1 (self-citations: 0; citations by others: 1)
A 10-year surface water quality data set pertaining to a polluted river was analyzed using partial least squares (PLS) regression models. Both the unfold-PLS and N-PLS (tri-PLS and quadri-PLS) models were calibrated using leave-one-out cross-validation. These were applied to the multivariate, multi-way data array with a view to assessing and comparing their predictive capabilities for biochemical oxygen demand (BOD) of river water in terms of their relative mean squares error of cross-validation, prediction and variance captured. The sums of squares of residuals and the leverages were computed and analyzed to identify the sites, variables, years and months that may influence the constructed model. Both the tri- and quadri-PLS models yielded relatively low validation error as compared to unfold-PLS and captured high variance in the model. Moreover, both of these methods produced acceptable model precision and accuracy. In the case of tri-PLS the root mean squares errors were 1.65 and 2.17 for calibration and prediction, respectively, whereas these were 2.58 and 1.09 for quadri-PLS. At a preliminary level it seems that BOD can be predicted but a different data arrangement is needed. Moreover, analysis of the scores and loadings plots of the N-PLS models could provide information on the time evolution of the river water quality.

8.
Comprehensive two‐dimensional gas chromatography with flame ionization detection, combined with unfolded‐partial least squares, is proposed as a simple, fast and reliable method to assess the quality of gasoline and to detect its potential adulterants. The data for the calibration set are first baseline corrected using a two‐dimensional asymmetric least squares algorithm. The number of significant partial least squares components used to build the model, determined from the minimum root‐mean square error of leave‐one‐out cross validation, was 4. Blends of gasoline with kerosene, white spirit and paint thinner, which are frequently used adulterants, served as calibration samples. Regression coefficients of 0.996–0.998, root‐mean square errors of prediction of 0.005–0.010 and relative errors of prediction of 1.54–3.82% for the calibration set show the reliability of the developed method. In addition, the developed method is externally validated with three samples in a validation set (with a relative error of prediction below 10.0%). Finally, to test the applicability of the proposed strategy to real samples, five gasoline samples collected from gas stations were analyzed; the gasoline proportions were in the range of 70–85%. The relative standard deviations were below 8.5% for different samples in the prediction set.

9.
Gentiana rigescens is a famous herbal medicine in China for treatment of convulsion, rheumatism, and jaundice. Here, the infrared determination of gentiopicroside, swertiamarin, sweroside, and loganic acid in G. rigescens from different areas and varieties was presented for the first time. Reference information for the iridoids was obtained by high-performance liquid chromatography. Partial least squares was used to characterize the relationship between spectra matrix and concentration vector for the determination of the analytes. For determination of gentiopicroside, the appropriate performance of partial least squares model was acquired with coefficient of determination of calibration and coefficient of determination of prediction values of 0.965 and 0.868. The root mean square error of estimation (RMSEE), root mean square error of cross validation (RMSECV), root mean square error of prediction (RMSEP), and residual predictive deviation (RPD) values were 2.612, 5.292, 5.239 mg g⁻¹, and 2.701, respectively, based on the first derivative and multiplicative scatter correction. For determination of the total iridoids, the best results were obtained using the coefficient of determination of calibration and coefficient of determination of prediction of 0.943 and 0.834, RMSEE, RMSECV, RMSEP and RPD of 3.896, 7.536, 6.543 mg g⁻¹ and 2.438, respectively, based on the first derivative. Both models were reliable and robust. The results demonstrated that infrared spectroscopy provided a rapid, low-cost tool to monitor the quality of G. rigescens by the determination of the iridoids.
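
For readers unfamiliar with the residual predictive deviation, the sketch below uses one common definition: the sample standard deviation of the reference values divided by RMSEP. This definition is an assumption; the abstract does not say which variant the authors used (some authors divide by a bias-corrected standard error instead).

```python
import math
import statistics


def rmsep(predicted, reference):
    """Root mean square error of prediction."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def rpd(predicted, reference):
    """Residual predictive deviation, assumed here to be SD(reference) / RMSEP."""
    return statistics.stdev(reference) / rmsep(predicted, reference)
```

Under this definition, a larger RPD means the prediction error is small relative to the natural spread of the reference values.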

10.
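To examine how well model transfer works between grating-type and Fourier-transform near-infrared analyzers, domestic fish meal was used as the near-infrared sample set, with a DS2500F near-infrared analyzer as the source instrument and an MPA near-infrared analyzer as the target instrument. Spectra were transferred by piecewise direct standardization (PDS). Prediction models were built for moisture, crude protein, crude fat, methionine, and lysine, and the models built after spectral transfer were evaluated from several angles using the cross-validation coefficient of determination (R2cv), root mean square error of cross-validation (RMSECV), Mahalanobis distance (MD), systematic bias (Bias), root mean square error of prediction (RMSEP), and residual predictive deviation (RPD). When the DS2500F spectra were transferred to the MPA instrument, the resulting prediction models for moisture, crude protein, crude fat, methionine, and lysine in domestic fish meal showed no significant differences in any parameter from the original models of the MPA instrument, and their predictive performance was essentially the same. Near-infrared spectra of domestic fish meal acquired on the DS2500F can therefore, after transfer, replace the original spectra of the MPA instrument. This indirectly achieves model transfer, offers good applicability and shareability, and can improve the efficiency with which near-infrared prediction models are used.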
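
The idea behind PDS can be sketched in its simplest form. The code below is the degenerate case with a window of one channel: each target-instrument channel is regressed on the same channel of the source instrument. Full PDS regresses each target channel on a window of neighbouring source channels, typically with PLS or PCR; the function names and the list-of-lists data layout here are assumptions made for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_channelwise_transfer(source_spectra, target_spectra):
    """Fit one (slope, offset) pair per wavelength channel.

    Both arguments are lists of spectra of the same transfer samples,
    measured on the source and target instruments (hypothetical layout).
    """
    n_channels = len(source_spectra[0])
    coefficients = []
    for j in range(n_channels):
        source_channel = [spectrum[j] for spectrum in source_spectra]
        target_channel = [spectrum[j] for spectrum in target_spectra]
        coefficients.append(fit_line(source_channel, target_channel))
    return coefficients


def transfer_spectrum(spectrum, coefficients):
    """Map a source-instrument spectrum onto the target instrument's scale."""
    return [slope * value + offset
            for value, (slope, offset) in zip(spectrum, coefficients)]
```

A wider window lets the correction absorb small wavelength shifts between instruments, which a one-channel window cannot do.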

11.
Analytical Letters, 2012, 45(11): 2359-2372
Ternary mixtures of nitrophenol isomers have been simultaneously determined in synthetic and real matrices by application of a genetic algorithm and a partial least squares model. All factors affecting the sensitivity were optimized and the linear dynamic range for determination of the nitrophenol isomers was found. The simultaneous determination of nitrophenol mixtures by spectrophotometric methods is a difficult problem because of spectral interferences. Partial least squares modeling was used for the multivariate calibration of the spectrophotometric data. A genetic algorithm is a suitable method for selecting wavelengths for PLS calibration of mixtures with almost identical spectra without loss of prediction capacity. The experimental calibration matrix was designed by measuring the absorbance over the range 300–520 nm for 21 samples containing 1–20 µg mL⁻¹, 1–20 µg mL⁻¹, and 1–10 µg mL⁻¹ of m‐nitrophenol, o‐nitrophenol, and p‐nitrophenol, respectively. The root mean square errors of prediction for m‐nitrophenol, o‐nitrophenol, and p‐nitrophenol were 0.3732, 0.5997, and 0.3181 with the genetic algorithm and 0.7309, 0.9961, and 1.0055 without it. The proposed method was successfully applied to the determination of m‐nitrophenol, o‐nitrophenol, and p‐nitrophenol in synthetic and water samples.

12.
Two-dimensional correlation spectroscopy (2DCOS) and near-infrared spectroscopy (NIRS) were used to determine the polyphenol content in oat grain. A partial least squares (PLS) algorithm was used to perform the calibration. A total of 116 representative oat samples from four locations in China were prepared and the corresponding near-infrared spectra were measured. Two-dimensional correlation spectroscopy was employed to select wavelength bands for the PLS regression model for the polyphenol determination. The number of PLS components and intervals was optimized according to the coefficients of determination (R2) and root mean square error of cross validation (RMSECV) in the calibration set. The performance of the final model was evaluated using the correlation coefficient (R) and the root mean square error of validation (RMSEV) in the prediction set. The results showed the band corresponding to the optimal calibration model was between 1350 and 1848 nm and the optimal spectral preprocessing combination was second derivative with second smoothing. The optimal regression model was obtained with an R2 of 0.8954 and an RMSECV of 0.06651 in the calibration set and R of 0.9614 and RMSEV of 0.04573 in the prediction set. These measurements reveal the calibration model had qualified predictive accuracy. The results demonstrated that the 2DCOS with PLS was a simple and rapid method for the quantitative determination of polyphenols in oats.

13.
Near-infrared spectroscopy; radial basis function neural network; pyrazinamide; quantitative analysis

14.
Modeling quantitative structure-activity relationships (QSAR) is considered with an emphasis on prediction. An abundance of methods are available to develop such models. Using a harmonious approach that balances the bias and variance of predictions, the best calibration models are identified relative to the bias and variance criteria used. Criteria utilized to determine the adequacy of models are the root mean square error of calibration (RMSEC) and validation (RMSEV), respective R2 values, and the norm of the regression vector. QSAR data from the literature are used to demonstrate concepts. For these data sets and criteria used, it is suggested that models obtained by ridge regression (RR) are more harmonious and parsimonious than models obtained by partial least squares (PLS) and principal component regression (PCR) when the data is mean-centered. The most harmonious RR models have the best bias/variance tradeoff, reflected by the smallest RMSEC, RMSEV, and regression vector norms and the largest calibration and validation R2 values. The most parsimonious RR models have the smallest effective rank.
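
Two of the quantities named in this abstract, the ridge regression vector for mean-centered data and its norm, can be computed as in the sketch below. It uses the standard closed form b = (XᵀX + λI)⁻¹Xᵀy and a small hand-written linear solver to stay dependency-free; the function names and the toy data in the usage note are illustrative and are not from the paper.

```python
import math


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def ridge_vector(X, y, lam):
    """Ridge regression vector for mean-centered X and y."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [value - y_mean for value in y]
    # Normal equations with the ridge penalty added on the diagonal.
    xtx = [[sum(r[i] * r[j] for r in Xc) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(Xc, yc)) for i in range(p)]
    return solve(xtx, xty)


def vector_norm(v):
    """Euclidean norm of the regression vector."""
    return math.sqrt(sum(c * c for c in v))
```

With `lam = 0` the result is the ordinary least-squares vector; increasing `lam` shrinks the norm, which is the variance-side indicator the abstract refers to.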

15.
This paper presents the methodology of a very sensitive determination of scandium in excess of nickel by adsorptive stripping voltammetry on a mercury film electrode and PLS regression. A calibration set consisting of binary mixtures containing 5, 15, 25, 35 or 45 × 10⁻⁹ M Sc(III) and simultaneously 0.5–50 × 10⁻⁷ M of Ni(II) was used to develop the chemometric PLS calibrations. An external set containing synthetic mixtures of 10, 20, 30, 40 × 10⁻⁹ M Sc(III) and the same Ni(II) concentrations as mentioned above was used to validate the model and evaluate predictive ability. The application of data pretreatment techniques involving baseline correction, smoothing, range‐scaling and mean‐centering, and their influence on the PLS model complexity, were also investigated. As a result, a model for Sc(III) with 6 latent variables was constructed. The model fulfills validation criteria and is characterized by a good prediction ability (the majority of the prediction errors are lower than 10%). This work shows significant progress in the development of a very sensitive analytical technique for the determination of scandium in the presence of different concentrations of nickel by application of multivariate calibration tools.

16.
The aim of this study was to investigate the potential use of a direct headspace-mass spectrometry electronic nose instrument (MS e_nose) combined with chemometrics as rapid, objective and low cost technique to measure aroma properties in Australian Riesling wines. Commercial bottled Riesling wines were analyzed using a MS e_nose instrument and by a sensory panel. The MS e_nose data generated were analyzed using principal components analysis (PCA) and partial least squares (PLS1) regression using full cross validation (leave one out method). Calibration models between MS e_nose data and aroma properties were developed using partial least squares (PLS1) regression, yielding coefficients of correlation in calibration (R) and root mean square error of cross validation of 0.75 (RMSECV: 0.85) for estery, 0.89 (RMSECV: 0.94) for perfume floral, 0.82 (RMSECV: 0.62) for lemon, 0.82 (RMSECV: 0.32) for stewed apple, 0.67 (RMSECV: 0.99) for passion fruit and 0.90 (RMSECV: 0.86) for honey, respectively. The relative benefits of using MS e_nose will provide capability for rapid screening of wines before sensory analysis. However, the basic deficiency of this technique is lack of possible identification and quantitative determination of individual compounds responsible for the different aroma notes in the wine.

17.
By employing the simple but effective principle ‘survival of the fittest’ on which Darwin's Evolution Theory is based, a novel strategy for selecting an optimal combination of key wavelengths of multi-component spectral data, named competitive adaptive reweighted sampling (CARS), is developed. Key wavelengths are defined as the wavelengths with large absolute coefficients in a multivariate linear regression model, such as partial least squares (PLS). In the present work, the absolute values of regression coefficients of PLS model are used as an index for evaluating the importance of each wavelength. Then, based on the importance level of each wavelength, CARS sequentially selects N subsets of wavelengths from N Monte Carlo (MC) sampling runs in an iterative and competitive manner. In each sampling run, a fixed ratio (e.g. 80%) of samples is first randomly selected to establish a calibration model. Next, based on the regression coefficients, a two-step procedure including exponentially decreasing function (EDF) based enforced wavelength selection and adaptive reweighted sampling (ARS) based competitive wavelength selection is adopted to select the key wavelengths. Finally, cross validation (CV) is applied to choose the subset with the lowest root mean square error of CV (RMSECV). The performance of the proposed procedure is evaluated using one simulated dataset together with one near infrared dataset of two properties. The results reveal an outstanding characteristic of CARS that it can usually locate an optimal combination of some key wavelengths which are interpretable to the chemical property of interest. Additionally, our study shows that better prediction is obtained by CARS when compared to full spectrum PLS modeling, Monte Carlo uninformative variable elimination (MC-UVE) and moving window partial least squares regression (MWPLSR).
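
Two ingredients of the procedure can be sketched briefly: the importance weights derived from absolute regression coefficients, and the exponentially decreasing function that sets the fraction of wavelengths kept in each run. The abstract does not give the constants of the EDF; the form below, which keeps all wavelengths in the first run and two in the last, is the commonly cited one and should be treated as an assumption. Function names are illustrative.

```python
import math


def wavelength_weights(coefficients):
    """Normalized importance of each wavelength: |b_i| / sum(|b|)."""
    total = sum(abs(b) for b in coefficients)
    return [abs(b) / total for b in coefficients]


def edf_ratios(n_wavelengths, n_runs):
    """Fraction of wavelengths retained in runs 1..n_runs.

    Assumed form r_i = a * exp(-k * i), with a and k chosen so that
    r_1 = 1 (all wavelengths) and r_N = 2 / n_wavelengths (two left).
    """
    half = n_wavelengths / 2
    a = half ** (1 / (n_runs - 1))
    k = math.log(half) / (n_runs - 1)
    return [a * math.exp(-k * i) for i in range(1, n_runs + 1)]
```

The steep early decay removes many low-weight wavelengths quickly, while the slow tail refines the selection among the remaining candidates.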

18.
Outlier detection is crucial in building a highly predictive model. In this study, we proposed an enhanced Monte Carlo outlier detection method by establishing cross‐prediction models based on determinate normal samples and analyzing the distribution of prediction errors individually for dubious samples. One simulated and three real datasets were used to illustrate and validate the performance of our method, and the results indicated that this method outperformed Monte Carlo outlier detection in outlier diagnosis. After these outliers were removed, the value of validation by Kovats retention indices and the root mean square error of prediction decreased from 3.195 to 1.655, and the average cross‐validation prediction error decreased from 2.0341 to 1.2780. This method helps establish a good model by eliminating outliers. © 2015 Wiley Periodicals, Inc.
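
As background, the baseline Monte Carlo approach that this abstract compares against can be sketched as follows: repeatedly split the data at random, fit a model on the training part, and collect each sample's prediction errors whenever it falls in the test part. This is not the enhanced method of the paper. The single-predictor linear model, the parameter defaults, and the function names are assumptions chosen to keep the example short.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mc_prediction_errors(x, y, n_runs=200, train_fraction=0.7, seed=0):
    """Return (mean, standard deviation) of prediction errors per sample."""
    rng = random.Random(seed)
    n = len(x)
    n_train = max(2, int(round(train_fraction * n)))
    errors = [[] for _ in range(n)]
    indices = list(range(n))
    for _ in range(n_runs):
        rng.shuffle(indices)
        train, test = indices[:n_train], indices[n_train:]
        slope, intercept = fit_line([x[i] for i in train],
                                    [y[i] for i in train])
        for i in test:
            errors[i].append(y[i] - (slope * x[i] + intercept))
    return [(statistics.mean(e), statistics.pstdev(e)) if e else (0.0, 0.0)
            for e in errors]
```

Samples whose mean error or error spread lies far from the bulk of the data are candidates for inspection; the threshold for "far" is left to the analyst.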

19.
Fourier transform Raman spectroscopy and chemometric tools have been used for exploratory analysis of pure corn and cassava starch samples and mixtures of both starches, as well as for the quantification of amylose content in corn and cassava starch samples. The exploratory analysis using principal component analysis shows that two natural groups of similar samples can be obtained, according to the amylose content, and consequently the botanical origins. The Raman band at 480 cm⁻¹, assigned to the ring vibration of starches, has the major contribution to the separation of the corn and cassava starch samples. This region was used as a marker to identify the presence of starch in different samples, as well as to characterize amylose and amylopectin. Two calibration models were developed based on partial least squares regression involving pure corn and cassava, and a third model with both starch samples was also built; the results were compared with the results of the standard colorimetric method. The samples were separated into two groups of calibration and validation by employing the Kennard-Stone algorithm and the optimum number of latent variables was chosen by the root mean square error of cross-validation obtained from the calibration set by internal validation (leave one out). The performance of each model was evaluated by the root mean square errors of calibration and prediction, and the results obtained indicate that Fourier transform Raman spectroscopy can be used for rapid determination of apparent amylose in starch samples with prediction errors similar to those of the standard method.
Figure: Raman spectroscopy has been successfully applied to the determination of the amylose content in cassava and corn starches by means of multivariate calibration analysis.
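
The Kennard-Stone split mentioned in this abstract can be sketched in a few lines. The version below uses Euclidean distance on the raw variables and assumes at least two samples are requested; the abstract does not say which distance or variable space the authors used, and the function name is illustrative.

```python
import math


def kennard_stone(samples, n_select):
    """Return indices of `n_select` samples chosen by Kennard-Stone.

    `samples` is a list of equal-length numeric sequences (assumption).
    """
    n = len(samples)
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    # Start with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    while len(selected) < n_select and remaining:
        # Add the sample whose nearest selected neighbour is farthest away.
        candidate = max(remaining,
                        key=lambda i: min(dist[i][s] for s in selected))
        selected.append(candidate)
        remaining.remove(candidate)
    return selected
```

The selected indices form the calibration set and the rest the validation set, so the calibration samples span the measured space as evenly as possible.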

20.
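Near-infrared diffuse reflectance spectra of powdered isoniazid tablets were combined with partial least squares (PLS) and with a radial basis function neural network (RBFNN) to build quantitative analysis models, which were then used to predict the samples of the prediction set. The quantitative model built with RBFNN outperformed the PLS model: the correlation coefficient (r) rose from 0.99593 to 0.99734, the root mean square error of cross-validation (RMSECV) fell from 0.00523 to 0.00423, and the root mean square error of prediction (RMSEP) fell from 0.00614 to 0.00501.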
