1.
In a spectroscopic context, when a calibration model based on partial least squares is developed to predict a response, it is often the case that a high percentage of variation in the data explained by the first latent variable is not accompanied by an equally high percentage of variation in the studied response. The addition of more components can slowly improve the calibration model, but with negative effects on the robustness and interpretability of the final model. To solve this problem, several pre-processing methods have been proposed to remove from the spectral matrix only the portion unrelated to the studied response. Moreover, the need for efficient compression methods is increasingly important due to the large size of the data currently collected. In this sense, the discrete wavelet transform has proven that it can achieve good compression without losing relevant information when used on individual signals. This paper introduces a new pre-processing method, orthogonal wavelet correction (OWAVEC), that tries to combine two important needs in multivariate calibration: signal correction and compression. The new method has been tested on a set of diesel fuels using viscosity as the response variable, and its results have been compared not only with those obtained from the original data but also with those provided by other correction methods. The first practical results are encouraging, as the method generates considerably better calibration models than the model developed from raw data and provides results at least as good as those of other orthogonal correction methods.
2.
Ling Gao Shouxin Ren 《Spectrochimica acta. Part A, Molecular and biomolecular spectroscopy》2009,73(5):960-965
A novel method, named OSC-WPT-PLS, based on partial least squares (PLS) regression with orthogonal signal correction (OSC) and wavelet packet transform (WPT) as pre-processing tools, was proposed for the simultaneous spectrophotometric determination of Al(III), Mn(II) and Co(II). This method combines the ideas of OSC and WPT with PLS regression to enhance the ability to extract characteristic information and the quality of regression. OSC is used to remove information in the response matrix D by subtracting the structured noise that is orthogonal to the concentration matrix C. The wavelet packet transform was applied to perform data compression, to extract relevant information, and to eliminate noise and collinearity. PLS was applied for multivariate calibration and noise reduction by eliminating the less important latent variables. In this case, the wavelet function, the decomposition level, the number of OSC components and the number of PLS factors for the OSC-WPT-PLS method were selected by trial as Daubechies 4, 3, 2 and 3, respectively. A program (POSCWPTPLS) was designed to perform the simultaneous spectrophotometric determination of Al(III), Mn(II) and Co(II). The relative standard errors of prediction (RSEP) obtained for total elements using OSC-WPT-PLS, WPT-PLS and PLS were compared. Experimental results demonstrated that the OSC-WPT-PLS method had the best performance among the three methods and was successful even when there was severe overlap of spectra.
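The idea of subtracting structured variation that is orthogonal to the concentrations can be illustrated with a minimal sketch. This is not the algorithm of the paper above (and not the POSCWPTPLS program); it is a simplified, single-component filter written under stated assumptions: a single response vector `y` instead of a concentration matrix, mean-centred data, and the first principal direction of the y-orthogonal part of `X` as the component to remove. The function name `osc_one_component` and the synthetic data are hypothetical.

```python
import numpy as np

def osc_one_component(X, y):
    """Remove one y-orthogonal component from X (simplified OSC-style filter).

    Assumption: this is an illustrative variant, not a published OSC algorithm.
    Returns the corrected (centred) spectra, the removed score and its loading.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    # Mean-centre spectra and response.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean(axis=0)
    # Project every column of Xc onto the orthogonal complement of yc.
    X_orth = Xc - yc @ (yc.T @ Xc) / (yc.T @ yc)
    # First right singular vector = direction of largest y-orthogonal variance.
    _, _, vt = np.linalg.svd(X_orth, full_matrices=False)
    t = X_orth @ vt[0].reshape(-1, 1)   # score vector, orthogonal to yc
    p = Xc.T @ t / (t.T @ t)            # loading of that score in Xc
    X_corr = Xc - t @ p.T               # subtract the orthogonal component
    return X_corr, t, p

# Synthetic example (hypothetical data): a signal related to y plus a
# stronger interference that is unrelated to y.
rng = np.random.default_rng(0)
y_demo = rng.normal(size=30)
interference = rng.normal(size=30)
X_demo = (np.outer(y_demo, rng.normal(size=50))
          + 5.0 * np.outer(interference, rng.normal(size=50))
          + 0.01 * rng.normal(size=(30, 50)))
X_corrected, t_removed, p_removed = osc_one_component(X_demo, y_demo)
```

Because the removed score is orthogonal to the centred response by construction, the covariance between the spectra and the response is left unchanged by the subtraction; only variation unrelated to the response is reduced.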
3.
This paper reports the use of short-wave near-infrared (SW-NIR) transmission spectroscopy for rapid and conclusive analysis of alcoholic content (% v/v) in beverages. This spectral region is interesting because common visible diode array spectrometers can be utilized, reducing time and costs in comparison with traditional near-infrared or mid-infrared instruments. A correction for the influence of external temperature is necessary, and for this purpose two calibration transfer procedures were compared: piecewise direct standardization (PDS) and orthogonal signal correction (OSC). The RMSEP found for the alcoholic content model at 20 °C was 0.13% v/v and, after application of the calibration transfer procedures at other temperatures (15, 25, 30 and 35 °C) using the model built at 20 °C, errors of the same order of magnitude were obtained.
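Several of the abstracts in this list report the root mean square error of prediction (RMSEP). A minimal sketch of the usual definition, the square root of the mean squared difference between reference and predicted values on a test set, is given here. The numbers in the usage lines are made up for illustration and are not taken from any of the papers.

```python
import math

def rmsep(y_true, y_pred):
    """Root mean square error of prediction over an external test set."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - t) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical reference and predicted alcohol contents (% v/v).
reference = [10.0, 12.0, 40.0]
predicted = [10.1, 11.9, 40.2]
error = rmsep(reference, predicted)
```

The result is in the same units as the response, which is why a value such as 0.13% v/v can be read directly as a typical prediction error.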
4.
Influence of data pre-processing on the quantitative determination of the ash content and lipids in roasted coffee by near infrared spectroscopy  Total citations: 1, self-citations: 0, citations by others: 1
《Analytica chimica acta》2004,509(2):217-227
In near-infrared (NIR) measurements, some physical features of the sample can be responsible for effects like light scattering, which lead to systematic variations unrelated to the studied responses. These errors can disturb the robustness and reliability of multivariate calibration models. Several mathematical treatments are usually applied to remove systematic noise in data, the most common being derivation, standard normal variate (SNV) and multiplicative scatter correction (MSC). New mathematical treatments, such as orthogonal signal correction (OSC) and direct orthogonal signal correction (DOSC), have been developed to minimize the variability unrelated to the response in spectral data. In this work, these two new pre-processing methods were applied to a set of roasted coffee NIR spectra. A separate calibration model was developed by PLS regression to quantify each of the ash content and lipids in roasted coffee samples. The results provided by these correction methods were compared to those obtained with the original data and the data corrected by derivation, SNV and MSC. For both responses, OSC and DOSC treatments gave PLS calibration models with improved prediction abilities (4.9 and 3.3% RMSEP with corrected data versus 7.1 and 8.3% RMSEP with original data, respectively).
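Of the classical treatments named above, standard normal variate (SNV) is the simplest to show in code: each spectrum is centred on its own mean and scaled by its own standard deviation. The sketch below assumes the sample standard deviation (n − 1 in the denominator), which is a common convention; the function name `snv` and the example values are hypothetical.

```python
import math

def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own statistics."""
    n = len(spectrum)
    if n < 2:
        raise ValueError("a spectrum needs at least two points")
    mean = sum(spectrum) / n
    # Sample standard deviation (n - 1); an assumed, commonly used convention.
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    if sd == 0.0:
        raise ValueError("a constant spectrum cannot be scaled")
    return [(v - mean) / sd for v in spectrum]

# Hypothetical absorbances, and the same spectrum distorted by a
# multiplicative (x2) and an additive (+0.5) scatter-like effect.
base = [0.1, 0.4, 0.9, 0.3]
distorted = [2.0 * v + 0.5 for v in base]
snv_base = snv(base)
snv_distorted = snv(distorted)
```

After SNV the two spectra coincide, which is the sense in which the treatment removes additive and multiplicative scatter effects. Unlike OSC and DOSC, it does not use the response at all.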
5.
Near-infrared (NIR) spectrometry is now widely used in various fields, and great attention is paid to its application to complex problems, which brings about the need for the calibration of systems that fail to exhibit a satisfactory linear relationship between input and output data. In this work we present a novel method to build a multivariate calibration model for NIR spectra, i.e. a genetic algorithm-radial basis function network in the wavelet domain (WT-GA-RBFN), which combines the advantages of the wavelet transform and the genetic algorithm. Variable selection is accomplished in two stages in the wavelet domain: in the first stage, the variables are pre-selected (compressed) by variance, and in the second stage the variables are further reduced by a specially designed GA. The proposed method is illustrated by its application to three NIR data sets from different fields and by comparison with the PLS model.
6.
Orthogonal signal correction, wavelet analysis, and multivariate calibration of complicated process fluorescence data  Total citations: 2, self-citations: 0, citations by others: 2
Lennart Eriksson Johan Trygg Erik Johansson Rasmus Bro Svante Wold 《Analytica chimica acta》2000,420(2):625-195
In this paper, multivariate calibration of complicated process fluorescence data is presented. Two data sets related to the production of white sugar are investigated. The first data set comprises 106 observations and 571 spectral variables, and the second data set 268 observations and 3997 spectral variables. In both applications, a single response, ash content, is modelled and predicted as a function of the spectral variables. Both data sets contain certain features that make multivariate calibration efforts non-trivial. The objective is to show how principal component analysis (PCA) and partial least squares (PLS) regression can be used to obtain an overview of the data sets and to establish predictively sound regression models. It is shown how a recently developed technique for signal filtering, orthogonal signal correction (OSC), can be applied in multivariate calibration to enhance predictive power. In addition, signal compression is tested on the larger data set using wavelet analysis. It is demonstrated that a compression down to 4% of the original matrix size (in the variable direction) is possible without loss of predictive power. It is concluded that the combination of OSC for pre-processing and wavelet analysis for compression of spectral data is promising for future use.
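Compression in the variable direction by wavelet analysis can be sketched as follows: transform each spectrum, keep only the largest coefficients, and use those (or the reconstruction) for calibration. The sketch uses the Haar wavelet because it fits in a few lines of plain Python; this is an assumption made for illustration, since the abstract does not state which wavelet or selection rule was used. The function names and the test signal are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(signal):
    """Full orthonormal Haar DWT; the length must be a power of two."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    coeffs = [float(v) for v in signal]
    while n > 1:
        half = n // 2
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) / SQRT2 for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) / SQRT2 for i in range(half)]
        coeffs[:n] = approx + detail   # approximations first, then details
        n = half                       # recurse on the approximation part only
    return coeffs

def haar_inverse(coeffs):
    """Invert haar_forward."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged += [(a + d) / SQRT2, (a - d) / SQRT2]
        out[:2 * n] = merged
        n *= 2
    return out

def compress(signal, keep):
    """Zero all but the `keep` largest-magnitude Haar coefficients."""
    coeffs = haar_forward(signal)
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:keep])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]

# Hypothetical step-like signal: two coefficients describe it completely.
step = [1.0] * 4 + [5.0] * 4
compressed = compress(step, keep=2)
restored = haar_inverse(compressed)
```

For real spectra the retained fraction is a tuning choice, and a smoother wavelet than Haar would normally be preferred; whether predictive power survives a given compression level has to be checked by validation, as the abstract describes.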
7.
A background and noise elimination method for quantitative calibration of near infrared spectra  Total citations: 1, self-citations: 0, citations by others: 1
Da Chen 《Analytica chimica acta》2004,511(1):37-45
A new hybrid algorithm is proposed to eliminate the varying background and noise simultaneously for multivariate calibration of near infrared (NIR) spectral signals. The method is based on the use of multi-resolution, which is one of the main advantages provided by the wavelet transform. The signals are first split into different frequency components, each of which keeps the same number of data points as the original signals. In conjunction with a modified uninformative variable elimination (mUVE) criterion, the new method can be used to remove the low-frequency varying background and the high-frequency noise simultaneously. The method is successfully applied to a simulated spectral data set and to experimental NIR spectral data, resulting in more parsimonious multivariate models with higher precision. In addition, the proposed strategy can be applied to other spectral signals as well.
8.
Zhimei Wang 《Microchemical Journal》2008,89(1):52-57
In the construction of a neural network, most attention has been paid to the selection of the architecture, the selection of the learning parameters and the network validation, while the selection of input variables has received little. This study focused on the selection of input variables by various data pre-treatment methods for constructing ANN models. The validation results differed when different data pre-treatment methods were combined with near-infrared spectroscopy (NIRS) to build artificial neural network (ANN) models for quality control of paracetamol in Coldrex. Using wavelet coefficients after orthogonal signal correction (OSC) as inputs reduced the RMSEP of the ANN models by up to 77% compared to ANN models using derivatives combined with PCA pre-treatment. The selection of input variables has the potential to improve the calibration ability of ANN, and the model can be used to reduce the pressure on quality control in the pharmaceutical industry.
9.
Calibration model transfer is essential for practical applications of near infrared (NIR) spectroscopy because the spectra may be measured on different instruments and the difference between the instruments must be corrected. An approach for calibration transfer based on the alternating trilinear decomposition (ATLD) algorithm is proposed in this work. From the three-way spectral matrix measured on different instruments, the relative intensity of concentration, spectrum and instrument is obtained using trilinear decomposition. Because the relative intensity of instrument reflects the spectral difference between instruments, the spectra measured on different instruments can be standardized by correcting the coefficients in the relative intensity. Two NIR datasets of corn and tobacco leaf samples measured with three instruments are used to test the performance of the method. The results show that, for both datasets, the spectra measured on one instrument can be correctly predicted using the partial least squares (PLS) models built with the spectra measured on the other instruments.
10.
《Journal of the Indian Chemical Society》2023,100(1):100814
In spectrophotometry, mixtures of chemical constituents cannot be determined simultaneously due to spectral interferences as well as closely spaced λmax values, λmax being the wavelength at which a substance absorbs the most photons. Since the spectra of individual components in a ternary mixture overlap, determining the concentration of individual components using the wavelength of maximum absorbance, λmax, can lead to a significant error. In this paper, the concentrations of individual components in ternary synthetic mixtures of nitrophenol, aniline, and phenol were estimated simultaneously using a model based on a genetic algorithm and partial least squares. The spectrophotometric data of ternary mixtures with almost identical spectra of nitrobenzene, aniline, and phenol were calibrated using partial least squares modeling without losing prediction capability, and a genetic algorithm method was used to select the appropriate wavelengths for partial least squares calibration. The experimental calibration matrix of 27 samples containing a ternary mixture of nitrobenzene (1.0–20.0 mg L−1), aniline (1.0–15.0 mg L−1), and phenol (4.0–18.0 mg L−1) was designed by measuring the absorbance between 200 and 340 nm at 1 nm wavelength intervals. The model was verified by using six different mixtures with varying concentrations of nitrobenzene, aniline, and phenol. The root mean square error in the prediction of nitrobenzene, aniline, and phenol was 0.1411, 0.1670, and 0.2861 with the genetic algorithm, and 0.3666, 0.6149, and 0.6279 without the genetic algorithm, respectively. This method can be successfully applied to estimate the components in synthetic mixtures accurately. Since this method is accurate and robust, it can be applied to actual industrial wastewater that contains a mixture of toxic chemicals. This eliminates the complications and costs related to separation and purification prior to analysis using costly chromatographic methods.
11.
A new chemometric approach named region orthogonal signal correction (ROSC) is introduced to improve the predictive ability of PLS models for biomedical components in blood serum developed from their NIR spectra in the 1280-1849 nm region. First, a moving window partial least squares regression (MWPLSR) method was employed to locate the region due to water as a region of interference signals and to find the informative regions of glucose, albumin, cholesterol and triglyceride from NIR spectra of bovine serum samples. Next, a novel chemometric method named searching combination moving window partial least squares (SCMWPLS) was used to optimize those informative regions. In this way, the specific regions containing the information of water, glucose, albumin, cholesterol and triglyceride were obtained. When one component of interest in the bovine serum solution (glucose, albumin, cholesterol or triglyceride) is the analyte, the other three components of interest and water are considered interference factors. ROSC is therefore applied to each specific region of interference signal to calculate the components orthogonal to the analyte concentrations; these are removed specifically from the NIR spectra of bovine serum in the 1280-1849 nm region, and the highest interference signal for the analyte model is revealed. PLS results for glucose, albumin, cholesterol and triglyceride are reported and compared for models built using the whole region of the original spectra, the optimized regions of the original spectra suggested by SCMWPLS, spectra treated by OSC with 1-3 orthogonal components, and spectra treated by ROSC, with the highest interference signals selectively removed, with 1-3 orthogonal components. It was found that using ROSC to remove the highest interference signal located by SCMWPLS improves the performance of PLS modeling, yielding a lower RMSECV and a smaller number of PLS factors.
12.
《Analytica chimica acta》2004,514(1):57-67
Two orthogonal signal correction methods (OSC and DOSC) were applied to a set of 83 roasted coffee NIR spectra from varied origins and varieties in order to remove information unrelated to a specific chemical response (caffeine), which was selected for its high ability to discriminate between arabica and robusta coffee varieties. These corrected NIR spectra, as well as raw NIR spectra and three chemical quantities (caffeine, chlorogenic acids and total acidity), were used to develop separate classification models using the potential functions method as a class-modelling technique. The aim was to evaluate their respective capacities to discriminate between coffee varieties and the influence of these pre-processing methods on the classification of the coffee samples into their corresponding variety class. The transformation of roasted coffee NIR spectra by means of an orthogonal signal correction method, taking into account in this correction a chemical response closely related to the sample origin, led to a notable improvement in the specificity of the constructed classification models.
13.
Selectivity is one of the main challenges of sensors, particularly those based on chemical interactions. Multivariate analytical models can determine the concentration of analytes even in the presence of other potential interferences. In this work, we have determined the presence of mercury ions in aqueous solutions in the ppm range (0-2 mg L−1) using a ruthenium bis-thiocyanate complex as a chemical probe. Moreover, we have analyzed the mercury-containing solutions in the presence of higher concentrations (19.5 mg L−1) of other potential competitors such as Cd2+, Pb2+, Cu2+ and Zn2+ ions. Our experimental model is based on the partial least squares (PLS) method, with other techniques such as genetic algorithms and statistical feature selection (SFS) used beforehand to refine the analytical data. In summary, we have demonstrated that the root mean square error of prediction can be reduced from 10.22% without pre-treatment to 6.27% with statistical feature selection.
14.
Riccardo Leardi Randy J. Pell 《Analytica chimica acta》2002,461(2):189-200
Variable selection using a genetic algorithm is combined with partial least squares (PLS) for the prediction of additive concentrations in polymer films using Fourier transform-infrared (FT-IR) spectral data. An approach using an iterative application of the genetic algorithm is proposed. This approach allows all variables to be considered and at the same time minimizes the risk of overfitting. We demonstrate that the variables selected by the genetic algorithm are consistent with expert knowledge. This very exciting result is a convincing demonstration that the algorithm can select correct variables in an automated fashion.
15.
To date, few efforts have been made to take simultaneous advantage of the local nature of spectral data in both the time and frequency domains in a single regression model. We describe here the use of a novel chemometrics algorithm using the wavelet transform. We call the algorithm dual-domain regression, as the regression step defines a weighted model in the time domain based on the contributions of parallel, frequency-domain models made from wavelet coefficients reflecting different scales. In principle, any regression method can be used, and implementations of the algorithm using partial least squares regression and principal component regression are reported here. The performance of the models produced by the algorithm is generally superior to that of regular partial least squares (PLS) or principal component regression (PCR) models applied to data restricted to a single domain. Dual-domain PLS and PCR algorithms are applied to near infrared (NIR) spectral datasets of Cargill corn samples and to sets of spectra collected on batch chemical reactions run in different reactors to illustrate the improved robustness of the modeling.
16.
In the present work, we explored the possibility of using near-infrared spectroscopy to quantify the degree of adulteration of durum wheat flour with common bread wheat flour. The multivariate calibration techniques adopted for this purpose were PLS and a wavelet-based calibration algorithm, recently developed by some of us, called WILMA. Both techniques provided satisfactory results, the percentage of adulterant present in the samples being quantified with an uncertainty lower than that associated with the Italian official method. In particular the WILMA algorithm, by performing feature selection, made it possible to avoid signal pretreatment and to obtain more parsimonious models.
17.
A new hybrid algorithm is proposed to eliminate the interference information, which includes noise, background and systematic spectral variation irrelevant to concentration, for multivariate calibration of near-infrared (NIR) spectra. The method consists of two parts: an approximate derivative based on the continuous wavelet transform (CWT) and orthogonal signal correction (OSC). After the approximate derivative was calculated by CWT, OSC was performed. The method was successfully applied to real complex NIR spectral data to eliminate the interference information. Correction for the interference of NIR spectra resulted in a substantial improvement in the prediction precision, and a more concise calibration model was obtained. The proposed procedure also compared favourably with several pretreatment methods, and the new method appears to provide a high-performance pretreatment tool for multivariate calibration of NIR spectra. In addition, the strategy proposed here can be applied to various other spectral data for quantitative purposes as well.
18.
19.
The non-linear regression technique known as the alternating conditional expectations (ACE) method is only applicable when the number of objects available for calibration is considerably greater than the number of predictors considered. Alternating conditional expectations regression with selection of significant predictors by genetic algorithms (GA-ACE), the non-linear regression technique presented here, is based on the ACE algorithm but introduces several modifications to resolve the applicability limitations of the original ACE method, thus facilitating the practical implementation of a very interesting calibration tool. In order to overcome the lack of reliability displayed by the original ACE algorithm when working on data sets characterized by too large a number of variables, and prior to the development of the non-linear regression model, GA-ACE applies genetic algorithms as a variable selection technique to select a reduced subset of significant predictors able to accurately model and predict the response variable considered. Furthermore, GA-ACE provides two alternative application approaches, since it allows either prior data compression, by computing a number of principal components that are subsequently subjected to GA selection, or working directly on the original variables. In this study, GA-ACE was applied to two real calibration problems with a very low observation/variable ratio (NIR data), and the results were compared with those obtained by several commonly employed linear regression techniques. When using the GA-ACE non-linear method, notably improved regression models were developed for the two response variables modeled, with root mean square errors of the residuals in external prediction (RMSEP) equal to 11.51 and 6.03% for moisture and lipid contents of roasted coffee samples, respectively. The improvement achieved by applying the new non-linear method is even more remarkable taking into account the results obtained with the best-performing linear method (IPW-PLS) applied to predict the studied responses (14.61 and 7.74% RMSEP, respectively).
20.
Direct orthogonal signal correction (DOSC) is applied to correct for major variance sources such as temperature effects, time influences and instrumental differences in near infrared (NIR) data. The samples analysed are creams containing different concentrations of an active drug. The final aim is to classify the samples according to their concentration of active compound. Once DOSC has been performed on the data, it is no longer necessary to apply sophisticated chemometric techniques to correct for temperature or time effects and to assign the samples to their respective concentration classes. Moreover, the application of DOSC to the NIR spectra recorded on two different instruments shows that this method can be considered a valuable alternative for standardisation in classification applications. Since the applied algorithm tends to overfit, in the second part of this paper a comparison is made with an algorithm designed by Westerhuis, which should overcome this problem. Although the calibration set results show that the overfitting has been partially corrected by the latter algorithm, the test set results did not improve significantly.