Similar articles
 20 similar articles found (search time: 31 ms)
1.
Orthogonal WAVElet correction (OWAVEC) is a pre-processing method aimed at simultaneously accomplishing two essential needs in multivariate calibration, signal correction and data compression, by combining the application of an orthogonal signal correction algorithm to remove information unrelated to a certain response with the great potential that wavelet analysis has shown for signal processing. In the previous version of the OWAVEC method, once the wavelet coefficients matrix had been computed from NIR spectra and deflated from irrelevant information in the orthogonalization step, effective data compression was achieved by selecting the wavelet coefficients with the largest correlation/variance as the basis for developing a reliable regression model. This paper presents an evolution of the OWAVEC method, maintaining the first two stages in its application procedure (wavelet signal decomposition and direct orthogonalization) intact but incorporating genetic algorithms as a wavelet coefficients selection method to perform data compression and to improve the quality of the regression models developed later. Several specific applications dealing with diverse NIR regression problems are analyzed to evaluate the actual performance of the new OWAVEC method. Results provided by OWAVEC are also compared with those obtained with original data and with other orthogonal signal correction methods.

2.
The wavelet packet transform (WPT) is a variant of the standard wavelet transform that offers greater flexibility in the decomposition of instrumental signals. Although encouraging results have been published concerning the use of WPT for signal compression and denoising, its application in multivariate calibration problems has received comparatively little attention, with very few contributions reported in the literature. This paper presents an investigation concerning the use of WPT as a feature extraction tool to improve the prediction ability of PLS models. The optimization of the wavelet packet tree is accomplished by using the classic dynamic programming algorithm and an entropy cost function modified to take into account the variance explained by the WPT coefficients. The selection of WPT coefficients for inclusion in the PLS model is carried out on the basis of correlation with the dependent variable, in order to exploit the joint statistics of the instrumental response and the parameter of interest. This WPT-PLS strategy is applied in a case study involving FT-IR spectrometric determination of four gasoline parameters, namely specific mass (SM) and the distillation temperatures at which 10%, 50%, 90% of the sample has evaporated. The dataset comprises 103 gasoline samples collected from gas stations and 6144 wavelengths in the range 2500-15000 nm. By applying WPT to the FT-IR spectra, considerable compression with respect to the original wavelength domain is achieved. The effect of varying the wavelet and the threshold level on the prediction ability of the resulting models is investigated. The results show that WPT-PLS outperforms standard PLS in most wavelet-threshold combinations for all determined parameters.
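
The correlation-based selection of wavelet coefficients described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: a plain Haar wavelet decomposition stands in for the optimized wavelet packet tree, the function names (`haar_dwt`, `pearson`, `select_coefficients`) and the parameter `n_keep` are hypothetical, and each spectrum is assumed to have a length that is a power of two.

```python
import math


def haar_dwt(signal):
    """Full orthonormal Haar decomposition (assumes len(signal) is a power of two).

    Returns [approximation, coarsest detail, ..., finest details].
    """
    details = []
    approx = list(signal)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        details = detail + details  # coarser levels go in front
    return approx + details


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_coefficients(spectra, y, n_keep):
    """Indices of the n_keep coefficients most correlated (in absolute value) with y."""
    coeffs = [haar_dwt(s) for s in spectra]
    n_coef = len(coeffs[0])
    scores = [abs(pearson([row[j] for row in coeffs], y)) for j in range(n_coef)]
    ranked = sorted(range(n_coef), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:n_keep])
```

The retained coefficient columns would then serve as the predictors of a PLS model; in practice the number of retained coefficients (or a correlation threshold) would be tuned by validation.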

3.
《Analytical letters》2012,45(1):171-183
Based on wavelet transformation (WT) and mutual information (MI), a simple and effective procedure is proposed for multivariate calibration of near-infrared spectroscopy. In this procedure, the original spectra of the training set are first transformed into a set of wavelet representations by the wavelet prism transform. Then, the MI value between each wavelet coefficient variable and the dependent variable is calculated, resulting in an MI spectrum; by retaining a subset of coefficients with higher MI, an updated training set consisting of wavelet coefficients is obtained and reconstructed (converted back) to the original domain. Based on this, a partial least squares (PLS) model can be constructed and optimized. The optimal wavelet and decomposition level are determined by experiment. A NIR quantitative problem involving the determination of total sugar in tobacco is used to demonstrate the overall performance of the proposed procedure, named RPLS, meaning PLS in the reconstructed original domain coupled with MI-induced variable selection in the wavelet domain. Three procedures, namely conventional full-spectrum PLS in the original domain (FPLS), PLS in the original domain coupled with MI-induced variable selection (OPLS), and direct PLS on MI-based wavelet coefficients (WPLS), are used as references. The results confirm that RPLS can build more accurate and robust calibration models without increasing the complexity.

4.
We propose a new data compression method for estimating optimal latent variables in multi‐variate classification and regression problems where more than one response variable is available. The latent variables are found according to a common innovative principle combining PLS methodology and canonical correlation analysis (CCA). The suggested method is able to extract predictive information for the latent variables more effectively than ordinary PLS approaches. Only simple modifications of existing PLS and PPLS algorithms are required to adopt the proposed method. Copyright © 2009 John Wiley & Sons, Ltd.

5.
Chen D  Hu B  Shao X  Su Q 《The Analyst》2004,129(7):664-669
Variable selection is often used to produce more robust and parsimonious regression models. However, when it is applied directly to raw near-infrared spectra, it is not easy to select appropriate variables because background and noise often overshadow or overlap the absorption bands of the analyte. In this work, a new hybrid algorithm based on the selection of the most informative variables in the continuous wavelet transform (CWT) domain is described. The strategy is a combination of CWT and a modified iterative predictor weighting-partial least squares (mIPW-PLS) procedure. After elimination of the background and noise in NIR spectra by CWT, the mIPW-PLS approach is used to select the most informative CWT coefficients. Finally, a PLS model is built from the selected CWT coefficients for prediction. The results indicate that extracting the most important variables in the CWT domain can effectively avoid the interference of background and noise, and results in a high-quality regression model with a very small number of variables and fewer PLS components.

6.
Determination of benzo[a]pyrene (BaP) in cigarette smoke is important for tobacco quality control and for assessing its harm to human health. In this study, mid-infrared spectroscopy (MIR) coupled with a chemometric algorithm (DPSO-WPT-PLS), based on the wavelet packet transform (WPT), the discrete particle swarm optimization algorithm (DPSO) and partial least squares regression (PLS), was used to quantify the harmful ingredient benzo[a]pyrene in cigarette mainstream smoke, with promising results. Furthermore, the proposed method performed better than several other chemometric models, i.e., PLS, radial basis function-based PLS (RBF-PLS), PLS with stepwise regression variable selection (Stepwise-PLS), and WPT-PLS with informative wavelet coefficients selected by a correlation coefficient test (rtest-WPT-PLS). The proposed strategy could become a new, effective and rapid quantitative technique for analyzing the harmful ingredient BaP in cigarette mainstream smoke.

7.
A modified partial least squares (PLS) algorithm is presented on the basis of a novel weight updating strategy. The new weights can handle situations in which directions in X space have large variance unrelated to Y, where linear PLS may not work well. In the proposed algorithm, the slice transform technique is introduced to provide a piecewise linear representation of the weight vectors. Then, the corresponding mapping functions are estimated by a least squares criterion on the inner relation between the observed variables and the scores of the response variables. Finally, the weight vectors are updated by the obtained mapping functions, and the corresponding scores and loadings are calculated with the new weights. The proposed method thus achieves an optimal piecewise linear replacement of the PLS weights. The predictive performances of the new approach and other methods are compared statistically using the Wilcoxon signed rank test. Experimental results show that the proposed method can achieve simpler models, while the model performance is at least comparable with that of PLS and other methods. Copyright © 2012 John Wiley & Sons, Ltd.

8.
An algorithm is proposed for extracting relevant information from near-infrared (NIR) spectra for multivariate calibration of routine components in complex plant samples. The algorithm is a combination of wavelet transform (WT) data compression and a procedure for uninformative variable elimination (UVE). After compression of the NIR spectra by WT, the UVE approach is used to eliminate the irrelevant wavelet coefficients. Finally, a calibration model is built from the retained wavelet coefficients to enable prediction. Because irrelevant information can be removed from the spectra used for multivariate calibration, the model based on the extracted relevant features is better than those obtained with full-spectrum data. Both prediction precision and calculation speed are improved.

9.
A novel strategy for the optimization of wavelet transforms with respect to the statistics of the data set in multivariate calibration problems is proposed. The optimization follows a linear semi-infinite programming formulation, which does not suffer from local maxima and can be reproducibly solved with modest computational effort. After the optimization, a variable selection algorithm is employed to choose a subset of wavelet coefficients with minimal collinearity. The selection allows the building of a calibration model by direct multiple linear regression on the wavelet coefficients. In an illustrative application involving the simultaneous determination of Mn, Mo, Cr, Ni, and Fe in steel samples by ICP-AES, the proposed strategy yielded more accurate predictions than PCR, PLS, and nonoptimized wavelet regression.

10.
A hybrid model combining partial least squares (PLS) and the wavelet transform is proposed to analyze the carbon content of coal by using laser-induced breakdown spectroscopy (LIBS). The hybrid model uses two wavelet analysis steps, environmental denoising and background noise reduction, to pretreat the LIBS spectrum. The processed wavelet coefficients, which contain the discrete line information of the spectra, were taken as inputs to the PLS model for calibration and prediction of carbon. A higher signal-to-noise ratio of the carbon line was obtained after environmental denoising, and the best decomposition level was determined after background noise reduction. The hybrid model resulted in a significant improvement over the conventional PLS method under different ambient environments, namely air, argon, and helium. The average relative error of carbon decreased from 2.74% to 1.67% under an ambient helium environment, which indicated a significantly improved accuracy in the measurement of carbon in coal. The best results obtained under an ambient helium environment could be partly attributed to the smallest interference by noise after wavelet denoising. A similar improvement was observed in ambient air and argon environments, supporting the applicability of the hybrid model under different experimental conditions.

11.
Optimized sample-weighted partial least squares   Cited by: 2 (self-citations: 0; citations by others: 2)
Lu Xu 《Talanta》2007,71(2):561-566
In ordinary multivariate calibration methods, once the calibration set has been chosen to build the model describing the relationship between the dependent variables and the predictor variables, each sample in the calibration set makes the same contribution to the model, and differences in representativeness between the samples are ignored. In this paper, by introducing the concept of weighted sampling into partial least squares (PLS), a new multivariate regression method, optimized sample-weighted PLS (OSWPLS), is proposed. OSWPLS differs from PLS in that it builds a new calibration set in which each sample of the original calibration set is weighted differently to account for its representativeness, in order to improve the prediction ability of the algorithm. A recently suggested global optimization algorithm, particle swarm optimization (PSO), is used to search for the best sample weights to optimize the calibration of the original training set and the prediction of an independent validation set. The proposed method is applied to two real data sets and compared with PLS. The most significant improvement is obtained for the meat data, where the root mean squared error of prediction (RMSEP) is reduced from 3.03 to 2.35. For the fuel data, OSWPLS performs slightly better than, or no worse than, PLS for the prediction of the four analytes. The stability and efficiency of OSWPLS are also studied; the results demonstrate that the proposed method can obtain desirable results within a moderate number of PSO cycles.

12.
A novel method, named the wavelet packet transform based Elman recurrent neural network (WPTERNN), was proposed for the simultaneous UV–visible spectrometric determination of Cu(II), Cd(II) and Zn(II). This method combined wavelet packet denoising with an Elman recurrent neural network. A wavelet packet transform was applied to perform data compression, to extract relevant information, and to eliminate noise and collinearity. An Elman recurrent network was applied for nonlinear multivariate calibration. In this case, by trial, the wavelet function, the decomposition level, and the number of hidden nodes for the WPTERNN method were selected as Daubechies 14, 3, and 8, respectively. A program (PWPTERNN) was designed to perform the simultaneous determination of Cu(II), Cd(II) and Zn(II). The relative standard errors of prediction (RSEP) obtained for all components using WPTERNN, an Elman recurrent neural network (ERNN), partial least squares (PLS), principal component regression (PCR), Fourier transform based PCR (FTPCR), and multivariate linear regression (MLR) were compared. Experimental results demonstrated that the WPTERNN method was successful even where there was severe overlap of spectra. The results obtained from an additional test case also demonstrated that the WPTERNN method performed very well. Figure: part of the WP coefficients obtained by wavelet packet transforms.

13.
A novel method, named the OSC-WPT-PLS approach and based on partial least squares (PLS) regression with orthogonal signal correction (OSC) and the wavelet packet transform (WPT) as pre-processing tools, was proposed for the simultaneous spectrophotometric determination of Al(III), Mn(II) and Co(II). This method combines the ideas of OSC and WPT with PLS regression to enhance the ability to extract characteristic information and the quality of regression. OSC is used to remove from the response matrix D the structured noise that is orthogonal to the concentration matrix C. The wavelet packet transform was applied to perform data compression, to extract relevant information, and to eliminate noise and collinearity. PLS was applied for multivariate calibration and noise reduction by eliminating the less important latent variables. In this case, by trial, the wavelet function, the decomposition level, the number of OSC components and the number of PLS factors for the OSC-WPT-PLS method were selected as Daubechies 4, 3, 2 and 3, respectively. A program (POSCWPTPLS) was designed to perform the simultaneous spectrophotometric determination of Al(III), Mn(II) and Co(II). The relative standard errors of prediction (RSEP) obtained for total elements using OSC-WPT-PLS, WPT-PLS and PLS were compared. Experimental results demonstrated that the OSC-WPT-PLS method had the best performance among the three methods and was successful even when there was severe overlap of spectra.
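
The abstract compares methods by their relative standard error of prediction (RSEP) but does not state the formula. A definition commonly used in the chemometrics literature is 100 × sqrt(Σ(ĉ − c)² / Σc²), and the sketch below assumes that definition; the function name `rsep` and its argument names are hypothetical.

```python
import math


def rsep(predicted, actual):
    """Relative standard error of prediction, in percent.

    Assumes the common definition 100 * sqrt(sum((c_hat - c)^2) / sum(c^2));
    the abstract itself does not give the formula.
    """
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    squared_actual = sum(a ** 2 for a in actual)
    return 100.0 * math.sqrt(squared_error / squared_actual)
```

Because the error is scaled by the magnitude of the reference concentrations, RSEP values can be compared across analytes measured at different concentration levels.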

14.
The insights and conclusions of this paper motivate efficient and numerically robust ‘new’ variants of algorithms for solving the single response partial least squares regression (PLS1) problem. Prototype MATLAB code for these variants is included in the Appendix. The analysis of and conclusions regarding PLS1 modelling are based on a rich and nontrivial application of numerous key concepts from elementary linear algebra. The investigation starts with a simple analysis of the nonlinear iterative partial least squares (NIPALS) PLS1 algorithm variant computing orthonormal scores and weights. A rigorous interpretation of the squared P‐loadings as the variable‐wise explained sum of squares is presented. We show that the orthonormal row‐subspace basis of W‐weights can be found from a recurrence equation. Consequently, the NIPALS deflation steps of the centered predictor matrix can be replaced by a corresponding sequence of Gram–Schmidt steps that compute the orthonormal column‐subspace basis of T‐scores from the associated non‐orthogonal scores. The transitions between the non‐orthogonal and orthonormal scores and weights (illustrated by an easy‐to‐grasp commutative diagram), respectively, are both given by QR factorizations of the non‐orthogonal matrices. The properties of singular value decomposition combined with the mappings between the alternative representations of the PLS1 ‘truncated’ X data (including P t W) are taken to justify an invariance principle to distinguish between the PLS1 truncation alternatives. The fundamental orthogonal truncation of PLS1 is illustrated by a Lanczos bidiagonalization type of algorithm where the predictor matrix deflation is required to be different from the standard NIPALS deflation. A mathematical argument concluding the PLS1 inconsistency debate (published in 2009 in this journal) is also presented. Copyright © 2014 John Wiley & Sons, Ltd.
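
For orientation, the textbook NIPALS PLS1 recursion that this abstract takes as its starting point can be sketched as below. This is not the paper's MATLAB code or any of its proposed variants: it is a plain-Python illustration with unit-norm weights and unnormalized (mutually orthogonal) scores, and the names `pls1_fit`, `pls1_predict` and `dot` are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pls1_fit(X, y, n_components):
    """Textbook NIPALS PLS1 on mean-centered data; returns (x_mean, y_mean, components)."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    components = []
    for _ in range(n_components):
        w = [dot([E[i][j] for i in range(n)], f) for j in range(m)]  # w ~ E'f
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:  # nothing left in E that covaries with f
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                                # scores
        tt = dot(t, t)
        p = [dot([E[i][j] for i in range(n)], t) / tt for j in range(m)]  # loadings
        q = dot(f, t) / tt                                            # inner coefficient
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]  # deflate X
        f = [f[i] - q * t[i] for i in range(n)]                       # deflate y
        components.append((w, p, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict one sample by repeating the deflation steps on the new row."""
    x_mean, y_mean, components = model
    e = [a - b for a, b in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = dot(e, w)
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, p)]
    return y_hat
```

Predicting by sequential deflation avoids forming the regression vector explicitly; with as many components as the rank of the centered X, the fit coincides with ordinary least squares.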

15.
Kernel partial least squares (KPLS), a nonlinear extension of linear PLS, has become a popular technique for chemical and biological modeling. Training samples are transformed into a feature space via a nonlinear mapping, and the PLS algorithm is then carried out in the feature space. However, one of the main limitations of KPLS is that each feature is given the same importance in the kernel matrix, which explains the poor performance of KPLS for data with many irrelevant features. In this study, we provide a new strategy that incorporates variable importance into KPLS, termed the WKPLS approach. By modifying the kernel matrix, the WKPLS approach provides a feasible way to differentiate between the true and noise variables. On the basis of the fact that the regression coefficients of the PLS model reflect the importance of variables, we first obtain the normalized regression coefficients by establishing the PLS model with all the variables. Then, variable importance is incorporated into the primary kernel. The performance of WKPLS is investigated with one simulated dataset and two structure–activity relationship (SAR) datasets. Compared with standard linear kernel PLS and Gaussian kernel PLS, WKPLS yields superior prediction performance. WKPLS can be considered a good mechanism for introducing extra information to improve the performance of KPLS for modeling SAR.

16.
To date, few efforts have been made to take simultaneous advantage of the local nature of spectral data in both the time and frequency domains in a single regression model. We describe here the use of a novel chemometrics algorithm using the wavelet transform. We call the algorithm dual-domain regression, as the regression step defines a weighted model in the time-domain based on the contributions of parallel, frequency-domain models made from wavelet coefficients reflecting different scales. In principle, any regression method can be used, and implementations of the algorithm using partial least squares regression and principal component regression are reported here. The performance of the models produced from the algorithm is generally superior to that of regular partial least squares (PLS) or principal component regression (PCR) models applied to data restricted to a single domain. Dual-domain PLS and PCR algorithms are applied to near infrared (NIR) spectral datasets of Cargill corn samples and sets of spectra collected on batch chemical reactions run in different reactors to illustrate the improved robustness of the modeling.

17.
In spectroscopy the measured spectra are typically plotted as a function of the wavelength (or wavenumber), but analysed with multivariate data analysis techniques (multiple linear regression (MLR), principal components regression (PCR), partial least squares (PLS)) which consider the spectrum as a set of m different variables. From a physical point of view it could be more informative to describe the spectrum as a function rather than as a set of points, thereby taking into account the physical background of the spectrum, being a sum of absorption peaks for the different chemical components, where the absorbances at two wavelengths close to each other are highly correlated. In a first part of this contribution, a motivating example for this functional approach is given. In a second part, the potential of functional data analysis is discussed in the field of chemometrics and compared to the ubiquitous PLS regression technique using two practical data sets. It is shown that for spectral data, B-splines provide an appealing basis to accurately describe the data. By applying both functional data analysis and PLS on the data sets, the predictive ability of functional data analysis is found to be comparable to that of PLS. Moreover, many chemometric datasets have some specific structure (e.g. replicate measurements on the same object, or objects that are grouped), but the structure is often removed before analysis (e.g. by averaging the replicates). In the second part of this contribution, we suggest a method to adapt traditional analysis of variance (ANOVA) methods to datasets with spectroscopic data. In particular, the possibilities to explore and interpret sources of variation, such as variations in sample and ambient temperature, are examined. Copyright © 2008 John Wiley & Sons, Ltd.

18.
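Using surface-enhanced Raman spectroscopy with gold nanoparticles as the enhancing agent, the Raman spectral signals are transformed into wavelet space by the continuous wavelet transform, with the Mexican hat wavelet as the wavelet basis. This step reduces the influence of baseline variation and random noise in the signal and locates the peak positions and the optimal wavelet scale coefficients. Based on the information in wavelet space, a reverse search is performed on the mixture spectrum and the standard spectra to obtain the reverse match quality (RMQ), which serves as the criterion for judging whether a target component is present in the mixture. The algorithm can accurately identify target substances in mixtures and has been successfully applied to the identification of colorants in a variety of foods. The detection rate for colorants in foods reached 99%, with robust results, clearly outperforming the traditional hit quality index (HQI) method. This confirms that reverse searching in wavelet space is a fast and accurate qualitative algorithm for Raman spectra.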
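
For context, the hit quality index (HQI) used as the baseline in this abstract is commonly defined as the squared normalized dot product of the unknown and library spectra. The sketch below assumes that definition and two spectra sampled on the same axis; the function name `hit_quality_index` is hypothetical, and the abstract's own RMQ measure is not reproduced here because its formula is not given.

```python
def hit_quality_index(unknown, library):
    """HQI = (u . l)^2 / ((u . u) * (l . l)); 1.0 means identical spectral shape.

    Assumes both spectra are sampled on the same axis and are not all zero.
    """
    cross = sum(u * l for u, l in zip(unknown, library))
    uu = sum(u * u for u in unknown)
    ll = sum(l * l for l in library)
    return cross ** 2 / (uu * ll)
```

Because HQI compares whole spectra, extra bands contributed by other components of a mixture lower the score, which is the kind of situation a reverse search is meant to handle.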

19.
《Analytical letters》2012,45(13):2189-2206
In the study of voltammetric electronic tongues, a key point is the preprocessing of the raw information, namely the voltammograms that form the response of the sensor array, prior to classification or modeling with advanced chemometric tools. This work demonstrates the use of the discrete wavelet transform (DWT) for compressing these voltammograms prior to modeling. After compression, a system based on artificial neural networks (ANNs) was used for the quantification of the electroactive substances present, using the obtained wavelet decomposition coefficients as inputs. The fourth-order Daubechies wavelet permitted an effective compression down to 16 coefficients, reducing the original dimension by ca. 10 times. The case studied is a mixture of three oxidizable amino acids: tryptophan, cysteine, and tyrosine. With the reduced information, one ANN per species was trained using the Bayesian regularization algorithm. The proposed procedure was compared with the more conventional treatments of downsampling the voltammogram or extracting its features by principal component analysis prior to the ANNs.

20.
Regression from high-dimensional observation vectors is particularly difficult when training data are limited. Partial least squares (PLS) partly solves the high-dimensional regression problem by projecting the data onto a latent variable space. The key issue in PLS is the computation of the weight vector, which describes the covariance between the responses and observations. For small-sample-size, high-dimensional regression problems, the covariance estimate is usually inaccurate and the correlated components in the predictors distort the PLS weights. In this paper, we propose a sparse matrix transform (SMT) based PLS (SMT-PLS) method for high-dimensional spectroscopy regression. In SMT-PLS, the observation data are first decorrelated by SMT. Then, in the decorrelated data space, the PLS loading weight is computed by least squares regression. The SMT technique provides an accurate estimate of the data covariance, which can overcome the effect of small sample size and benefits both the PLS weight computation and the subsequent regression prediction. The proposed SMT-PLS method is compared, in terms of root mean square error of prediction, with PLS, Power PLS and PLS with orthogonal scatter correction on four real spectroscopic data sets. Experimental results demonstrate the effectiveness of the proposed method.
