Similar literature
1.
This paper is about how to incorporate interaction effects in multi‐block methodologies. The method proposed is inspired by polynomial regression modelling in the case with only a few independent variables but extends/generalises the idea to situations where the blocks are potentially very large with respect to the number of variables. The method follows a so‐called type I sums of squares strategy where the linear effects (main effects) are incorporated sequentially and before the interactions. The sequential and orthogonalised partial least squares (SO‐PLS) technique is used as a basis for the proposal. The SO‐PLS method is based on sequential estimation of each new block by the PLS regression method after orthogonalisation with respect to blocks already fitted. The new method preserves the invariance already established for SO‐PLS and can be used for blocks with different dimensionality. The method is tested on one real data set with two independent blocks with different complexity and on a simulated data set with a large number of variables in each block. Copyright © 2011 John Wiley & Sons, Ltd.

2.
Advances in sensory systems have led to many industrial applications with large amounts of highly correlated data, particularly in chemical and pharmaceutical processes. With these correlated data sets, it becomes important to consider advanced modeling approaches built to deal with correlated inputs in order to understand the underlying sources of variability and how this variability will affect the final quality of the product. In addition to the correlated nature of the data sets, it is also common to find missing elements and noise in these data matrices. Latent variable regression methods such as partial least squares or projection to latent structures (PLS) have gained much attention in industry for their ability to handle ill‐conditioned matrices with missing elements. This feature of the PLS method is accomplished through the nonlinear iterative PLS (NIPALS) algorithm, with a simple modification to consider the missing data. Moreover, in expectation maximization PLS (EM‐PLS), imputed values are provided for missing data elements as initial estimates, conventional PLS is then applied to update these elements, and the process iterates to convergence. This study is the extension of previous work for principal component analysis (PCA), where we introduced nonlinear programming (NLP) as a means to estimate the parameters of the PCA model. Here, we focus on the parameters of a PLS model. As an alternative to modified NIPALS and EM‐PLS, this paper presents an efficient NLP‐based technique to find model parameters for PLS, where the desired properties of the parameters can be explicitly posed as constraints in the optimization problem of the proposed algorithm. We also present a number of simulation studies, where we compare the effectiveness of the proposed algorithm with that of competing algorithms. Copyright © 2014 John Wiley & Sons, Ltd.
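The "simple modification" of NIPALS for missing data mentioned above can be illustrated with a minimal sketch. It assumes one commonly described form of the modification, in which every inner regression is computed only over the observed entries; it is not taken from the paper. The function name `nipals_pls1_component` and the restriction to a single response and a single component are illustrative assumptions.

```python
import numpy as np

def nipals_pls1_component(X, y):
    """One PLS1 component; NaN entries of X are skipped in every regression.

    Hypothetical helper for illustration only. y must be complete.
    """
    observed = (~np.isnan(X)).astype(float)   # 1.0 where X is observed
    X0 = np.where(np.isnan(X), 0.0, X)        # missing entries contribute 0

    # Weights: regress each column of X on y over the observed rows only.
    w = (X0.T @ y) / (observed.T @ (y ** 2))
    w = w / np.linalg.norm(w)

    # Scores: regress each row of X on w over the observed columns only.
    t = (X0 @ w) / (observed @ (w ** 2))

    # X loadings (observed rows only) and the y loading.
    p = (X0.T @ t) / (observed.T @ (t ** 2))
    q = (y @ t) / (t @ t)
    return w, t, p, q

# Usage: a small matrix with one missing element.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(8, 4))
y_demo = rng.normal(size=8)
X_demo[0, 1] = np.nan
w, t, p, q = nipals_pls1_component(X_demo, y_demo)
```

With complete data the sketch reduces to the usual first PLS1 component (weights proportional to X'y, scores Xw), which is a convenient sanity check.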

3.
The on‐line monitoring of batch processes based on principal component analysis (PCA) has been widely studied. Nonetheless, researchers have paid less attention to the on‐line application of partial least squares (PLS). In this paper, the influence of several issues on the predictive power of a PLS model for the on‐line estimation of key variables in a batch process is studied. Some of the conclusions can help to better understand the capabilities of the proposals presented for on‐line PCA‐based monitoring. Issues like the convenience of batch‐wise or variable‐wise unfolding, the method for the imputation of future measurements and the use of several sub‐models are addressed. This is the first time that the adaptive hierarchical (or multi‐block) approach is extended to PLS modelling. Also, the formulation of the so‐called trimmed scores regression (TSR), a powerful imputation method defined for PCA, is extended for its application with PLS modelling. Data from two processes, one simulated and one real, are used to illustrate the results. Copyright © 2008 John Wiley & Sons, Ltd.

4.
Partial Least Squares (PLS) is a wide class of regression methods aiming at modelling relationships between sets of observed variables by means of latent variables. Specifically, PLS2 was developed to correlate two blocks of data, the X‐block representing the independent or explanatory variables and the Y‐block representing the dependent or response variables. Lately, OPLS was introduced to further reduce model complexity by removing Y‐orthogonal sources of variation from X in the latent space, thus improving data interpretation through the generated predictive latent variables. Nevertheless, relationships between PLS2 and OPLS in the case of multiple Y‐responses have not yet been fully explored. With this perspective and taking inspiration from some basic mathematical properties of PLS2, we here present a novel and general approach consisting of a post‐transformation of PLS2 (ptPLS2), which results in a decomposition of the latent space into orthogonal and predictive components, while preserving the same goodness of fit and predictive ability of PLS2. Additionally, we discuss the application of the ptPLS2 approach to two metabolomic data sets extracted from earlier published studies and its advantages in model interpretation as compared with the ‘standard’ PLS approach. Copyright © 2016 John Wiley & Sons, Ltd.

5.
The combination of unfolded partial least‐squares (U‐PLS) with residual bilinearization (RBL) provides a second‐order multivariate calibration method capable of achieving the second‐order advantage. RBL is performed by varying the test sample scores in order to minimize the residuals of a combined U‐PLS model for the calibrated components and a principal component model for the potential interferents. The sample scores are then employed to predict the analyte concentration, with regression coefficients taken from the calibration step. When the contribution of multiple potential interferents is severe, particle swarm optimization (PSO) helps to prevent RBL from being trapped in false minima, restoring its predictive ability and making it comparable to the standard parallel factor (PARAFAC) analysis. Both simulated and experimental systems are analyzed in order to show the potential of the new technique. Copyright © 2007 John Wiley & Sons, Ltd.

6.
Extension of standard regression to the case of multiple regressor arrays is given via the Kronecker product. The method is illustrated using ordinary least squares regression (OLS) as well as the latent variable (LV) methods principal component regression (PCR) and partial least squares regression (PLS). Denoting the method applied to PLS as mrPLS, the latter was shown to explain as much or more variance for the first LV relative to the comparable L‐partial least squares regression (L‐PLS) model. The same relationship holds when mrPLS is compared to PLS or n‐way partial least squares (N‐PLS) and the response array is 2‐way or 3‐way, respectively, where the regressor array corresponding to the first mode of the response array is 2‐way and the second mode regressor array is an identity matrix. In a comparison with N‐PLS using fragrance data, mrPLS proved superior in a validation sense when model selection was used. Though the focus is on 2‐way regressor arrays, the method can be applied to n‐way regressors via N‐PLS. Copyright © 2007 John Wiley & Sons, Ltd.
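As a rough illustration of how the Kronecker product lets two regressor arrays enter an ordinary least squares fit, the sketch below assumes a bilinear model Y = X1 B X2' (an assumption made here for illustration; the abstract does not state the model form). It relies on the standard identity vec(A X B) = (B' ⊗ A) vec(X) with column-major vectorisation. The function name `kron_ols` is hypothetical.

```python
import numpy as np

def kron_ols(Y, X1, X2):
    """OLS estimate of B in the assumed model Y = X1 @ B @ X2.T.

    X1 holds regressors for the rows of Y, X2 for the columns of Y.
    """
    p1, p2 = X1.shape[1], X2.shape[1]
    Z = np.kron(X2, X1)                    # design matrix for vec(B)
    y_vec = Y.flatten(order="F")           # column-major vec(Y)
    b, *_ = np.linalg.lstsq(Z, y_vec, rcond=None)
    return b.reshape(p1, p2, order="F")    # undo the vectorisation

# Usage on noise-free simulated data.
rng = np.random.default_rng(0)
X1_demo = rng.normal(size=(6, 2))
X2_demo = rng.normal(size=(5, 3))
B_demo = rng.normal(size=(2, 3))
B_hat = kron_ols(X1_demo @ B_demo @ X2_demo.T, X1_demo, X2_demo)
```

Setting `X2` to an identity matrix recovers ordinary multi-response regression of Y on X1, which matches the special case mentioned in the abstract.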

7.
The issue of outer model weight updating is important in extending partial least squares (PLS) regression to modelling data that shows significant non‐linearity. This paper presents a novel co‐evolutionary component approach to the weight updating problem. Specification of the non‐linear PLS model is achieved using an evolutionary computational (EC) method that can co‐evolve all non‐linear inner models and all input projection weights simultaneously. In this method, modular symbolic non‐linear equations are used to represent the inner models and binary sequences are used to represent the projection weights. The approach is flexible, and other representations could be employed within the same co‐evolutionary framework. The potential of these methods is illustrated using a simulated pH neutralisation process data set exhibiting significant non‐linearity. It is demonstrated that the co‐evolutionary component architecture can produce results which are competitive with non‐linear neural network‐based PLS algorithms that use iterative projection weight updating. In addition, a data sampling method for mitigating overfitting to the training data is described. Copyright © 2007 John Wiley & Sons, Ltd.

8.
Two multivariate calibration methods, partial least squares (PLS) and orthogonal signal correction combined with partial least squares (OSC‐PLS), were employed for the prediction of total antioxidant activities of four Prunella L. species. High‐performance liquid chromatography (HPLC) and spectrophotometric approaches were used to determine the total antioxidant activity of the Prunella L. samples. Several preprocessing techniques such as smoothing and normalization were employed to extract the chemically relevant information from the data after alignment with correlation optimized warping. The importance of the preprocessing was investigated by calculating the root mean square error for the calibration set for the total antioxidant activity of Prunella L. samples. The models developed on the basis of the preprocessed data were able to predict the total antioxidant activity with a precision comparable to that of the reference 2,2‐azino‐di‐(3‐ethylbenzothiazoline‐sulfonic acid) and 2,2‐diphenyl‐1‐picrylhydrazyl methods. The OSC‐PLS model seems preferable because of its predictive and descriptive abilities and the good interpretability of the contribution of compounds to the total antioxidant activity. The contribution of individual phenolic compounds to the total antioxidant activity was identified by HPLC. Copyright © 2012 John Wiley & Sons, Ltd.

9.
In chemical and biochemical processes, steady‐state models are widely used for process assessment, control and optimisation. In these models, parameter adjustment requires data collected under nearly steady‐state conditions. Several approaches have been developed for steady‐state identification (SSID) in continuous processes, but no attempt has been made to adapt them to the singularities of batch processes. The main aim of this paper is to propose an automated method based on batch‐wise unfolding of the three‐way batch process data followed by a principal component analysis (Unfold‐PCA) in combination with the methodology of Brown and Rhinehart [2] for SSID. A second goal of this paper is to illustrate how, by using Unfold‐PCA, process understanding can be gained from the analysis of batch‐to‐batch start‐up and transition data. The potential of the proposed methodology is illustrated using historical data from a laboratory‐scale sequencing batch reactor (SBR) operated for enhanced biological phosphorus removal (EBPR). The results demonstrate that the proposed approach can be efficiently used to detect when the batches reach the steady‐state condition, to interpret the overall batch‐to‐batch process evolution and also to isolate the causes of changes between batches using contribution plots. Copyright © 2007 John Wiley & Sons, Ltd.
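Batch‐wise unfolding followed by PCA can be sketched in a few lines. The example assumes the three‐way data are stored as an array of shape (batches, variables, time points) and that only column mean‐centring is applied before PCA; both choices, and the name `unfold_pca`, are assumptions for illustration rather than details taken from the paper.

```python
import numpy as np

def unfold_pca(batches, n_components):
    """Batch-wise unfolding of a (I, J, K) array followed by PCA via SVD.

    Each batch becomes one row of length J*K, so every score row
    summarises a whole batch. Hypothetical helper for illustration.
    """
    n_batches, n_vars, n_times = batches.shape
    X = batches.reshape(n_batches, n_vars * n_times)   # I x (J*K)
    X_centred = X - X.mean(axis=0)                     # remove mean trajectory
    U, s, Vt = np.linalg.svd(X_centred, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T
    return scores, loadings

# Usage: 5 batches, 3 variables, 4 time points each.
rng = np.random.default_rng(0)
scores, loadings = unfold_pca(rng.normal(size=(5, 3, 4)), n_components=2)
```

Plotting the score of each batch against batch number is one simple way to look at batch‐to‐batch evolution; deciding when steady state is reached would still need a separate test such as the one cited in the abstract.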

10.
Target projection (TP), also called target rotation (TR), was introduced to facilitate interpretation of latent‐variable regression models. Orthogonal partial least squares (OPLS) regression and PLS post‐processing by similarity transform (PLS + ST) represent two alternative algorithms for the same purpose. In addition, OPLS and PLS + ST provide components to explain systematic variation in X orthogonal to the response. We show that, for the same number of components, OPLS and PLS + ST provide score and loading vectors for the predictive latent variable that are the same as for TP except for a scaling factor. Furthermore, we show how the TP approach can be extended to become a hybrid of latent‐variable (LV) regression and exploratory LV analysis and thus embrace systematic variation in X unrelated to the response. Principal component analysis (PCA) of the residual variation after removal of the target component is here used to extract the orthogonal components, but X‐tended TP (XTP) permits other criteria for decomposition of the residual variation. If PCA is used for decomposing the orthogonal variation in XTP, the variance of the major orthogonal components obtained for OPLS and XTP is observed to be almost the same, showing the close relationship between the methods. The XTP approach is tested and compared with OPLS for a three‐component mixture analyzed by infrared spectroscopy and a multicomponent mixture measured by near infrared spectroscopy in a reactor. Copyright © 2008 John Wiley & Sons, Ltd.

11.
It is well known that the predictions of orthogonal projections to latent structures (OPLS) and partial least squares regression (PLS1) are identical in the single‐response case. The present paper presents an approach to identification of the complete y‐orthogonal structure by starting from the viewpoint of standard PLS1 regression. Three alternative non‐deflating OPLS algorithms and a modified principal component analysis (PCA)‐driven method (including MATLAB code) are presented. The first algorithm implements a postprocessing routine of the standard PLS1 solution where QR factorization applied to a shifted version of the non‐orthogonal scores is the key to express the OPLS solution. The second algorithm finds the OPLS model directly by an iterative procedure. By a rigorous mathematical argument, we explain that orthogonal filtering is a ‘built‐in’ property of the traditional PLS1 regression coefficients. Consequently, the capabilities of OPLS with respect to improving the predictions (also for new samples) compared with PLS1 are non‐existent. The PCA‐driven method is based on the fact that truncating off one dimension from the row subspace of X results in a matrix X_orth with y‐orthogonal columns and a rank of one less than the rank of X. The desired truncation corresponds exactly to the first X deflation step of Martens' non‐orthogonal PLS algorithm. The significant y‐orthogonal structure of X found by PCA of X_orth is split into two fundamental parts: one part that is significantly contributing to correct the first PLS score toward y and one part that is not. The third and final OPLS algorithm presented is a modification of Martens' non‐orthogonal algorithm into an efficient dual PLS1–OPLS algorithm. Copyright © 2014 John Wiley & Sons, Ltd.
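The statement about X_orth can be checked numerically. The sketch below assumes the first deflation step of the non‐orthogonal PLS algorithm takes the form X − (Xw)w', with w the normalised weight vector X'y/‖X'y‖; this is a reading of the abstract, not code from the paper (whose own code is in MATLAB), and the name `y_orthogonal_part` is hypothetical.

```python
import numpy as np

def y_orthogonal_part(X, y):
    """Remove the direction w = X'y / ||X'y|| from the row space of X.

    Returns X_orth = X - (X w) w', whose columns are orthogonal to y.
    """
    w = X.T @ y
    w = w / np.linalg.norm(w)
    return X - np.outer(X @ w, w)

# Usage: centred random data.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(10, 4))
X_demo -= X_demo.mean(axis=0)
y_demo = rng.normal(size=10)
y_demo -= y_demo.mean()
X_orth = y_orthogonal_part(X_demo, y_demo)
residual = np.abs(X_orth.T @ y_demo).max()   # numerically zero
```

The orthogonality follows directly: X_orth'y = X'y − w(w'X'y) = X'y − w‖X'y‖ = 0. A PCA of `X_orth` would then give the y‐orthogonal components that the abstract describes.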

12.
13.
Diagnostics are fundamental to multivariate calibration (MC). Two common diagnostics are leverages and spectral F‐ratios, and these have been formulated for many MC methods such as partial least squares (PLS), principal component regression (PCR) and classical least squares (CLS). While these are some of the most common methods of calibration in analytical chemistry, ridge regression is also commonplace, and yet spectral F‐ratios have not been developed for it. Noting that ridge regression is a form of Tikhonov regularization (TR) and using the unifying filter factor representation for MC, this paper develops the filter factor form of leverages and spectral F‐ratios. The approach is applied to a spectral data set to demonstrate computational speed‐up advantages and ease of implementation for the filter factor representation. Copyright © 2010 John Wiley & Sons, Ltd.
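For ridge regression, the filter factor idea can be sketched for leverages using the textbook SVD form of the ridge hat matrix, H = U diag(f) U' with f_j = s_j² / (s_j² + λ). This is the standard result, offered here as an assumption about what the filter factor form looks like; the paper's exact formulas, and its spectral F‐ratios, are not reproduced. The name `ridge_leverages` is hypothetical.

```python
import numpy as np

def ridge_leverages(X, lam):
    """Leverages (diagonal of the ridge hat matrix) via filter factors.

    With X = U diag(s) V', the hat matrix is U diag(f) U', where
    f_j = s_j**2 / (s_j**2 + lam), so h_i = sum_j f_j * U[i, j]**2.
    """
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    filter_factors = s ** 2 / (s ** 2 + lam)
    return (U ** 2) @ filter_factors

# Usage: once the SVD is available, leverages for many values of
# lam need only the cheap final step.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(7, 3))
h = ridge_leverages(X_demo, lam=0.5)
```

Reusing one SVD across a grid of λ values is presumably where a speed‐up of the kind mentioned in the abstract comes from, although the sketch above recomputes the SVD on each call for simplicity.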

14.
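Shao Xueguang, Chen Da, Xu Heng, Liu Zhichao, Cai Wensheng 《Chinese Journal of Chemistry》2009,27(7):1328-1332
Partial least squares (PLS) plays an important role in quantitative analysis by near-infrared (NIR) spectroscopy, but its predictions are easily affected by factors such as how the samples are partitioned and the presence of outliers, so its robustness is limited. The multi-model (ensemble) PLS method (EPLS) improves model robustness, but it cannot identify outliers among the samples. To improve both the prediction accuracy and the robustness of the model, this paper proposes a multi-model PLS method that resamples according to sampling probabilities, termed robust consensus PLS (RE-PLS). The method uses iteratively reweighted partial least squares (IRPLS) to compute the regression residuals of the samples and, from these, the sampling probability of each calibration sample. Training subsets are then selected according to these sampling probabilities to build multiple PLS models, and the predictions of all PLS models are averaged to give the final prediction. The method is applied to NIR spectral modelling of two different plant samples and compared with conventional PLS and EPLS. The results show that the method effectively avoids the influence of outliers in the calibration set on the model while improving prediction accuracy and robustness. For real tobacco samples with complex NIR spectra and many outliers, models built with plain PLS or EPLS do not predict well, whereas RE-PLS, with its particular advantages, is expected to find wide application in the quantitative analysis of such complex spectra.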

15.
Within the framework of nonlinear partial least squares (PLS), the quadratic PLS regression approach, involving both linear and quadratic terms in the criterion, is discussed. A new algorithm for the determination of the components is proposed, and its advantages over the original algorithm are outlined. The approach of analysis is illustrated on the basis of simulated and real data. Copyright © 2012 John Wiley & Sons, Ltd.

16.
The partial least squares class model (PLSCM) was recently proposed for multivariate quality control based on a partial least squares (PLS) regression procedure. This paper presents a case study of quality control of peanut oils based on mid‐infrared (MIR) spectroscopy and class models, focusing mainly on the following aspects: (i) to explain the meanings of PLSCM components and make comparisons between PLSCM and soft independent modeling of class analogy (SIMCA); (ii) to correct the estimation of the original PLSCM confidence interval by considering a nonzero intercept term for center estimation; (iii) to investigate the potential of MIR spectroscopy combined with class models for identifying peanut oils with low doping concentrations of other edible oils. It is demonstrated that PLSCM is actually different from the ordinary PLS procedure, but it estimates the class center and class dispersion in the framework of a latent variable projection model. While SIMCA projects the original variables onto a few dimensions explaining most of the data variances, PLSCM components consider simultaneously the explained variances and the compactness of samples belonging to the same class. The analysis results indicate that PLSCM is an intuitive and easy‐to‐use tool to tackle one‐class problems and has comparable performance with SIMCA. The advantages of PLSCM might be attributed to the great success and well‐established foundations of PLS. For PLSCM, the optimization of model complexity and estimation of decision region can be performed as in multivariate calibration routines. Copyright © 2011 John Wiley & Sons, Ltd.

17.
This paper proposes a novel calibration technique based on combining support vector regression with a digital band pass (DBP) filter for the quantitative analysis of near‐infrared spectra. The efficacy of the proposed method is investigated and validated in the determination of glucose from near‐infrared spectra of a mixture composed of urea, triacetin and glucose. In this paper, the DBP filtering was implemented as a pre‐processing technique in the frequency domain as a Gaussian band pass filter and in the time domain as a Chebyshev filter. The grid‐search optimization method was used to optimize the filter parameters. The results demonstrate that utilization of the optimized DBP filters as a pre‐processing technique improved the performance of the predictive models. Copyright © 2013 John Wiley & Sons, Ltd.
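A frequency‐domain Gaussian band pass filter of the general kind described can be sketched as follows. The parameterisation by a centre frequency and a width in cycles per sample, the use of a real FFT, and the name `gaussian_bandpass` are assumptions for illustration; the paper's filter settings and its Chebyshev time‐domain variant are not reproduced.

```python
import numpy as np

def gaussian_bandpass(signal, center, width):
    """Apply a Gaussian band pass filter in the Fourier domain.

    center and width are in cycles per sample (0 to 0.5); they are the
    two parameters a grid search would tune. Hypothetical helper.
    """
    n = signal.size
    freqs = np.fft.rfftfreq(n)                         # 0 ... 0.5
    gain = np.exp(-0.5 * ((freqs - center) / width) ** 2)
    return np.fft.irfft(np.fft.rfft(signal) * gain, n=n)

# Usage: keep a slow component and suppress a fast one.
idx = np.arange(200)
slow = np.sin(2 * np.pi * 0.1 * idx)
fast = np.sin(2 * np.pi * 0.4 * idx)
filtered = gaussian_bandpass(slow + fast, center=0.1, width=0.02)
```

In a calibration workflow the filtered spectra would replace the raw spectra as regression inputs, with `center` and `width` chosen by grid search against a validation error.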

18.
Multiway principal component analysis (MPCA) has been extensively applied to batch process monitoring. In the case of monitoring a two‐stage batch process, the inter‐stage variation is neglected if MPCA models for each individual stage are used. On the other hand, if two stages of reference data are combined into a large dataset that MPCA is applied to, the dimensions of the unfolded matrix will increase dramatically. In addition, when an abnormal event is detected, it is difficult to identify which stage's operation induced this alarm. In this paper, partial least squares (PLS) is applied to monitor the inter‐stage relation of a two‐stage batch process. In post‐analysis of abnormalities, PLS can clarify whether root causes are from previous stage operations or due to the changes of the inter‐stage correlations. This approach was successfully applied to a semiconductor manufacturing process. Copyright © 2008 John Wiley & Sons, Ltd.

19.
Quantitative structure-activity relationship (QSAR) studies based on chemometric techniques are reviewed. Partial least squares (PLS) is introduced as a novel robust method to replace classical methods such as multiple linear regression (MLR). Advantages of PLS compared to MLR are illustrated with typical applications. The genetic algorithm (GA) is a novel optimization technique which can be used as a search engine in variable selection. A novel hybrid approach comprising GA and PLS for variable selection developed in our group (GAPLS) is described. The more advanced method for comparative molecular field analysis (CoMFA) modeling called GA-based region selection (GARGS) is described as well. Applications of GAPLS and GARGS to QSAR and 3D-QSAR problems are shown with some representative examples. GA can be hybridized with nonlinear modeling methods such as artificial neural networks (ANN) to provide useful tools in chemometrics and QSAR.

20.
Optimized sample-weighted partial least squares   Total citations: 2 (self-citations: 0, citations by others: 2)
Lu Xu 《Talanta》2007,71(2):561-566
In ordinary multivariate calibration methods, when the calibration set is determined to build the model describing the relationship between the dependent variables and the predictor variables, each sample in the calibration set makes the same contribution to the model, and the difference in representativeness between the samples is ignored. In this paper, by introducing the concept of weighted sampling into partial least squares (PLS), a new multivariate regression method, optimized sample-weighted PLS (OSWPLS), is proposed. OSWPLS differs from PLS in that it builds a new calibration set, where each sample in the original calibration set is weighted differently to account for its representativeness, in order to improve the prediction ability of the algorithm. A recently suggested global optimization algorithm, particle swarm optimization (PSO), is used to search for the best sample weights to optimize the calibration of the original training set and the prediction of an independent validation set. The proposed method is applied to two real data sets and compared with PLS. The most significant improvement is obtained for the meat data, where the root mean squared error of prediction (RMSEP) is reduced from 3.03 to 2.35. For the fuel data, OSWPLS also performs slightly better than, or no worse than, PLS for the prediction of the four analytes. The stability and efficiency of OSWPLS are also studied; the results demonstrate that the proposed method can obtain desirable results within a moderate number of PSO cycles.
