Similar literature
1.
This paper is about how to incorporate interaction effects in multi‐block methodologies. The method proposed is inspired by polynomial regression modelling in the case with only a few independent variables but extends/generalises the idea to situations where the blocks are potentially very large with respect to the number of variables. The method follows a so‐called type I sums of squares strategy where the linear effects (main effects) are incorporated sequentially and before the interactions. The sequential and orthogonalised partial least squares (SO‐PLS) technique is used as a basis for the proposal. The SO‐PLS method is based on sequential estimation of each new block by the PLS regression method after orthogonalisation with respect to blocks already fitted. The new method preserves the invariance already established for SO‐PLS and can be used for blocks with different dimensionality. The method is tested on one real data set with two independent blocks with different complexity and on a simulated data set with a large number of variables in each block. Copyright © 2011 John Wiley & Sons, Ltd.
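The sequential scheme described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the helper names (`pls1_scores`, `orthogonalise`, `so_pls_scores`) are hypothetical, a single response is assumed, and the interaction block is assumed to consist of all column-wise cross-products of the two main-effect blocks, which the abstract does not specify.

```python
import numpy as np

def pls1_scores(X, y, n_comp):
    """NIPALS PLS1 on already-centred X and y; returns the score matrix."""
    X, y = X.copy(), y.copy()
    T = np.empty((X.shape[0], n_comp))
    for a in range(n_comp):
        w = X.T @ y
        w /= np.linalg.norm(w)
        t = X @ w
        X -= np.outer(t, X.T @ t) / (t @ t)   # deflate X by the new score
        y -= t * (t @ y) / (t @ t)            # deflate y by the new score
        T[:, a] = t
    return T

def orthogonalise(X, T):
    """Remove from the columns of X everything explained by the scores T."""
    coef, *_ = np.linalg.lstsq(T, X, rcond=None)
    return X - T @ coef

def so_pls_scores(blocks, y, n_comps):
    """Fit the blocks one after another, each orthogonalised with respect
    to the scores of the blocks fitted before it."""
    y = y - y.mean()
    T_all = np.empty((len(y), 0))
    resid = y.copy()
    for X, a in zip(blocks, n_comps):
        Xc = X - X.mean(axis=0)
        if T_all.shape[1]:
            Xc = orthogonalise(Xc, T_all)
        T_all = np.hstack([T_all, pls1_scores(Xc, resid, a)])
        resid = orthogonalise(y[:, None], T_all)[:, 0]
    return T_all

# Simulated data (assumption): two main-effect blocks plus their cross-products.
rng = np.random.default_rng(0)
n = 50
X1 = rng.normal(size=(n, 4))
X2 = rng.normal(size=(n, 5))
X12 = np.einsum("ij,ik->ijk", X1, X2).reshape(n, -1)
y = X1[:, 0] + X2[:, 1] + 0.5 * X1[:, 0] * X2[:, 1] + 0.1 * rng.normal(size=n)

# Main effects enter first, the interaction block last (type I ordering).
T = so_pls_scores([X1, X2, X12], y, [2, 2, 2])
```

Because each block is orthogonalised before it is fitted, all score columns in `T` are mutually orthogonal, so the contribution of the interaction block can be read as what it adds beyond the main effects.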

2.
Partial Least Squares (PLS) is a wide class of regression methods aiming at modelling relationships between sets of observed variables by means of latent variables. Specifically, PLS2 was developed to correlate two blocks of data, the X‐block representing the independent or explanatory variables and the Y‐block representing the dependent or response variables. Lately, OPLS was introduced to further reduce model complexity by removing Y‐orthogonal sources of variation from X in the latent space, thus improving data interpretation through the generated predictive latent variables. Nevertheless, relationships between PLS2 and OPLS in the case of multiple Y‐responses have not yet been fully explored. With this perspective and taking inspiration from some basic mathematical properties of PLS2, we here present a novel and general approach consisting in a post‐transformation of PLS2 (ptPLS2), which results in a decomposition of the latent space into orthogonal and predictive components, while preserving the same goodness of fit and predictive ability of PLS2. Additionally, we discuss the application of the ptPLS2 approach to two metabolomic data sets extracted from earlier published studies and its advantages in model interpretation as compared with the ‘standard’ PLS approach. Copyright © 2016 John Wiley & Sons, Ltd.

3.
The topic of this paper is regression models based on designed experiments, where additional spectroscopic measurements are also available. This particular case describes a situation with two spectral blocks with no natural order: The blocks are parallel. Three methods are described, which combine least squares regression of the design variables with PCA or PLS on the spectra. The methods' properties are explored in two simulation studies based on real experiments. The results show that the methods are equal when it comes to prediction, but interpretability varies. One of the methods, LS‐ParPLS, is especially interesting when it comes to interpretability because it splits the spectral information into two parts: information that is common to both blocks and information that is unique to each block. Copyright © 2008 John Wiley & Sons, Ltd.

4.
The on‐line monitoring of batch processes based on principal component analysis (PCA) has been widely studied. Nonetheless, researchers have paid less attention to the on‐line application of partial least squares (PLS). In this paper, the influence of several issues on the predictive power of a PLS model for the on‐line estimation of key variables in a batch process is studied. Some of the conclusions can help to better understand the capabilities of the proposals presented for on‐line PCA‐based monitoring. Issues like the convenience of batch‐wise or variable‐wise unfolding, the method for the imputation of future measurements and the use of several sub‐models are addressed. This is the first time that the adaptive hierarchical (or multi‐block) approach is extended to PLS modelling. Also, the formulation of the so‐called trimmed scores regression (TSR), a powerful imputation method defined for PCA, is extended for its application with PLS modelling. Data from two processes, one simulated and one real, are used to illustrate the results. Copyright © 2008 John Wiley & Sons, Ltd.
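The two unfolding options mentioned in this abstract can be illustrated with a small array. This is only a sketch under the assumption that the batch data are stored as a three-way array with axes (batches, variables, time points); the function names are hypothetical.

```python
import numpy as np

def unfold_batch_wise(X):
    """(batches, variables, times) -> (batches, times * variables).
    One row per batch; columns run over all variables at time 0, then time 1, ..."""
    I, J, K = X.shape
    return X.transpose(0, 2, 1).reshape(I, K * J)

def unfold_variable_wise(X):
    """(batches, variables, times) -> (batches * times, variables).
    One row per (batch, time point) pair; columns are the process variables."""
    I, J, K = X.shape
    return X.transpose(0, 2, 1).reshape(I * K, J)

# Toy example: 2 batches, 3 variables, 4 time points.
X = np.arange(2 * 3 * 4).reshape(2, 3, 4)
Xb = unfold_batch_wise(X)     # shape (2, 12)
Xv = unfold_variable_wise(X)  # shape (8, 3)
```

Batch-wise unfolding keeps the whole trajectory of a batch in a single row, which is why an on-line model built on it needs some imputation of the measurements that have not yet been taken; variable-wise unfolding avoids that at the cost of treating every time point as a separate observation.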

5.
6.
The particle size distribution of a solid product can be a crucial parameter for its application to different kinds of processes. The influence of particle size on near infrared (NIR) spectra has been used to develop effective alternatives to traditional methods for determining this parameter. In this work, we used the chemometric techniques partial least squares 2 (PLS2) and artificial neural networks (ANNs) to predict several variables simultaneously for the rapid construction of particle size distribution curves. The PLS2 algorithm relies on linear relations between variables, while the ANN technique can model non-linear systems. Samples were passed through sieves with different openings in order to separate several size fractions that were used to construct two types of particle size distribution curves. The samples were recorded by NIR and their spectra were used with PLS2 and ANN to develop two calibration models for each. The correlation coefficients and relative standard errors of prediction (RSEP) have been used to assess the goodness of fit and accuracy of the results. The four calibration models studied provided statistically identical results based on RSEP values. Therefore, the combined use of NIR spectroscopy and PLS2 or ANN calibration models allows particle size distributions to be determined accurately. The results obtained by ANN or PLS2 are statistically similar.

7.
After showing that plain covariance or correlation‐based criteria are generally not suitable to deal with multiple‐block component models in an exploratory framework, we propose an extended criterion: multiple co‐structure (MCS). MCS combines the goodness‐of‐fit indicator of the component model with a flexible measure of structural relevance of the components. Thus, it allows tracking various kinds of interpretable structures within the data, on top of variance–maximizing components: variable‐bundles, components close to satisfying relevant structural constraints, and so on. MCS is to be maximised under unit‐norm constraints on coefficient‐vectors. We provide a dedicated ascent algorithm for it. This algorithm is nested into a more general one, named THEME (thematic equation model explorator), which calculates several components per data‐array and extracts nested structural component models. The method is tested on simulated data and applied to physicochemical data. Copyright © 2015 John Wiley & Sons, Ltd.

8.

When X and Y are multivariate, the two-block partial least squares (PLS) method is often used. In this paper, we outline an extension addressing a special case of the three-block (X/Y/Z) problem, where Z sits "under" Y. We have called this approach three-block bi-focal PLS (3BIF-PLS). It views the X/Y relationship as the dominant problem, and seeks to use the additional information in Z in order to improve the interpretation of the Y-part of the X/Y association. Two data sets are used to illustrate 3BIF-PLS. Example I relates to single point mutants of haloalkane dehalogenase from Sphingomonas paucimobilis UT26 and their ability to transform halogenated hydrocarbons, some of which are found as organic pollutants in soil. Example II deals with soil remediation capability of bacteria. Whole bacterial communities are monitored over time using "DNA-fingerprinting" technology to see how pollution affects population composition. Since the data sets are large, hierarchical multivariate modelling is invoked to compress data prior to 3BIF-PLS analysis. It is concluded that the 3BIF-PLS approach works well. The paper contains a discussion of pros and cons of the method, and hints at further developmental opportunities.

9.
《Analytica chimica acta》2002,452(2):311-319
The characterisation of adsorption or impregnation processes using conventional or supercritical fluid technologies is becoming an increasingly important part of the research on drug formulations. The complexity of the relationships between these adsorption processes and the experimental variables potentially influencing them, however, makes these studies more problematic. In this paper, a chemometric approach based on nonlinear partial least squares (NL-PLS) modelling is applied to characterise the effect of the experimental variables on the supercritical impregnation process. Various adsorbent materials such as silica gel, zeolite and amberlite were investigated using the following model compounds as adsorbates: benzoic, salicylic and acetylsalicylic acids.

10.
Plant‐wide process monitoring is challenging because of the complex relationships among numerous variables in modern industrial processes. The multi‐block process monitoring method is an efficient approach applied to plant‐wide processes. However, dividing the original space into subspaces remains an open issue. The loading matrix generated by principal component analysis (PCA) describes the correlation between original variables and extracted components and reveals the internal relations within the plant‐wide process. Thus, a multi‐block PCA method that constructs principal component (PC) sub‐blocks according to the generalized Dice coefficient of the loading matrix is proposed. The PCs corresponding to similar loading vectors are assigned to the same sub‐block. Thus, the PCs in the same sub‐block share similar variational behavior for certain faults. This behavior improves the sensitivity of process monitoring in the sub‐block. A monitoring statistic T² corresponding to each sub‐block is produced and is integrated into the final probability index based on Bayesian inference. A corresponding contribution plot is also developed to identify the root cause. The superiority of the proposed method is demonstrated by two case studies: a numerical example and the Tennessee Eastman benchmark. Comparisons with other PCA‐based methods are also provided. Copyright © 2014 John Wiley & Sons, Ltd.
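A per-sub-block T² statistic of the kind mentioned in this abstract could be computed along the following lines. This sketch covers only the standard Hotelling T² on PCA scores restricted to a subset of PCs. The grouping in `sub_blocks` is a hypothetical placeholder, since the Dice-coefficient grouping and the Bayesian fusion step are not detailed in the abstract and are not reproduced here. The function names are illustrative.

```python
import numpy as np

def fit_pca(X, n_components):
    """Centre X and return (mean, loadings P, score variances lam)."""
    mean = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    P = Vt[:n_components].T
    lam = s[:n_components] ** 2 / (X.shape[0] - 1)
    return mean, P, lam

def block_t2(X_new, mean, P, lam, pc_idx):
    """Hotelling T^2 using only the PCs listed in pc_idx (one sub-block)."""
    T = (X_new - mean) @ P[:, pc_idx]
    return np.sum(T ** 2 / lam[pc_idx], axis=1)

# Simulated training data (assumption) and a made-up grouping of the PCs.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
mean, P, lam = fit_pca(X, n_components=3)
sub_blocks = [[0, 1], [2]]  # hypothetical sub-blocks of PC indices
t2_per_block = [block_t2(X, mean, P, lam, idx) for idx in sub_blocks]
```

Since the PCs are uncorrelated, the sub-block statistics add up to the ordinary T² over all retained PCs; splitting them simply lets a fault that affects only a few PCs stand out against a smaller reference distribution.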

11.
12.
Extension of standard regression to the case of multiple regressor arrays is given via the Kronecker product. The method is illustrated using ordinary least squares regression (OLS) as well as the latent variable (LV) methods principal component regression (PCR) and partial least squares regression (PLS). Denoting the method applied to PLS as mrPLS, the latter was shown to explain as much or more variance for the first LV relative to the comparable L‐partial least squares regression (L‐PLS) model. The same relationship holds when mrPLS is compared to PLS or n‐way partial least squares (N‐PLS) and the response array is 2‐way or 3‐way, respectively, where the regressor array corresponding to the first mode of the response array is 2‐way and the second mode regressor array is an identity matrix. In a comparison with N‐PLS using fragrance data, mrPLS proved superior in a validation sense when model selection was used. Though the focus is on 2‐way regressor arrays, the method can be applied to n‐way regressors via N‐PLS. Copyright © 2007 John Wiley & Sons, Ltd.
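One common way to write a regression with a regressor matrix on each mode of a two-way response uses the identity vec(X B Zᵀ) = (Z ⊗ X) vec(B), with vec stacking columns. The OLS sketch below illustrates that identity on simulated data. Whether it matches the exact formulation in the paper is an assumption, and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, m, q = 8, 3, 6, 2
X = rng.normal(size=(n, p))       # regressors for the first mode of Y
Z = rng.normal(size=(m, q))       # regressors for the second mode of Y
B_true = rng.normal(size=(p, q))
Y = X @ B_true @ Z.T              # noise-free two-way response, shape (n, m)

# vec(Y) = (Z kron X) vec(B), where vec stacks columns (Fortran order).
K = np.kron(Z, X)                 # shape (n * m, p * q)
y = Y.flatten(order="F")
b, *_ = np.linalg.lstsq(K, y, rcond=None)
B_hat = b.reshape((p, q), order="F")
```

Setting `Z` to an identity matrix reduces this to ordinary multi-response regression of `Y` on `X`, which is consistent with the special case the abstract mentions for the second-mode regressor array.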

13.
A novel projection modeling method for quantitative structure activity relationship (QSAR) and quantitative structure property relationship (QSPR) is developed in this paper. Orthogonalization of block variables is introduced to deal with the problem of variable selection. Projections based on least squares are used to construct the modeling space in order to search for the best regression directions for chemical modeling. A suitable prediction space for such a model is further defined to confine the usage range of the model. Three real data sets were analyzed to check the performance of the proposed modeling method. The results obtained from Monte‐Carlo cross‐validation (MCCV) showed that the proposed modeling method might provide better results for QSAR and QSPR modeling than PCR and PLS with respect to both fitting and prediction abilities. Copyright © 2007 John Wiley & Sons, Ltd.

14.
This paper proposes a new method for determining the subset of variables that reproduce as well as possible the main structural features of the complete data set. This method can be useful for pre-treatment of large data sets since it allows discarding variables that contain redundant information. Reducing the number of variables often allows one to better investigate data structure and obtain more stable results from multivariate modelling methods. The novel method is based on the recently proposed canonical measure of correlation (CMC index) between two sets of variables [R. Todeschini, V. Consonni, A. Manganaro, D. Ballabio, A. Mauri, Canonical Measure of Correlation (CMC) and Canonical Measure of Distance (CMD) between sets of data. Part 1. Theory and simple chemometric applications, Anal. Chim. Acta submitted for publication (2009)]. Following a stepwise procedure (backward elimination), each variable in turn is compared to all the other variables and the most correlated is definitively discarded. Finally, a key subset of variables that are as orthogonal as possible is selected. The performance was evaluated on both simulated and real data sets. The effectiveness of the novel method is discussed by comparison with results of other well known methods for variable reduction, such as Jolliffe techniques, McCabe criteria, Krzanowski approach and its modification based on genetic algorithms, loadings of the first principal component, Key Set Factor Analysis (KSFA), Variance Inflation Factor (VIF), pairwise correlation approach, and K correlation analysis (KIF). The obtained results are consistent with those of the other considered methods; moreover, the advantage of the proposed CMC method is that calculation is very quick and can be easily implemented in any software application.

15.
In Bayesian networks it is necessary to compute relationships between continuous nodes. The standard Bayesian network methodology represents this dependency with a linear regression model whose parameters are estimated by a maximum likelihood (ML) calculation. Partial least-squares (PLS) is proposed as an alternative method for computing the model parameters. This new hybrid method is termed PLS-Bayes, as it uses PLS to calculate regression vectors for a Bayesian network. This alternative approach requires storing the raw data matrix rather than sequentially updating sufficient statistics, but results in a regression matrix that predicts with higher accuracy, requires less training data, and performs well in large networks.

16.
Simultaneous determination of uranium and thorium using arsenazo III as a chromogenic reagent at pH 1.70 by H‐point standard addition method (HPSAM) and partial least squares (PLS) calibration is described. Under optimum conditions, the simultaneous determinations of uranium and thorium by HPSAM were performed. The absorbances at one pair of wavelengths, 649 and 669 nm, were monitored with the addition of standard solutions of uranium. The results of applying the HPSAM showed that uranium and thorium can be determined simultaneously with weight concentration ratios of uranium to thorium varying from 20:1 to 1:15 in the mixed sample. By multivariate calibration methods such as PLS, it is possible to obtain a model adjusted to the concentration values of the mixtures used in the calibration range. In this study, the calibration model is based on absorption spectra in the 600–750 nm range for 25 different mixtures of uranium and thorium. Calibration matrices contained 0.10–21.00 and 0.25–18.5 μg mL⁻¹ of uranium and thorium, respectively. The RMSEP for uranium and thorium were 0.7400 and 0.7276, respectively. Both proposed methods (HPSAM and PLS) were also successfully applied to the determination of uranium and thorium in several synthetic and real matrix samples.

17.
《Comptes Rendus Chimie》2018,21(12):1170-1178
The basic model for thermal spin crossover (SCO) is discussed in its microscopic and thermodynamic formulation. The more elaborate forms of the model, formulated over the course of almost 50 years, are briefly reviewed against the basic model, with emphasis on their additional features. A separate section is devoted to the newer developments in the field of modelling of SCO nanoparticles. The models are presented comparatively, to provide an accessible outline of the foundations of modern theoretical research on SCO and to make them easy to apply in the quantitative interpretation of experiments.

18.
Run to run (R2R) optimization based on unfolded Partial Least Squares (u‐PLS) is a promising approach for improving the performance of batch and fed‐batch processes as it is able to continuously adapt to changing processing conditions. Using this technique, the regression coefficients of PLS are used to modify the input profile of the process in order to optimize the yield. When this approach was initially proposed, it was observed that the optimization performed better when PLS was combined with a smoothing technique, in particular a sliding window filtering, which constrained the regression coefficients to be smooth. In the present paper, this result is further investigated and some modifications to the original approach are proposed. Also, the suitability of different smoothing techniques in combination with PLS is studied for both end‐of‐batch quality prediction and R2R optimization. The smoothing techniques considered in this paper include the original filtering approach, the introduction of smoothing constraints in the PLS calibration (Penalized PLS), and the use of functional analysis (Functional PLS). Two fed‐batch process simulators are used to illustrate the results. Copyright © 2015 John Wiley & Sons, Ltd.
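As a rough illustration of the sliding window idea, a vector of regression coefficients ordered along the time axis can be smoothed with a centred moving average. The exact filter used in the paper is not given in the abstract, so the moving average, the edge padding and the function name `smooth_coefficients` are all assumptions made for this sketch.

```python
import numpy as np

def smooth_coefficients(b, window=5):
    """Centred moving average over a coefficient profile.
    Edges are padded with the end values so the output keeps the input length."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    padded = np.pad(b, half, mode="edge")
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")

# Simulated noisy coefficient profile over 60 time points (assumption).
rng = np.random.default_rng(0)
b_raw = np.sin(np.linspace(0, np.pi, 60)) + 0.2 * rng.normal(size=60)
b_smooth = smooth_coefficients(b_raw, window=7)
```

A wider window gives a smoother input profile update but also blurs genuine features of the coefficient profile, so the window length acts as a tuning parameter.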

19.
In this work, we present a new approach to path modeling based on an extended multiple covariance criterion: system extended multiple covariance (SEMC). SEMC is suitable to measure the quality of any structural equations system. We show why SEMC may be preferred to criteria based on usual covariance of components and also to criteria based on residual sums of squares. We give a pursuit algorithm ensuring that SEMC increases and converges. When one wishes to extract more than one component per variable group, a problem arises of component hierarchy. To solve it, we define a local nesting principle of component models that makes the role of each component statistically clear. We then embed the pursuit algorithm in a more general algorithm that extracts sequences of locally nested models. We finally provide a component backward selection strategy. The technique is applied to cigarette data to model the generation of chemical compounds in smoke through tobacco combustion. Copyright © 2012 John Wiley & Sons, Ltd.

20.