Similar literature
 20 similar articles found (search time: 31 ms)
1.
When studying the principal component analysis (PCA) or partial least squares (PLS) modelling of batch process data, one realizes that there is a wide range of approaches. In many cases, new modelling approaches are presented just because they work properly for a particular application (for example, on‐line monitoring) and a given number of processes. A clear understanding of why these approaches perform successfully, and of their advantages and disadvantages relative to the others, is seldom supplied. Why does modelling after batch‐wise unfolding capture changing dynamics? What are the consequences of variable‐wise unfolding? Is there any best unfolding method? When should several models for a single process be used? In this paper, it is shown how these and other related questions can be answered by properly analyzing the dynamic covariance structures of the various approaches. Copyright © 2008 John Wiley & Sons, Ltd.
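The two unfoldings named in this abstract can be illustrated with a minimal sketch. It assumes the three-way batch array is stored as batches × variables × time points (I × J × K); that layout, the function names, and the column ordering are illustrative choices, not taken from the paper.

```python
import numpy as np

def unfold_batch_wise(X):
    """Unfold an (I, J, K) array into I rows of K*J columns.

    Each row holds one whole batch: the J variables at time 0,
    then the J variables at time 1, and so on.
    """
    I, J, K = X.shape
    return X.transpose(0, 2, 1).reshape(I, K * J)

def unfold_variable_wise(X):
    """Unfold an (I, J, K) array into I*K rows of J columns.

    Each row holds the J variables of one batch at one time point.
    """
    I, J, K = X.shape
    return X.transpose(0, 2, 1).reshape(I * K, J)

# Toy data: 2 batches, 3 variables, 4 time points.
X = np.arange(24).reshape(2, 3, 4)
bw = unfold_batch_wise(X)      # shape (2, 12)
vw = unfold_variable_wise(X)   # shape (8, 3)
```

In the batch-wise matrix every (variable, time) pair is its own column, so a PCA or PLS model fitted to it can carry a different covariance structure at each time point; the variable-wise matrix has only J columns, so a single covariance structure is shared across all time points.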

2.
Maximum likelihood principal component analysis (MLPCA) was originally proposed to incorporate measurement error variance information in principal component analysis (PCA) models. MLPCA can be used to fit PCA models in the presence of missing data, simply by assigning very large variances to the non‐measured values. An assessment of maximum likelihood missing data imputation is performed in this paper, analysing the algorithm of MLPCA and adapting several methods for PCA model building with missing data to its maximum likelihood version. In this way, known data regression (KDR), KDR with principal component regression (PCR), KDR with partial least squares regression (PLS) and trimmed scores regression (TSR) methods are implemented within the MLPCA method to work as different imputation steps. Six data sets are analysed using several percentages of missing data, comparing the performance of the original algorithm, and its adapted regression‐based methods, with other state‐of‐the‐art methods. Copyright © 2016 John Wiley & Sons, Ltd.  相似文献   

3.
In chemical and biochemical processes, steady‐state models are widely used for process assessment, control and optimisation. In these models, parameter adjustment requires data collected under nearly steady‐state conditions. Several approaches have been developed for steady‐state identification (SSID) in continuous processes, but no attempt has been made to adapt them to the singularities of batch processes. The main aim of this paper is to propose an automated method based on batch‐wise unfolding of the three‐way batch process data followed by a principal component analysis (Unfold‐PCA) in combination with the methodology of Brown and Rhinehart 2 for SSID. A second goal of this paper is to illustrate how by using Unfold‐PCA, process understanding can be gained from the batch‐to‐batch start‐ups and transitions data analysis. The potential of the proposed methodology is illustrated using historical data from a laboratory‐scale sequencing batch reactor (SBR) operated for enhanced biological phosphorus removal (EBPR). The results demonstrate that the proposed approach can be efficiently used to detect when the batches reach the steady‐state condition, to interpret the overall batch‐to‐batch process evolution and also to isolate the causes of changes between batches using contribution plots. Copyright © 2007 John Wiley & Sons, Ltd.  相似文献   

4.
Cross‐validation has become one of the principal methods to adjust the meta‐parameters in predictive models. Extensions of the cross‐validation idea have been proposed to select the number of components in principal components analysis (PCA). The element‐wise k‐fold (ekf) cross‐validation is among the most used algorithms for principal components analysis cross‐validation. This is the method programmed in the PLS_Toolbox, and it has been stated to outperform other methods under most circumstances in a numerical experiment. The ekf algorithm is based on missing data imputation, and it can be programmed using any method for this purpose. In this paper, the ekf algorithm with the simplest missing data imputation method, trimmed score imputation, is analyzed. A theoretical study is carried out to identify in which situations the application of ekf is adequate and, more importantly, in which situations it is not. The results presented show that the ekf method may be unable to assess the extent to which a model represents a test set and may lead to discarding principal components that carry important information. In a second paper of this series, other imputation methods are studied within the ekf algorithm. Copyright © 2012 John Wiley & Sons, Ltd.
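The imputation step mentioned in this abstract can be sketched as follows. This is a minimal illustration of trimmed score imputation as it is commonly described (missing entries of a mean-centred sample are set to zero before projecting onto the loadings); the function name and the toy loading matrix are assumptions, and the full ekf cross-validation loop from the paper is not reproduced.

```python
import numpy as np

def trimmed_score_impute(x, missing, P):
    """Fill the missing entries of one mean-centred sample.

    x       : 1-D array of length J (values at missing positions are ignored)
    missing : boolean mask of length J, True where the value is missing
    P       : (J, A) PCA loading matrix with orthonormal columns
    """
    x_trim = np.where(missing, 0.0, x)   # "trim": treat missing values as zero
    t = P.T @ x_trim                     # trimmed scores
    x_hat = P @ t                        # reconstruction from the model
    return np.where(missing, x_hat, x)   # keep observed values, fill the rest

# Toy one-component model with two variables (hypothetical numbers).
P = np.array([[0.6], [0.8]])
x = np.array([3.0, np.nan])
mask = np.array([False, True])
filled = trimmed_score_impute(x, mask, P)
```

Because the missing entry contributes zero to the score, the reconstruction is shrunk towards the mean, which is one reason the choice of imputation method matters inside ekf.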

5.
Run to run (R2R) optimization based on unfolded Partial Least Squares (u‐PLS) is a promising approach for improving the performance of batch and fed‐batch processes as it is able to continuously adapt to changing processing conditions. Using this technique, the regression coefficients of PLS are used to modify the input profile of the process in order to optimize the yield. When this approach was initially proposed, it was observed that the optimization performed better when PLS was combined with a smoothing technique, in particular a sliding window filtering, which constrained the regression coefficients to be smooth. In the present paper, this result is further investigated and some modifications to the original approach are proposed. Also, the suitability of different smoothing techniques in combination with PLS is studied for both end‐of‐batch quality prediction and R2R optimization. The smoothing techniques considered in this paper include the original filtering approach, the introduction of smoothing constraints in the PLS calibration (Penalized PLS), and the use of functional analysis (Functional PLS). Two fed‐batch process simulators are used to illustrate the results. Copyright © 2015 John Wiley & Sons, Ltd.  相似文献   

6.
This paper presents a new approach to path modelling, based on a sequential multi‐block modelling in latent variables. The approach is explorative and focused on interpretation. The method breaks with standard traditions of estimating all paths using one single modelling. Instead, one separate model is estimated for each endogenous block. Each separate model is constructed by stepwise use of the standard PLS regression on matrices that are orthogonalised with respect to each other. The advantages of the approach are that it can allow for different dimensionality within each block, it is invariant to relative weighting of the blocks and it is based on simple and standard methodology allowing for simple outlier detection, validation and interpretation. No convergence problems are involved and the method can be used for situations with many more variables than samples. An application based on sensory analysis of wines will be used to illustrate the method. Copyright © 2010 John Wiley & Sons, Ltd.  相似文献   
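The orthogonalisation step that this abstract relies on can be sketched in a few lines. The sketch assumes that a later block is orthogonalised against the score matrix obtained from an earlier block; the names `T1` and `X2` are illustrative, and the surrounding PLS fits and path structure of the published method are not shown.

```python
import numpy as np

def orthogonalise(X2, T1):
    """Remove from X2 everything that the columns of T1 can explain."""
    coef, *_ = np.linalg.lstsq(T1, X2, rcond=None)  # least-squares fit of X2 on T1
    return X2 - T1 @ coef                            # residual, orthogonal to T1

rng = np.random.default_rng(0)
T1 = rng.normal(size=(30, 2))    # stand-in for scores from an earlier block
X2 = rng.normal(size=(30, 5))    # stand-in for the next block
X2_orth = orthogonalise(X2, T1)
```

A standard PLS regression on `X2_orth` then only picks up information that the earlier block did not already account for.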

7.
It is well known that the predictions of the single response orthogonal projections to latent structures (OPLS) and the single response partial least squares regression (PLS1) are identical in the single‐response case. The present paper presents an approach to identification of the complete y ‐orthogonal structure by starting from the viewpoint of standard PLS1 regression. Three alternative non‐deflating OPLS algorithms and a modified principal component analysis (PCA)‐driven method (including MATLAB code) are presented. The first algorithm implements a postprocessing routine of the standard PLS1 solution where QR factorization applied to a shifted version of the non‐orthogonal scores is the key to express the OPLS solution. The second algorithm finds the OPLS model directly by an iterative procedure. By a rigorous mathematical argument, we explain that orthogonal filtering is a ‘built‐in’ property of the traditional PLS1 regression coefficients. Consequently, the capabilities of OPLS with respect to improving the predictions (also for new samples) compared with PLS1 are non‐existent. The PCA‐driven method is based on the fact that truncating off one dimension from the row subspace of X results in a matrix X orth with y ‐orthogonal columns and a rank of one less than the rank of X . The desired truncation corresponds exactly to the first X deflation step of Martens' non‐orthogonal PLS algorithm. The significant y ‐orthogonal structure of X found by PCA of X orth is split into two fundamental parts: one part that is significantly contributing to correct the first PLS score toward y and one part that is not. The third and final OPLS algorithm presented is a modification of Martens' non‐orthogonal algorithm into an efficient dual PLS1–OPLS algorithm. Copyright © 2014 John Wiley & Sons, Ltd.

8.
The insights from, and conclusions of, this paper motivate efficient and numerically robust ‘new’ variants of algorithms for solving the single response partial least squares regression (PLS1) problem. Prototype MATLAB code for these variants is included in the Appendix. The analysis of and conclusions regarding PLS1 modelling are based on a rich and nontrivial application of numerous key concepts from elementary linear algebra. The investigation starts with a simple analysis of the nonlinear iterative partial least squares (NIPALS) PLS1 algorithm variant computing orthonormal scores and weights. A rigorous interpretation of the squared P ‐loadings as the variable‐wise explained sum of squares is presented. We show that the orthonormal row‐subspace basis of W ‐weights can be found from a recurrence equation. Consequently, the NIPALS deflation steps of the centered predictor matrix can be replaced by a corresponding sequence of Gram–Schmidt steps that compute the orthonormal column‐subspace basis of T ‐scores from the associated non‐orthogonal scores. The transitions between the non‐orthogonal and orthonormal scores and weights (illustrated by an easy‐to‐grasp commutative diagram), respectively, are both given by QR factorizations of the non‐orthogonal matrices. The properties of singular value decomposition combined with the mappings between the alternative representations of the PLS1 ‘truncated’ X data (including P t W ) are taken to justify an invariance principle to distinguish between the PLS1 truncation alternatives. The fundamental orthogonal truncation of PLS1 is illustrated by a Lanczos bidiagonalization type of algorithm where the predictor matrix deflation is required to be different from the standard NIPALS deflation. A mathematical argument concluding the PLS1 inconsistency debate (published in 2009 in this journal) is also presented. Copyright © 2014 John Wiley & Sons, Ltd.
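For orientation, the textbook NIPALS PLS1 recursion that this abstract takes as its starting point can be sketched as below. This is the standard deflation-based algorithm with unit-length weights and unnormalised orthogonal scores, not any of the new variants proposed in the paper (those are given there as MATLAB code); the function name and the simulated data are illustrative.

```python
import numpy as np

def pls1_nipals(X, y, n_components):
    """Textbook NIPALS PLS1; returns regression coefficients and weights."""
    E = X - X.mean(axis=0)            # centred predictors, deflated in the loop
    yc = y - y.mean()                 # centred response
    n_vars = X.shape[1]
    W = np.zeros((n_vars, n_components))
    P = np.zeros((n_vars, n_components))
    q = np.zeros(n_components)
    for a in range(n_components):
        w = E.T @ yc
        w /= np.linalg.norm(w)        # unit-length weight vector
        t = E @ w                     # score vector
        tt = t @ t
        p = E.T @ t / tt              # X-loading
        q[a] = yc @ t / tt            # y-loading
        E = E - np.outer(t, p)        # deflate the predictor matrix
        W[:, a], P[:, a] = w, p
    b = W @ np.linalg.solve(P.T @ W, q)   # coefficients for centred data
    return b, W

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, 2.0, 3.0]) + 0.1 * rng.normal(size=20)
b, W = pls1_nipals(X, y, 3)
```

With as many components as predictors the coefficients coincide with ordinary least squares on the centred data, and the weight vectors come out mutually orthonormal, which is the property the abstract's recurrence for the W ‐weights builds on.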

9.
Freeze-drying or lyophilisation is a batch-wise industrial process used to remove water from solutions, hence stabilizing the solutes for distribution and storage. The objective of the present work was to outline a batch modelling approach to monitor a freeze-drying process in-line and in real-time using Raman spectroscopy. A 5% (w/v) d-mannitol solution was freeze-dried in this study as a model. Monitoring a freeze-drying process using Raman spectroscopy makes it possible to follow the product behaviour and some aspects of the process evolution by detecting the changes of the solutes and solvent occurring during the process. Real-time solid-state characterization of the final product is thereby also possible. The timely spectroscopic measurements allowed the differentiation between batches operated in normal process conditions and batches having deviations from the normal trajectory. Two strategies were employed to develop batch models: partial least squares (PLS) using the unfolded data and parallel factor analysis (PARAFAC). It was shown that both strategies were able to develop batch models using in-line Raman spectroscopy, allowing the evolution of new batches to be monitored in real-time. However, the computational effort required to develop the PLS model and to evaluate new batches using this model is significantly lower compared to the PARAFAC model. Moreover, PLS scores in the time mode can be computed for new batches, while using PARAFAC only the batch mode scores can be determined for new batches.

10.
Batch wise scale-up of Buchwald-Hartwig aminations under microwave irradiation has been investigated for the first time. Multi-mode (microSYNTH and MARS) (several vessels irradiated in parallel per batch) as well as single-mode (Discover) (one vessel irradiated per batch) platforms can be successfully used for this purpose with trifluoromethylbenzene (benzotrifluoride: BTF) as amination solvent. The obtained yields indicate a direct scalability in BTF for all the studied aminations. The Voyager equipment (based on a Discover platform) is the most convenient system since it allows an automatic continuous batch wise production without the necessity to manually load and unload reaction vessels.  相似文献   

11.
The performance of Partial Least Squares regression (PLS) in predicting the output with multivariate cross‐ and autocorrelated data is studied. With many correlated predictors of varying importance PLS does not always predict well and we propose a modified algorithm, Partitioned Partial Least Squares (PPLS). In PPLS the predictors are partitioned into smaller subgroups and the important subgroups with high prediction power are identified. Finally, regular PLS analysis using only those subgroups is performed. The proposed Partitioned PLS (PPLS) algorithm is used in the analysis of data from a real pharmaceutical batch fermentation process for which the process variables follow certain profiles during a specific fermentation period. We observed that PPLS leads to a more accurate prediction of the yield of the fermentation process and an easier interpretation, since fewer predictors are used in the final PLS prediction. In the application important issues such as alignment of the profiles from one batch to another and standardization of the predictors are also addressed. For instance, in PPLS noise magnification due to standardization does not seem to create problems as it might in regular PLS. Finally, PPLS is compared to several recently proposed functional PLS and PCR methods and a genetic algorithm for variable selection. More specifically for a couple of publicly available data sets with near infrared spectra it is shown that overall PPLS has lower cross‐validated error than PLS, PCR and the functional modifications hereof, and is similar in performance to a more complex genetic algorithm. Copyright © 2011 John Wiley & Sons, Ltd.  相似文献   

12.
A new procedure has been developed for the classification and quantification of the adulteration of pure olive oil by soya oil, sunflower oil, corn oil, walnut oil and hazelnut oil. The study was based on a chemometric analysis of the near-infrared (NIR) spectra of olive-oil mixtures containing different adulterants. The adulteration of olive oil was carefully carried out gravimetrically in a 4 mm quartz cuvette, starting with pure olive oil in the cuvette first. NIR spectra of the 525 adulterated mixtures were measured in the region of 12,000-4000 cm⁻¹. The spectra were subjected batch-wise to multiplicative signal correction (MSC) before calculating the principal component analysis (PCA) models. The MSC-corrected data were subjected to Savitzky-Golay smoothing and a mean normalization procedure before developing partial least-squares (PLS) calibration models. The results revealed that the models predicted the adulterants corn oil, sunflower oil, soya oil, walnut oil and hazelnut oil in olive oil with error limits of ±0.57, ±1.32, ±0.96, ±0.56 and ±0.57% weight/weight, respectively. Furthermore, the PCA models were able to classify unknown adulterated olive oil mixtures with almost 100% certainty. Quantification of the adulterants was carried out using their respective PLS models within the same error limits as mentioned above.
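The multiplicative signal correction step mentioned in this abstract can be sketched as follows. This is the usual textbook form of MSC (each spectrum is regressed on a reference spectrum, then the fitted offset and slope are removed); using the mean spectrum as the reference and the synthetic spectra below are assumptions, since the abstract does not state these details.

```python
import numpy as np

def msc(spectra, reference=None):
    """Multiplicative signal correction of the rows of `spectra`.

    Each row is modelled as intercept + slope * reference and is
    corrected to (row - intercept) / slope.
    """
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, row in enumerate(spectra):
        slope, intercept = np.polyfit(ref, row, 1)   # degree-1 fit: [slope, intercept]
        corrected[i] = (row - intercept) / slope
    return corrected

# Synthetic spectra: one underlying shape with different offsets and gains.
base = np.linspace(1.0, 2.0, 50) ** 2
spectra = np.array([0.1 + 1.0 * base, -0.2 + 1.5 * base, 0.3 + 0.7 * base])
corrected = msc(spectra)
```

In this toy case the three spectra differ only by additive and multiplicative effects, so after correction they all collapse onto the mean spectrum.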

13.
This paper is about how to incorporate interaction effects in multi‐block methodologies. The method proposed is inspired by polynomial regression modelling in the case with only a few independent variables but extends/generalises the idea to situations where the blocks are potentially very large with respect to the number of variables. The method follows a so‐called type I sums of squares strategy where the linear effects (main effects) are incorporated sequentially and before the interactions. The sequential and orthogonalised partial least squares (SO‐PLS) technique is used as a basis for the proposal. The SO‐PLS method is based on sequential estimation of each new block by the PLS regression method after orthogonalisation with respect to blocks already fitted. The new method preserves the invariance already established for SO‐PLS and can be used for blocks with different dimensionality. The method is tested on one real data set with two independent blocks with different complexity and on a simulated data set with a large number of variables in each block. Copyright © 2011 John Wiley & Sons, Ltd.  相似文献   

14.
A new method of imputation for left‐censored datasets is reported. This method is evaluated by examining datasets in which the true values of the censored data are known so that the quality of the imputation can be assessed both visually and by means of cluster analysis. Its performance in retaining certain data structures on imputation is compared with that of three other imputation algorithms by using cluster analysis on the imputed data. It is found that the new imputation method benefits a subsequent model‐based cluster analysis performed on the left‐censored data. The stochastic nature of the imputations performed in the new method can provide multiple imputed sets from the same incomplete data. The analysis of these provides an estimate of the uncertainty of the cluster analysis. Results from clustering suggest that the imputation is robust, with smaller uncertainty than that obtained from other multiple imputation methods applied to the same data. In addition, the use of the new method avoids problems with ill‐conditioning of group covariances during imputation as well as in the subsequent clustering based on expectation–maximization. The strong imputation performance of the proposed method on simulated datasets becomes more apparent as the groups in the mixture models are increasingly overlapped. Results from real datasets suggest that the best performance occurs when the requirement of normality of each group is fulfilled, which is the main assumption of the new method. Copyright © 2013 John Wiley & Sons, Ltd.  相似文献   

15.
Advances in sensory systems have led to many industrial applications with large amounts of highly correlated data, particularly in chemical and pharmaceutical processes. With these correlated data sets, it becomes important to consider advanced modeling approaches built to deal with correlated inputs in order to understand the underlying sources of variability and how this variability will affect the final quality of the product. Additional to the correlated nature of the data sets, it is also common to find missing elements and noise in these data matrices. Latent variable regression methods such as partial least squares or projection to latent structures (PLS) have gained much attention in industry for their ability to handle ill‐conditioned matrices with missing elements. This feature of the PLS method is accomplished through the nonlinear iterative PLS (NIPALS) algorithm, with a simple modification to consider the missing data. Moreover, in expectation maximization PLS (EM‐PLS), imputed values are provided for missing data elements as initial estimates, conventional PLS is then applied to update these elements, and the process iterates to convergence. This study is the extension of previous work for principal component analysis (PCA), where we introduced nonlinear programming (NLP) as a means to estimate the parameters of the PCA model. Here, we focus on the parameters of a PLS model. As an alternative to modified NIPALS and EM‐PLS, this paper presents an efficient NLP‐based technique to find model parameters for PLS, where the desired properties of the parameters can be explicitly posed as constraints in the optimization problem of the proposed algorithm. We also present a number of simulation studies, where we compare effectiveness of the proposed algorithm with competing algorithms. Copyright © 2014 John Wiley & Sons, Ltd.  相似文献   

16.
Polyhedral oligomeric silsesquioxane (POSS) particles are among the smallest organosilica nano‐cage structures, with high multifunctionality, and show both organic and inorganic properties. Until now, poly(POSS) structures have been synthesized starting from a methacryl‐POSS monomer by a free‐radical mechanism, using batch‐wise methods that require sacrificial templates or multiple additional steps. This study introduces a novel one‐pot synthesis inside a continuous flow, double temperature zone microfluidic reactor where the methodology is based on dispersion polymerization. As a result, spherical monodisperse POSS microparticles were obtained and characterized to determine their morphology, surface chemical structure, and thermal behavior by SEM, FTIR, and TGA, respectively. These results are also compared with the outcomes of batch‐wise synthesis. © 2019 Wiley Periodicals, Inc. J. Polym. Sci., Part A: Polym. Chem. 2019, 57, 1396–1403

17.
The nearest shrunken centroid (NSC) Classifier is successfully applied for class prediction in a wide range of studies based on microarray data. The contribution from seemingly irrelevant variables to the classifier is minimized by the so‐called soft‐thresholding property of the approach. In this paper, we first show that for the two‐class prediction problem, the NSC Classifier is similar to a one‐component discriminant partial least squares (PLS) model with soft‐shrinkage of the loading weights. Then we introduce the soft‐threshold‐PLS (ST‐PLS) as a general discriminant‐PLS model with soft‐thresholding of the loading weights of multiple latent components. This method is especially suited for classification and variable selection when the number of variables is large compared to the number of samples, which is typical for gene expression data. A characteristic feature of ST‐PLS is the ability to identify important variables in multiple directions in the variable space. Both the ST‐PLS and the NSC classifiers are applied to four real data sets. The results indicate that ST‐PLS performs better than the shrunken centroid approach if there are several directions in the variable space which are important for classification, and there are strong dependencies between subsets of variables. Copyright © 2007 John Wiley & Sons, Ltd.  相似文献   
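The soft-thresholding operation that this abstract applies to the loading weights can be written in a few lines. The operator itself is standard; the threshold value and the example weights are illustrative, and how ST‐PLS chooses the threshold for each component is not shown.

```python
import numpy as np

def soft_threshold(w, delta):
    """Shrink every entry of w towards zero by delta; small entries become zero."""
    return np.sign(w) * np.maximum(np.abs(w) - delta, 0.0)

w = np.array([3.0, -1.0, 0.5, -2.5])
w_st = soft_threshold(w, 1.0)
```

Variables whose weights fall below the threshold drop out of the component entirely, which is how the approach combines classification with variable selection.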

18.
New multivariate calibration methods and other processes are being developed that require selection of multiple tuning parameter (penalty) values to form the final model. With one or more tuning parameters, using only one measure of model quality to select final tuning parameter values is not sufficient. Optimization of several model quality measures is challenging. Thus, three fusion ranking methods are investigated for simultaneous assessment of multiple measures of model quality for selecting tuning parameter values. One is a supervised learning fusion rule named sum of ranking differences (SRD). The other two are non-supervised learning processes based on the sum and median operations. The effect of the number of models evaluated on the three fusion rules is also evaluated using three procedures. One procedure uses all models from all possible combinations of the tuning parameters. To reduce the number of models evaluated, an iterative process (only applicable to SRD) is applied, and thresholding a model quality measure before applying the fusion rules is also used. A near infrared pharmaceutical data set requiring model updating is used to evaluate the three fusion rules. In this case, calibration of the primary conditions is for the active pharmaceutical ingredient (API) of tablets produced in a laboratory. The secondary conditions for calibration updating are for tablets produced in the full batch setting. Two model updating processes requiring selection of two unique tuning parameter values are studied. One is based on Tikhonov regularization (TR) and the other is a variation of partial least squares (PLS). The three fusion methods are shown to provide equivalent and acceptable results allowing automatic selection of the tuning parameter values. Best tuning parameter values are selected when model quality measures used with the fusion rules are for the small secondary sample set used to form the updated models. In this model updating situation, evaluation of all possible models, thresholding, and iterative SRD performed equivalently for the three fusion rules with TR, and PLS performed worse. While the application is model updating, the fusion processes are applicable to other situations requiring selection of multiple tuning parameter values.
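The two unsupervised fusion rules mentioned in this abstract (sum and median) can be illustrated with a small sketch. It assumes that each candidate model is ranked on every quality measure, with smaller values being better, and that the ranks are then fused; whether the paper fuses ranks or scaled raw values is not stated in the abstract, so this is an assumption, and SRD is not implemented here. The function names and the numbers in `M` are hypothetical.

```python
import numpy as np

def rank_columns(M):
    """Rank the rows of M within each column; rank 1 is the smallest value."""
    order = np.argsort(M, axis=0, kind="stable")
    ranks = np.empty_like(order)
    positions = np.arange(1, M.shape[0] + 1)
    for j in range(M.shape[1]):
        ranks[order[:, j], j] = positions
    return ranks

def fuse(M, how="sum"):
    """Return the index of the model with the best fused rank."""
    ranks = rank_columns(M)
    if how == "sum":
        score = ranks.sum(axis=1)
    else:
        score = np.median(ranks, axis=1)
    return int(np.argmin(score))

# Rows: candidate models (tuning-parameter pairs); columns: quality measures.
M = np.array([[0.5, 0.9, 0.6],
              [0.2, 0.3, 0.4],
              [0.8, 0.1, 0.9]])
best_sum = fuse(M, "sum")
best_median = fuse(M, "median")
```

Here the second model is never the worst on any measure and is best on two of them, so both rules select it even though the third model wins on one individual measure.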

19.
20.
This work explores a novel method for rearranging 1st order (one-way) infra-red (IR) and/or near infra-red (NIR) ordinary spectra into a representation suitable for multi-way modelling and analysis. The method is based on the fact that the fundamental IR absorption and the first, second, and consecutive overtones of NIR absorptions represent identical chemical information. It is therefore possible to rearrange these overtone regions of the vectors comprising an IR and NIR spectrum into a matrix where the fundamental, 1st, 2nd, and consecutive overtones of the spectrum are arranged as either rows or columns in a matrix, resulting in a true three-way tensor of data for several samples. This tensorization facilitates explorative analysis and modelling with multi-way methods, for example parallel factor analysis (PARAFAC), N-way partial least squares (N-PLS), and Tucker models. The vibrational overtone combination spectroscopy (VOCSY) arrangement is shown to benefit from the “order advantage”, producing more robust, stable, and interpretable models than, for example, the traditional PLS modelling method. The proposed method also opens the field of NIR for true peak decomposition—a feature unique to the method because the latent factors acquired using PARAFAC can represent pure spectral components whereas latent factors in principal component analysis (PCA) and PLS usually do not.  相似文献   
