An optimization‐based undeflated PLS (OUPLS) method to handle missing data in the training set |
| |
Authors: | Eranda Harinath Puwakkatiya‐Kankanamage,Salvador Garcí a‐Muñ oz,Lorenz T. Biegler |
| |
Abstract: | Advances in sensory systems have led to many industrial applications with large amounts of highly correlated data, particularly in chemical and pharmaceutical processes. With these correlated data sets, it becomes important to consider advanced modeling approaches built to deal with correlated inputs in order to understand the underlying sources of variability and how this variability will affect the final quality of the product. Additional to the correlated nature of the data sets, it is also common to find missing elements and noise in these data matrices. Latent variable regression methods such as partial least squares or projection to latent structures (PLS) have gained much attention in industry for their ability to handle ill‐conditioned matrices with missing elements. This feature of the PLS method is accomplished through the nonlinear iterative PLS (NIPALS) algorithm, with a simple modification to consider the missing data. Moreover, in expectation maximization PLS (EM‐PLS), imputed values are provided for missing data elements as initial estimates, conventional PLS is then applied to update these elements, and the process iterates to convergence. This study is the extension of previous work for principal component analysis (PCA), where we introduced nonlinear programming (NLP) as a means to estimate the parameters of the PCA model. Here, we focus on the parameters of a PLS model. As an alternative to modified NIPALS and EM‐PLS, this paper presents an efficient NLP‐based technique to find model parameters for PLS, where the desired properties of the parameters can be explicitly posed as constraints in the optimization problem of the proposed algorithm. We also present a number of simulation studies, where we compare effectiveness of the proposed algorithm with competing algorithms. Copyright © 2014 John Wiley & Sons, Ltd. |
| |
Keywords: | latent variables missing data PLS NIPALS nonlinear programming IPOPT |
|
|