On multivariate calibration with unlabeled data
Authors: Paman Gujral, Michael Amrhein, Rolf Ergon, Barry M. Wise, Dominique Bonvin
Abstract: In principal component regression (PCR) and partial least‐squares regression (PLSR), using unlabeled data in addition to labeled data helps stabilize the latent subspaces in the calibration step, which typically lowers the prediction error. A non‐sequential approach based on optimal filtering (OF) has been proposed in the literature for using unlabeled data in PLSR. In this work, a sequential version of the OF‐based PLSR and a PCA‐based PLSR (PLSR applied to PCA‐preprocessed data) are proposed. It is shown analytically that the sequential version of the OF‐based PLSR is equivalent to that of PCA‐based PLSR, which leads to a new interpretation of OF. Simulated and experimental data sets are used to illustrate both the usefulness and the pitfalls of using unlabeled data. Unlabeled data can replace labeled data to some extent, which yields an economic benefit. However, in the presence of drift, using unlabeled data can increase the prediction error relative to a model based on labeled data alone. Copyright © 2011 John Wiley & Sons, Ltd.
Keywords: multivariate calibration; semi‐supervised learning; unlabeled data; optimal filtering; drift
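
The abstract describes PCA‐based PLSR as PLSR applied to PCA‐preprocessed data, with unlabeled data helping to stabilize the latent subspace. The following is a minimal sketch of that idea, not the authors' algorithm. It assumes that the PCA loadings are estimated from the pooled labeled and unlabeled spectra and that a standard PLS1 (NIPALS) regression is then fitted on the labeled scores only. All function names, the simulated data, and the component counts are hypothetical choices made for illustration.

```python
import numpy as np


def pca_loadings(X_all, n_pc):
    """Return the column mean and the first n_pc PCA loadings of X_all."""
    mean = X_all.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_all - mean, full_matrices=False)
    return mean, Vt[:n_pc].T


def pls1_coefficients(T, y, n_lv):
    """PLS1 (NIPALS) regression vector for centered T and centered y."""
    T = T.astype(float).copy()
    y = y.astype(float).copy()
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = T.T @ y
        w /= np.linalg.norm(w)           # weight vector
        t = T @ w                        # score vector
        tt = t @ t
        p = T.T @ t / tt                 # loading vector
        c = (y @ t) / tt                 # inner regression coefficient
        T -= np.outer(t, p)              # deflate predictors
        y -= c * t                       # deflate response
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)


def fit_pca_plsr(X_lab, y_lab, X_unlab, n_pc, n_lv):
    """Fit PLS1 on PCA scores; the PCA step uses labeled + unlabeled rows."""
    x_mean, V = pca_loadings(np.vstack([X_lab, X_unlab]), n_pc)
    T = (X_lab - x_mean) @ V             # labeled scores in the pooled subspace
    t_mean, y_mean = T.mean(axis=0), y_lab.mean()
    b = pls1_coefficients(T - t_mean, y_lab - y_mean, n_lv)
    coef = V @ b                         # regression vector in the original space
    intercept = y_mean - x_mean @ coef - t_mean @ b
    return coef, intercept


def demo(seed=0):
    """Simulated example (assumed data model): two latent factors, 20 channels."""
    rng = np.random.default_rng(seed)
    loadings = rng.normal(size=(2, 20))

    def make(n):
        latent = rng.normal(size=(n, 2))
        X = latent @ loadings + 0.01 * rng.normal(size=(n, 20))
        y = latent @ np.array([1.0, -2.0])
        return X, y

    X_lab, y_lab = make(10)              # few labeled samples
    X_unlab, _ = make(200)               # many unlabeled samples (y discarded)
    X_test, y_test = make(100)
    coef, intercept = fit_pca_plsr(X_lab, y_lab, X_unlab, n_pc=2, n_lv=2)
    y_hat = X_test @ coef + intercept
    return float(np.sqrt(np.mean((y_hat - y_test) ** 2)))


if __name__ == "__main__":
    print("test RMSE:", demo())
```

In this sketch the unlabeled rows influence only the PCA subspace and the centering of the spectra, while the regression step sees the labeled samples alone. The simulation contains no drift; as the abstract cautions, pooling drifting unlabeled data in this way could instead increase the prediction error.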