On estimating model complexity and prediction errors in multivariate calibration: generalized resampling by random sample weighting (RSW)
Authors: L Xu, Q‐S Xu, M Yang, H‐Z Zhang, C‐B Cai, J‐H Jiang, H‐L Wu, R‐Q Yu
Abstract: The present paper focuses on determining the number of PLS components by using resampling methods such as cross validation (CV), Monte Carlo cross validation (MCCV) and bootstrapping (BS). To resample the training data, random non‐negative weights are assigned to the original training samples and a sample‐weighted PLS model is developed without much additional computational burden. Random weighting is a generalization of the traditional resampling methods and is expected to have a lower risk of producing an insufficient training set. For prediction, only the training samples with random weights less than a threshold value are selected, to ensure that the prediction samples have less influence on training. For complicated data, the optimal number of PLS components is often not unique or readily distinguished, and there might exist an optimal region of model complexity, so the distribution of prediction errors can be more useful than a single value of root mean squared error of prediction (RMSEP). Therefore, the distribution of prediction errors is estimated by repeated random sample weighting and used to determine model complexity. RSW is compared with its traditional counterparts (CV, MCCV and BS) and with a recently proposed randomization test method to demonstrate its usefulness. Copyright © 2010 John Wiley & Sons, Ltd.
Keywords: PLS model complexity; random sample weighting; cross validation; bootstrapping; randomization test
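
Illustrative sketch (not from the paper): the procedure outlined in the abstract can be approximated as below, using only the Python standard library. Several choices here are assumptions made for illustration, since the abstract does not specify them: the weights are drawn from an Exp(1) distribution, the sample‐weighted PLS model is a PLS1 (single response) NIPALS fit on weighted‐centered data with rows scaled by the square root of their weights, and the function names `draw_weights`, `fit_weighted_pls1`, `predict_all` and `rsw_rmsep` as well as the default `threshold` are hypothetical.

```python
import random
from math import sqrt

def draw_weights(n, rng):
    """Random non-negative sample weights. Exp(1) is an assumed choice."""
    return [rng.expovariate(1.0) for _ in range(n)]

def fit_weighted_pls1(X, y, w, n_comp):
    """Sample-weighted PLS1 via NIPALS (assumed formulation, see lead-in)."""
    n, p = len(X), len(X[0])
    sw = sum(w)
    x_mean = [sum(w[i] * X[i][j] for i in range(n)) / sw for j in range(p)]
    y_mean = sum(w[i] * y[i] for i in range(n)) / sw
    # Scale centered rows by sqrt(weight): ordinary PLS1 on the scaled
    # data then works with weighted covariances.
    E = [[sqrt(w[i]) * (X[i][j] - x_mean[j]) for j in range(p)] for i in range(n)]
    f = [sqrt(w[i]) * (y[i] - y_mean) for i in range(n)]
    comps = []
    for _ in range(n_comp):
        r = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sqrt(sum(v * v for v in r))
        if norm < 1e-12:      # nothing left to explain
            break
        r = [v / norm for v in r]
        t = [sum(E[i][j] * r[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        if tt < 1e-12:
            break
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((r, load, q))
    return x_mean, y_mean, comps

def predict_all(model, x):
    """Predictions for one sample using 1, 2, ... fitted components."""
    x_mean, y_mean, comps = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    yhat, out = y_mean, []
    for r, load, q in comps:
        t = sum(ej * rj for ej, rj in zip(e, r))
        yhat += q * t
        e = [ej - t * lj for ej, lj in zip(e, load)]
        out.append(yhat)
    return out

def rsw_rmsep(X, y, n_comp, n_rep=200, threshold=0.2, seed=0):
    """RMSEP values per number of components over repeated random weightings."""
    rng = random.Random(seed)
    rmsep = [[] for _ in range(n_comp)]
    for _ in range(n_rep):
        w = draw_weights(len(X), rng)
        # Low-weight samples have little influence on the fit, so they
        # serve as the prediction set for this repetition.
        test = [i for i, wi in enumerate(w) if wi < threshold]
        if not test:
            continue
        model = fit_weighted_pls1(X, y, w, n_comp)
        sq = [0.0] * n_comp
        for i in test:
            for a, yh in enumerate(predict_all(model, X[i])):
                sq[a] += (yh - y[i]) ** 2
        for a in range(len(model[2])):
            rmsep[a].append(sqrt(sq[a] / len(test)))
    return rmsep
```

Calling `rsw_rmsep(X, y, n_comp=5)` on a calibration set (rows of `X` as lists of floats, `y` as a list of floats) returns one list of RMSEP values per model size. Comparing those distributions, for example by their medians and spreads, is one way to look for a region of acceptable model complexity rather than a single minimum.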