A bootstrapping soft shrinkage approach for variable selection in chemical modeling |
| |
Authors: | Bai-Chuan Deng, Yong-Huan Yun, Dong-Sheng Cao, Yu-Long Yin, Wei-Ting Wang, Hong-Mei Lu, Qian-Yi Luo, Yi-Zeng Liang |
| |
Institution: | 1. College of Animal Science, South China Agricultural University, Guangzhou 510642, PR China; 2. School of Chemistry and Chemical Engineering, Central South University, Changsha 410083, PR China; 3. School of Pharmaceutical Sciences, Central South University, Changsha 410083, PR China; 4. Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha 410125, PR China |
| |
Abstract: | In this study, a new variable selection method called bootstrapping soft shrinkage (BOSS) is developed. It is derived from the ideas of weighted bootstrap sampling (WBS) and model population analysis (MPA). The weights of the variables are determined from the absolute values of their regression coefficients. WBS is applied according to these weights to generate sub-models, and MPA is used to analyze the sub-models and update the variable weights. The optimization procedure follows the rule of soft shrinkage, in which less important variables are not eliminated directly but are assigned smaller weights. The algorithm runs iteratively and terminates when the number of variables reaches one. The variable set with the lowest root mean squared error of cross-validation (RMSECV) is selected as optimal. The method was tested on three groups of near infrared (NIR) spectroscopic datasets: corn, diesel fuel and soy. Three high-performing variable selection methods, namely Monte Carlo uninformative variable elimination (MCUVE), competitive adaptive reweighted sampling (CARS) and genetic algorithm partial least squares (GA-PLS), are used for comparison. The results show that BOSS is promising, with improved prediction performance. The MATLAB code for implementing BOSS is freely available at http://www.mathworks.com/matlabcentral/fileexchange/52770-boss. |
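The loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation. Several details are assumptions made only for illustration: ordinary least squares stands in for partial least squares, the names `rmsecv` and `boss_sketch` and the parameters `n_submodels` and `best_ratio` are hypothetical, the weights are updated from the best 10% of sub-models, and the number of draws is forced to decrease by at least one per iteration so that the loop always ends.

```python
import numpy as np

def rmsecv(X, y, folds=5):
    """k-fold cross-validated RMSE of a least-squares fit (assumed stand-in for PLS)."""
    n = len(y)
    idx = np.arange(n)
    sq_err = 0.0
    for k in range(folds):
        test = idx[k::folds]
        train = np.setdiff1d(idx, test)
        A = np.column_stack([np.ones(train.size), X[train]])
        coef = np.linalg.lstsq(A, y[train], rcond=None)[0]
        pred = np.column_stack([np.ones(test.size), X[test]]) @ coef
        sq_err += np.sum((y[test] - pred) ** 2)
    return float(np.sqrt(sq_err / n))

def boss_sketch(X, y, n_submodels=200, best_ratio=0.1, seed=0):
    """Return (selected variable indices, their RMSECV). Illustrative only."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    weights = np.full(p, 1.0 / p)   # start from equal weights
    n_draws = p                     # number of bootstrap draws per sub-model
    best_vars, best_err = np.arange(p), np.inf
    while True:
        subsets, errors = [], []
        for _ in range(n_submodels):
            # Weighted bootstrap sampling: draw with replacement, keep unique variables.
            drawn = rng.choice(p, size=n_draws, replace=True, p=weights)
            chosen = np.unique(drawn)
            subsets.append(chosen)
            errors.append(rmsecv(X[:, chosen], y))
        order = np.argsort(errors)
        if errors[order[0]] < best_err:
            best_err, best_vars = errors[order[0]], subsets[order[0]]
        if n_draws == 1:            # stop once sub-models hold a single variable
            break
        # Model population analysis: inspect the best sub-models only.
        top = order[: max(1, int(best_ratio * n_submodels))]
        new_w = np.zeros(p)
        for i in top:
            v = subsets[i]
            A = np.column_stack([np.ones(len(y)), X[:, v]])
            coef = np.linalg.lstsq(A, y, rcond=None)[0][1:]  # drop the intercept
            norm = np.linalg.norm(coef)
            if norm > 0:
                new_w[v] += np.abs(coef) / norm  # normalized absolute coefficients
        if new_w.sum() > 0:
            weights = new_w / new_w.sum()        # soft shrinkage: reweight, not delete
        mean_size = int(round(np.mean([subsets[i].size for i in top])))
        n_draws = max(1, min(n_draws - 1, mean_size))  # assumed strict decrease
    return best_vars, best_err

# Synthetic example: only variables 2 and 7 carry signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 12))
y = 3.0 * X[:, 2] - 2.0 * X[:, 7] + 0.05 * rng.normal(size=40)
selected, err = boss_sketch(X, y, n_submodels=60)
print(selected, err)
```

Because each sub-model keeps only the unique variables from its bootstrap draw, the sub-models shrink gradually from one iteration to the next, while a variable with a small weight can still be drawn later. For real spectra, a PLS regression would replace the least-squares fit used here.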
| |
Keywords: | Variable selection; Model population analysis; Weighted bootstrap sampling; Soft shrinkage; Partial least squares |
This article is indexed in ScienceDirect and other databases. |
|