A strategy that iteratively retains informative variables for selecting optimal variable subset in multivariate calibration |
| |
Authors: | Yong-Huan Yun, Wei-Ting Wang, Min-Li Tan, Yi-Zeng Liang, Hong-Dong Li, Dong-Sheng Cao, Hong-Mei Lu, Qing-Song Xu |
| |
Affiliation: | 1. College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, PR China; 2. College of Pharmaceutical Sciences, Central South University, Changsha 410083, PR China; 3. School of Mathematics and Statistics, Central South University, Changsha 410083, PR China |
| |
Abstract: | With the high dimensionality of modern datasets, developing effective methods that can select an optimal variable subset is a great challenge. In this study, a strategy called iteratively retaining informative variables (IRIV) is proposed, which considers the possible interaction effects among variables through random combinations. The variables are classified into four categories: strongly informative, weakly informative, uninformative and interfering. On this basis, IRIV retains both the strongly and weakly informative variables in each iterative round until no uninformative or interfering variables remain. Three datasets were employed to investigate the performance of IRIV coupled with partial least squares (PLS). The results show that IRIV is a good alternative variable selection strategy when compared with three outstanding and frequently used variable selection methods: genetic algorithm-PLS, Monte Carlo uninformative variable elimination by PLS (MC-UVE-PLS) and competitive adaptive reweighted sampling (CARS). The MATLAB source code of IRIV can be freely downloaded for academic research at: http://code.google.com/p/multivariate-calibration/downloads/list. |
| |
Keywords: | Variable selection; Informative variables; Partial least squares; Iteratively retaining informative variables; Random combination |
This article is indexed in ScienceDirect and other databases. |
|
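The iterative scheme summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation, and several details are assumptions made for illustration only: ordinary least squares stands in for PLS to keep the example dependency-light, a paired t-like statistic with a hypothetical threshold `t_crit` stands in for whatever significance test the paper uses, and the function names (`cv_rmse`, `classify`, `iriv_sketch`) and category labels are invented here.

```python
import numpy as np

def cv_rmse(X, y, folds=5):
    """Cross-validated RMSE of a least-squares fit (a stand-in for PLS)."""
    n = len(y)
    idx = np.arange(n)
    sq_err = 0.0
    for k in range(folds):
        test = idx[k::folds]
        train = np.setdiff1d(idx, test)
        A = np.column_stack([np.ones(len(train)), X[train]])
        coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        B = np.column_stack([np.ones(len(test)), X[test]])
        sq_err += np.sum((y[test] - B @ coef) ** 2)
    return np.sqrt(sq_err / n)

def classify(X, y, n_combos=30, t_crit=2.0, seed=0):
    """Label each variable by comparing random combinations with and without it."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    mask = rng.random((n_combos, p)) < 0.5  # random variable combinations
    labels = []
    for j in range(p):
        diffs = np.empty(n_combos)
        for i in range(n_combos):
            with_j, without_j = mask[i].copy(), mask[i].copy()
            with_j[j], without_j[j] = True, False
            diffs[i] = cv_rmse(X[:, with_j], y) - cv_rmse(X[:, without_j], y)
        # Assumed significance rule: paired t-like statistic against t_crit.
        t = diffs.mean() / (diffs.std(ddof=1) / np.sqrt(n_combos) + 1e-12)
        helps, significant = diffs.mean() < 0, abs(t) > t_crit
        if helps:
            labels.append("strong" if significant else "weak")
        else:
            labels.append("interfering" if significant else "uninformative")
    return labels

def iriv_sketch(X, y, max_rounds=10, seed=0):
    """Keep strongly and weakly informative variables until nothing else remains."""
    keep = np.arange(X.shape[1])
    for r in range(max_rounds):
        labels = classify(X[:, keep], y, seed=seed + r)
        good = np.array([lab in ("strong", "weak") for lab in labels])
        if good.all() or not good.any():
            break
        keep = keep[good]
    return keep

# Synthetic example: only the first two of six variables drive the response.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 6))
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=60)
selected = iriv_sketch(X, y)
print("retained variable indices:", selected)
```

Because each variable is evaluated across many random combinations of the other variables, its label reflects its contribution in the presence of different partners rather than in isolation, which is how the sketch reflects the interaction effect the abstract refers to.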