Detecting “bad” regression models: multicriteria fitness functions in regression analysis
Authors: Roberto Todeschini, Viviana Consonni, Andrea Mauri, Manuela Pavan
Institution: Milano Chemometrics and QSAR Research Group, Department of Environmental Sciences, University of Milano-Bicocca, P.za della Scienza, 1, 20126 Milano, Italy
Abstract: Regression models that fit well but have no predictive ability are sometimes chance correlations and often show pathological features such as multicollinearity, overfitting, and the inclusion of noisy or spurious variables. This problem is well known and of the utmost importance. The present paper proposes criteria that a model must fulfil to be acceptable, the aim being to recognize pathological linear regression models. These criteria are designed to address the following problems:
model instability due to outliers and influential objects;
predictor multicollinearity;
redundancy in explanatory variables;
overfitting due to chance factors.
A multicriteria fitness function based on maximizing the Q² statistic subject to a set of tests is proposed here. This new fitness function can also be used in model searching by variable selection approaches in order to obtain a final optimal population of models. Computations on the Selwood data set are reported to illustrate the use of this multicriteria fitness function in model searching.
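The idea of maximizing Q² subject to a set of tests can be illustrated with a minimal Python sketch. The abstract does not specify the individual tests, so the two checks below (a pairwise-correlation limit for multicollinearity and a limit on the gap between R² and Q² for overfitting) are illustrative stand-ins, not the authors' criteria. The function names `q2_loo`, `r2` and `fitness` and the thresholds `max_abs_corr` and `max_gap` are likewise assumptions made for this example.

```python
import numpy as np


def _design(X):
    """Prepend an intercept column to the predictor matrix."""
    return np.column_stack([np.ones(X.shape[0]), X])


def r2(X, y):
    """Ordinary R^2 of a least-squares fit with intercept."""
    A = _design(X)
    coef = np.linalg.lstsq(A, y, rcond=None)[0]
    rss = np.sum((y - A @ coef) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    return 1.0 - rss / tss


def q2_loo(X, y):
    """Leave-one-out Q^2 = 1 - PRESS / TSS for ordinary least squares.

    Uses the identity e_loo_i = e_i / (1 - h_ii), where h_ii are the
    leverages (diagonal of the hat matrix), so no refitting is needed.
    Assumes every leverage is strictly below 1.
    """
    A = _design(X)
    H = A @ np.linalg.pinv(A)            # hat matrix
    resid = y - H @ y
    press = np.sum((resid / (1.0 - np.diag(H))) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    return 1.0 - press / tss


def fitness(X, y, max_abs_corr=0.95, max_gap=0.3):
    """Return Q^2_LOO if the model passes all checks, else -inf.

    Both checks are hypothetical placeholders for the paper's tests:
    1. multicollinearity: no pair of predictors may be correlated
       above max_abs_corr in absolute value;
    2. overfitting: R^2 - Q^2 may not exceed max_gap.
    """
    p = X.shape[1]
    if p > 1:
        corr = np.corrcoef(X, rowvar=False)
        if np.abs(corr[np.triu_indices(p, k=1)]).max() > max_abs_corr:
            return -np.inf
    q2 = q2_loo(X, y)
    if r2(X, y) - q2 > max_gap:
        return -np.inf
    return q2
```

In a variable selection search, each candidate subset of predictors would be scored with `fitness(X[:, subset], y)`. Rejected models receive negative infinity, so they can never enter the final population regardless of how well they fit.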
Keywords: Regression analysis; Multicriteria decision making; Variable selection; Selwood data set
This article is indexed in ScienceDirect and other databases.