GA-Ensemble: a genetic algorithm for robust ensembles |
| |
Authors: | Dong-Yop Oh, J. Brian Gray |
| |
Affiliation: | 1. Computer Information Systems and Quantitative Methods Department, The University of Texas-Pan American, Edinburg, TX, 78539-2999, USA; 2. Department of Information Systems, Statistics and Management Science, The University of Alabama, Tuscaloosa, AL, 35487-0226, USA |
| |
Abstract: | Many simple and complex methods have been developed to solve the classification problem. Boosting is one of the best-known techniques for improving the accuracy of classifiers. However, boosting is prone to overfitting with noisy data, and the final model is difficult to interpret. Some boosting methods, including AdaBoost, are also very sensitive to outliers. In this article, we propose a new method, GA-Ensemble, which directly solves for the set of weak classifiers and their associated weights using a genetic algorithm. The genetic algorithm uses a new penalized fitness function that limits the number of weak classifiers and controls the effects of outliers by maximizing an appropriately chosen $p$th percentile of margins. We compare the test set error rates of GA-Ensemble, AdaBoost, and GentleBoost (an outlier-resistant version of AdaBoost) using several artificial data sets and real-world data sets from the UC-Irvine Machine Learning Repository. GA-Ensemble is found to be more resistant to outliers and to yield simpler predictive models than AdaBoost and GentleBoost. |
| |
This article is indexed in SpringerLink and other databases. |