GA-Ensemble: a genetic algorithm for robust ensembles
Authors: Dong-Yop Oh, J. Brian Gray
Institution: 1. Computer Information Systems and Quantitative Methods Department, The University of Texas, Pan American, Edinburg, TX, 78539-2999, USA
2. Department of Information Systems, Statistics and Management Science, The University of Alabama, Tuscaloosa, AL, 35487-0226, USA
Abstract: Many methods, both simple and complex, have been developed to solve the classification problem. Boosting is one of the best-known techniques for improving the accuracy of classifiers. However, boosting is prone to overfitting on noisy data, and the final model is difficult to interpret. Some boosting methods, including AdaBoost, are also very sensitive to outliers. In this article we propose a new method, GA-Ensemble, which uses a genetic algorithm to solve directly for the set of weak classifiers and their associated weights. The genetic algorithm uses a new penalized fitness function that limits the number of weak classifiers and controls the effects of outliers by maximizing an appropriately chosen $p$th percentile of the margins. We compare the test set error rates of GA-Ensemble, AdaBoost, and GentleBoost (an outlier-resistant version of AdaBoost) on several artificial data sets and on real-world data sets from the UC Irvine Machine Learning Repository. GA-Ensemble is found to be more resistant to outliers than AdaBoost and GentleBoost, and it yields simpler predictive models.
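The abstract does not give the exact form of the penalized fitness function, so the following is only a minimal sketch of one plausible reading: score a candidate weight vector by the $p$th percentile of its normalized margins, minus a penalty proportional to the number of weak classifiers with nonzero weight. The names `margins` and `penalized_fitness`, the penalty coefficient `lam`, and the margin normalization are all assumptions made for illustration, not the authors' definitions.

```python
import numpy as np

def margins(H, y, w):
    """Normalized margins y_i * sum_j w_j * h_j(x_i) / sum_j |w_j|.

    H: (n_samples, n_classifiers) weak-classifier votes in {-1, +1}
    y: (n_samples,) true labels in {-1, +1}
    w: (n_classifiers,) ensemble weights
    """
    H = np.asarray(H, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    total = np.sum(np.abs(w))
    if total == 0:
        # An empty ensemble casts no votes, so every margin is zero.
        return np.zeros(len(y))
    return y * (H @ w) / total

def penalized_fitness(H, y, w, p=10.0, lam=0.01):
    """Hypothetical fitness: p-th percentile of margins minus a size penalty.

    p:   percentile in [0, 100]; a p above 0 lets the worst-margin
         points (possible outliers) be ignored.
    lam: assumed penalty per weak classifier with nonzero weight.
    """
    m = margins(H, y, w)
    n_active = int(np.count_nonzero(w))
    return float(np.percentile(m, p) - lam * n_active)
```

A genetic algorithm would treat each weight vector `w` as an individual and prefer those with higher `penalized_fitness`. Choosing `p` above zero means a few badly misclassified points do not dominate the objective, which is the intuition behind the outlier resistance claimed in the abstract.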
This article is indexed in SpringerLink and other databases.