Selecting informative rules with parallel genetic algorithm in classification problem |
| |
Authors: | Bikash Kanti Sarkar, Shib Sankar Sana |
| |
Affiliation: | a Department of Information Technology, B.I.T., Mesra, Ranchi-835 215, Jharkhand, India; b Department of Mathematics, Bhangar Mahavidyalaya (C.U.), Bhangar-743 502, 24-Pgs(S), W.B., India; c Department of Mathematics, Jadavpur University, Kolkata-32, India |
| |
Abstract: | Classification systems are important for decision making and have attracted the attention of many researchers. Traditional classifiers are usually either domain specific or produce unsatisfactory results on large classification problems and imbalanced data. Hence, genetic algorithms (GAs) have recently been combined with traditional classifiers to find useful knowledge for decision making. However, the main concerns with such GA-based systems are their limited coverage of the search space and the increase in computational cost as the population grows. In this paper, a rule-based knowledge discovery model is introduced that combines C4.5 (a decision-tree-based rule induction algorithm) with a new parallel genetic algorithm based on the idea of massive parallelism. The prime goal of the model is to produce a compact set of informative rules from any kind of classification problem. More specifically, the proposed model uses C4.5 as a base method to generate rules, which are then refined by the proposed parallel GA. The developed system is compared with pure C4.5 and with a hybrid system (C4.5 + sequential genetic algorithm) on six real-world benchmark data sets collected from the UCI (University of California at Irvine) machine learning repository. Experiments on these data sets validate the effectiveness of the new model. In particular, the results indicate that the model is powerful for high-volume data sets. |
| |
Keywords: | Classification; Accuracy; C4.5; Parallel genetic algorithm |
This article is indexed in ScienceDirect and other databases. |
|
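The abstract describes a two-stage pipeline: a base learner produces rules, and a parallel GA then refines them into a compact set. The sketch below is one possible illustration of that idea and is not the authors' algorithm. It assumes a hand-written toy rule list in place of C4.5 output, a bit-mask encoding of rule subsets, a fitness of accuracy minus a per-rule penalty, and an island-style GA with ring migration. All names (`RULES`, `fitness`, `island_ga`, the parameter values) are hypothetical, and the islands are evolved in a plain loop here rather than on separate processors.

```python
import random

# Hypothetical toy rules standing in for C4.5 output:
# (attribute index, threshold, predicted class); a rule fires when x[attr] > threshold.
RULES = [(0, 0.5, 1), (1, 0.5, 1), (0, 0.9, 1), (1, 0.1, 0), (0, 0.2, 0)]

def make_data(n, rng):
    """Synthetic data (an assumption): class 1 iff the first attribute exceeds 0.5."""
    data = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        data.append((x, 1 if x[0] > 0.5 else 0))
    return data

def predict(mask, x, default=0):
    """The first selected rule that fires decides the class."""
    for keep, (attr, thr, cls) in zip(mask, RULES):
        if keep and x[attr] > thr:
            return cls
    return default

def fitness(mask, data, penalty=0.01):
    """Accuracy minus a small penalty per retained rule, favouring compact sets."""
    correct = sum(predict(mask, x) == y for x, y in data)
    return correct / len(data) - penalty * sum(mask)

def evolve(pop, data, rng, mutation_rate=0.1):
    """One generation: elitism, tournament selection, one-point crossover, bit-flip mutation."""
    new_pop = [max(pop, key=lambda m: fitness(m, data))]
    while len(new_pop) < len(pop):
        p1 = max(rng.sample(pop, 2), key=lambda m: fitness(m, data))
        p2 = max(rng.sample(pop, 2), key=lambda m: fitness(m, data))
        cut = rng.randrange(1, len(RULES))
        child = p1[:cut] + p2[cut:]
        child = tuple(b ^ 1 if rng.random() < mutation_rate else b for b in child)
        new_pop.append(child)
    return new_pop

def island_ga(data, n_islands=4, pop_size=8, generations=30, migrate_every=5, seed=0):
    """Evolve several sub-populations and periodically migrate the best individuals."""
    rng = random.Random(seed)
    islands = [[tuple(rng.randint(0, 1) for _ in RULES) for _ in range(pop_size)]
               for _ in range(n_islands)]
    for gen in range(1, generations + 1):
        # Each island is independent here, so this step could run in parallel.
        islands = [evolve(pop, data, rng) for pop in islands]
        if gen % migrate_every == 0:
            bests = [max(pop, key=lambda m: fitness(m, data)) for pop in islands]
            for i, pop in enumerate(islands):
                # Ring migration: the worst individual is replaced by the neighbour's best.
                worst = min(range(len(pop)), key=lambda j: fitness(pop[j], data))
                pop[worst] = bests[(i - 1) % n_islands]
    return max((m for pop in islands for m in pop), key=lambda m: fitness(m, data))

if __name__ == "__main__":
    data = make_data(200, random.Random(1))
    best = island_ga(data)
    print("selected rules:", [r for keep, r in zip(best, RULES) if keep])
    print("fitness:", round(fitness(best, data), 3))
```

Because the islands only interact at migration points, the per-island `evolve` calls are the natural unit to distribute across workers. The penalty term is one simple way to express the "compact set" goal, and its value here is arbitrary.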