Reformulation of the support set selection problem in the logical analysis of data |
| |
Authors: | Renato Bruni |
| |
Affiliation: | (1) Università di Roma “La Sapienza”-D.I.S., Via M. Buonarroti 12, Roma, 00185, Italy |
| |
Abstract: | The paper addresses the problem of binary classification of data records, given an already classified training set of records. Among the various approaches to the problem, the methodology of the logical analysis of data (LAD) is considered. This approach is based on discrete mathematics, with special emphasis on Boolean functions. Enhancements to the standard LAD procedure, based on probability considerations, are presented. In particular, the problem of selecting the optimal support set is formulated as a weighted set covering problem. Testable statistical hypotheses are used. The accuracy of the modified LAD procedure is compared to that of the standard LAD procedure on datasets from the UCI repository. Encouraging results are obtained and discussed. |
| |
Keywords: | Classification; Data mining; Logical analysis of data; Massive data sets; Set covering |
This article is indexed in SpringerLink and other databases. |
|
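The abstract states that support set selection is formulated as a weighted set covering problem. As a rough illustration of that problem class only, the sketch below applies the textbook greedy heuristic to a toy instance. It is not the paper's formulation or solution method: the function name `greedy_weighted_set_cover`, the feature names `f1` to `f4`, the elements, and the weights are all hypothetical, and the weights are arbitrary numbers rather than the probability-based weights the abstract refers to.

```python
from typing import Dict, List, Set


def greedy_weighted_set_cover(
    universe: Set[int],
    subsets: Dict[str, Set[int]],
    weights: Dict[str, float],
) -> List[str]:
    """Greedy heuristic: repeatedly pick the subset with the lowest
    weight per newly covered element. Not guaranteed to be optimal."""
    uncovered = set(universe)
    chosen: List[str] = []
    while uncovered:
        best = None
        best_ratio = float("inf")
        # Iterate in sorted order so ties are broken deterministically.
        for name in sorted(subsets):
            gain = len(subsets[name] & uncovered)
            if gain == 0:
                continue
            ratio = weights[name] / gain
            if ratio < best_ratio:
                best, best_ratio = name, ratio
        if best is None:
            raise ValueError("universe cannot be covered by the given subsets")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


# Hypothetical toy instance (an assumption for illustration): elements 1..5
# stand for pairs of records that must be told apart, and f1..f4 stand for
# candidate binary features, each covering the pairs it distinguishes.
universe = {1, 2, 3, 4, 5}
subsets = {"f1": {1, 2, 3}, "f2": {2, 4}, "f3": {3, 4, 5}, "f4": {1, 5}}
weights = {"f1": 1.0, "f2": 1.0, "f3": 1.5, "f4": 2.0}
selected = greedy_weighted_set_cover(universe, subsets, weights)
```

In this toy reading, the selected features play the role of a support set: every element of `universe` is covered by at least one chosen feature. The greedy rule is only a heuristic; an exact solution would typically require an integer programming solver.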