Logical analysis of numerical data
Authors: Endre Boros, Peter L. Hammer, Toshihide Ibaraki, Alexander Kogan
Affiliation: (1) RUTCOR, Rutgers University, P.O. Box 5062, 08903 New Brunswick, NJ, USA; (2) Department of Applied Mathematics and Physics, Graduate School of Engineering, Kyoto University, 606 Kyoto, Japan; (3) Department of Accounting and Information Systems, Faculty of Management, Rutgers University, 07102 Newark, NJ, USA
Abstract: “Logical analysis of data” (LAD) is a methodology, under development since the late 1980s, aimed at discovering hidden structural information in data sets. LAD was originally developed for analyzing binary data using the theory of partially defined Boolean functions. LAD is extended to numerical data sets through “binarization”, which replaces each numerical variable with binary “indicator” variables, each showing whether the value of the original variable is above or below a certain level. Binarization has been applied successfully to the analysis of a variety of real-life data sets. This paper develops the theoretical foundations of the binarization process by studying the combinatorial optimization problems that arise in minimizing the number of binary variables. To provide an algorithmic framework for solving such problems in practice, we construct compact integer linear programming formulations of them. We develop polynomial-time algorithms for some of these minimization problems and prove that others are NP-hard. The authors gratefully acknowledge the partial support of the Office of Naval Research (grants N00014-92-J1375 and N00014-92-J4083).
Keywords: Data analysis, Boolean functions, Machine learning, Binarization, Set covering, Monotonicity, Thresholdness, Computational complexity
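
As an illustration of the binarization step described in the abstract, the following is a minimal Python sketch for a single numerical variable. It is not taken from the paper: the function names `candidate_cutpoints` and `binarize`, the sample data, and the rule for choosing levels (midpoints between consecutive distinct values whose class labels differ) are all assumptions made for this example.

```python
from typing import Dict, List, Sequence, Set


def candidate_cutpoints(values: Sequence[float], labels: Sequence[int]) -> List[float]:
    """Return candidate levels for one numerical variable.

    Assumption for this sketch: a level is placed at the midpoint between two
    consecutive distinct values only when their class labels are not all equal,
    since only such a level can separate a positive point from a negative one.
    """
    labels_by_value: Dict[float, Set[int]] = {}
    for v, y in zip(values, labels):
        labels_by_value.setdefault(v, set()).add(y)

    distinct = sorted(labels_by_value)
    cuts: List[float] = []
    for lo, hi in zip(distinct, distinct[1:]):
        if len(labels_by_value[lo] | labels_by_value[hi]) > 1:
            cuts.append((lo + hi) / 2)
    return cuts


def binarize(values: Sequence[float], cuts: Sequence[float]) -> List[List[int]]:
    """Replace each value by indicator bits: bit j is 1 when value >= cuts[j]."""
    return [[1 if v >= t else 0 for t in cuts] for v in values]


# Hypothetical sample: five observations with binary class labels.
values = [1.0, 2.0, 3.0, 5.0, 8.0]
labels = [0, 0, 1, 1, 0]

cuts = candidate_cutpoints(values, labels)
rows = binarize(values, cuts)

print(cuts)  # [2.5, 6.5]
print(rows)  # [[0, 0], [0, 0], [1, 0], [1, 0], [1, 1]]
```

In this sample, two indicator variables are enough to give every positive observation a different binary vector from every negative one. Choosing the smallest such set of levels across many variables is the kind of minimization problem the abstract refers to.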
This article is indexed in SpringerLink and other databases.