首页 | 本学科首页   官方微博 | 高级检索  
     检索      


Frequency-based views to pattern collections
Authors:Taneli Mielikäinen
Institution:HIIT Basic Research Unit, Department of Computer Science, University of Helsinki, P.O. Box 68 (Gustaf Hällströmin katu 2b), FIN-00014, Finland
Abstract:Finding interesting patterns from data is one of the most important problems in data mining and it has been studied actively for more than a decade. However, it is still largely open problem which patterns are interesting and which are not.The problem of detecting the interesting patterns (in a predefined class of patterns) has been attempted to solve by determining quality values for potentially interesting patterns and deciding a pattern to be interesting if its quality value (i.e., the interestingness of the pattern) is higher than a given threshold value. Again, it is very difficult to find a threshold value and a way to determine the quality values such that the collection of patterns with quality values greater than the threshold value would contain almost all truly interesting patterns and only few uninteresting ones.To enable more accurate characterization of interesting patterns, use of constraints to further prune the pattern collection has been proposed. However, most of the constrained pattern discovery research has been focused on structural constraints for the pattern collections and the patterns. We take a complementary approach and focus on constraining the quality values of the patterns.We propose quality value simplifications as a complementary approach to structural constraints on patterns. As a special case of the quality value simplifications, we consider discretizing the quality values. We analyze the worst-case error of certain discretization functions and give efficient discretization algorithms minimizing several loss functions. In addition to that, we show that the discretizations of the quality values can be used to obtain small approximate condensed representations for collections of interesting patterns. We evaluate the proposed condensation approach experimentally using frequent itemsets.
Keywords:Data mining  Pattern discovery  Condensed representations of pattern collections
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号