A Pattern Dictionary Method for Anomaly Detection
Authors: Elyas Sabeti, Sehong Oh, Peter X. K. Song, Alfred O. Hero
Affiliation: 1. Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI 48109, USA; 2. Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA; 3. Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
Abstract: In this paper, we propose a compression-based anomaly detection method for time series and sequence data using a pattern dictionary. The proposed method learns complex patterns in a training data sequence and uses these learned patterns to detect potentially anomalous patterns in a test data sequence. The pattern dictionary method uses a measure of the complexity of the test sequence as an anomaly score, which can be used to perform stand-alone anomaly detection. We also show that, when combined with a universal source coder, the pattern dictionary yields a powerful atypicality detector that is equally applicable to anomaly detection. The pattern dictionary-based atypicality detector uses an anomaly score defined as the difference between the complexity of the test sequence as encoded by the trained pattern dictionary (typical) encoder and its complexity as encoded by the universal (atypical) encoder. We consider two complexity measures: the number of parsed phrases in the sequence, and the length of the encoded sequence (codelength). Specializing to a particular universal encoder, the tree-structured Lempel–Ziv encoder (LZ78), we obtain a novel non-asymptotic upper bound, in terms of the Lambert W function, on the number of distinct phrases produced by the LZ78 parser. This non-asymptotic bound determines the range of the anomaly score. As a concrete application, we illustrate the pattern dictionary framework by constructing a baseline of health against which anomalous deviations can be detected.
Keywords: pattern dictionary, atypicality, Lempel–Ziv algorithm, lossless compression, anomaly detection
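
The first complexity measure named in the abstract, the number of parsed phrases, can be illustrated with the standard LZ78 incremental parsing rule, in which each new phrase is the shortest prefix of the remaining input that has not yet appeared as a phrase. The sketch below is a minimal textbook illustration and not the authors' implementation; the function names `lz78_phrases` and `phrase_count` are hypothetical, and the pattern dictionary encoder itself is not reproduced because the abstract does not specify it.

```python
def lz78_phrases(sequence):
    """Parse a string into LZ78 phrases (standard incremental parsing).

    Each phrase is the shortest prefix of the remaining input that is not
    already in the dictionary of previously seen phrases.
    """
    seen = set()      # phrases emitted so far
    phrases = []      # phrases in order of appearance
    current = ""      # phrase currently being extended
    for symbol in sequence:
        current += symbol
        if current not in seen:
            seen.add(current)
            phrases.append(current)
            current = ""
    if current:
        # Trailing symbols that only repeat an earlier phrase.
        phrases.append(current)
    return phrases


def phrase_count(sequence):
    """Number of parsed phrases, used here as a simple complexity measure."""
    return len(lz78_phrases(sequence))


if __name__ == "__main__":
    print(lz78_phrases("1011010100010"))
    # → ['1', '0', '11', '01', '010', '00', '10']
    print(phrase_count("1011010100010"))
    # → 7
```

A highly repetitive sequence parses into fewer, longer phrases than an irregular sequence of the same length, which is why the phrase count can serve as a complexity measure.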