Indexing text using the Ziv–Lempel trie |
| |
Authors: | Gonzalo Navarro |
| |
Institution: | Department of Computer Science, Univ. of Chile, Blanco Encalada 2120, Santiago, Chile |
| |
Abstract: | Let a text of u characters over an alphabet of size σ be compressible to n phrases by the LZ78 algorithm. We show how to build a data structure based on the Ziv–Lempel trie, called the LZ-index, that takes 4nlog2n(1+o(1)) bits of space (that is, 4 times the entropy of the text for ergodic sources) and reports the R occurrences of a pattern of length m in worst case time O(m3logσ+(m+R)logn). We present a practical implementation of the LZ-index, which is faster than current alternatives when we take into consideration the time to report the positions or text contexts of the occurrences found. |
| |
Keywords: | Author Keywords: Self-indexing Compressed text Compressed pattern matching Succinct data structures |
本文献已被 ScienceDirect 等数据库收录! |
|