首页 | 本学科首页   官方微博 | 高级检索  
     


Space Efficient Suffix Trees
Authors:J. Ian Munro   Venkatesh Raman  S. Srinivasa Rao
Affiliation:Department of Computer Science, University of Waterloo, Waterloo, Canada, N2L 3G1f1;b The Institute of Mathematical Sciences, Chennai, India, 600 113
Abstract:We give first the representation of a suffix tree that uses n lg n + O(n) bits of space and supports searching for a pattern string in the given text (from a fixed size alphabet) in O(m) time, where n is the size of the text and m is the length of the pattern. The structure is quite simple and answers a question raised by Muthukrishnan (in Proceedings of the FST and TCS Preconference Workshop on Randomization, 1997, pp. 23–27). Previous compact representations of suffix trees had either a higher lower order term in space and had some expectation assumption or required more time for searching. When the size of the alphabet k is not viewed as a constant, this structure can be modified to use the same space but take O(m lg k) time for string searching or to use an additional O(n lg k) bits but take the same O(m) time for searching. We then give several index structures for binary texts, with less space including
• a structure that uses a suffix array (left ceilinglg right ceiling bits) and an additional () bits,
• an indexing structure that takes bits of space, and
• an ( lg ) bit structure which answers in () time, the decision question of whether a given pattern of length occurs in the text.
Each of these structures uses a different technique, either in the storage scheme or in the search algorithm, in reducing the space requirement. The first one uses a suffix array, a sparse suffix tree, and a table structure. Finding all the occurrences of a pattern using this structure takes O(m + s) time, where s is the number of occurrences of the pattern in the text. The second structure constructs a sparse suffix tree for all the suffixes that start with the bit that occurs more times in the given binary text. The last structure uses an iterative algorithm to search for the pattern. This structure is the first o(n lg n) bit index to support the decision version of indexing queries in time linear in the length of the pattern. But this does not support the general indexing queries where we want to find the position of the occurrence of the pattern.Our main contribution is the development of techniques to use the succinct tree representation through balanced parentheses for suffix trees.
Keywords:
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号