Grammar Compressed Sequences with Rank/Select Support
Alberto Ordóñez, Gonzalo Navarro, and Nieves Brisaboa
Sequence representations supporting not only direct access to their symbols,
but also rank/select operations, are a fundamental building block in many
compressed data structures. Several recent applications need to represent
highly repetitive sequences, and classical statistical compression proves
ineffective. We introduce, instead, grammar-based representations for
repetitive sequences, which use up to 6% of the space needed by
statistically compressed representations, and support direct access and
rank/select operations within tens of microseconds. We demonstrate the impact
of our structures in text indexing applications.