A Compressed Text Index on Secondary Memory
Rodrigo González and Gonzalo Navarro
We introduce a practical disk-based compressed text index that, when the text
is compressible, takes much less space than the suffix array.
It provides good I/O times for searching, which in particular improve
when the text is compressible. In this respect our index is unique, as
most compressed indexes are slower than their classical counterparts on
secondary memory. We analyze our index and show experimentally that it is
extremely competitive on compressible texts. As a side contribution, we
introduce a simple encoding of sequences that achieves high-order compression
and provides constant-time random access, both in main and secondary memory.