LZ77-like Compression with Fast Random Access
Sebastián Kreft and Gonzalo Navarro
We introduce an alternative Lempel-Ziv text parsing, LZ-End, that converges
to the entropy and in practice gets very close to LZ77. LZ-End forces
sources to finish at the end of a previous phrase. Most Lempel-Ziv parsings
can decompress the text only from the beginning. LZ-End is the only parsing we
know of able of decompressing arbitrary phrases in optimal time, while staying
closely competitive with LZ77, especially on highly repetitive collections,
where LZ77 excells. Thus LZ-End is ideal as a compression format for highly
repetitive sequence databases, where access to individual sequences is
required,
and it also opens the door to compressed indexing schemes for such
collections.