Engineering Practical Lempel-Ziv Tries
Diego Arroyuelo, Rodrigo Cánovas, Johannes Fischer, Dominik
Köppl, Marvin Löbel, Gonzalo Navarro, and Rajeev Raman
The Lempel-Ziv 78 (LZ78) and Lempel-Ziv-Welch (LZW) text factorizations are popular, not
only for bare compression but also for building compressed data structures on
top of them.
Their regular factor structure makes them computable within space
bounded by the compressed output size. In this paper we carry out the first
thorough study of low-memory LZ78 and LZW text factorization algorithms,
introducing more efficient alternatives to the classical methods, as well as
new techniques that can run within less
memory space than the necessary to hold the compressed file. Our results build
on hash-based representations of tries that may have independent interest.