D-v2v: A Dynamic Variable-to-Variable Compressor
Nieves Brisaboa, Antonio Fariña, Adrián Gómez-Brandón, Gonzalo Navarro, and Tirso Rodeiro
We present D-v2v, a new dynamic (one-pass) variable-to-variable compressor.
Variable-to-variable compression combines a modeler that gathers variable-length input symbols with a variable-length statistical coder that assigns shorter codewords to the more frequent symbols.
In D-v2v, we process the input text word-wise to gather variable-length symbols that can be either terminals (new words) or non-terminals (subsequences of words seen before in the input text). These input symbols are stored in a vocabulary that is kept sorted by frequency, so they can be easily encoded with dense codes. D-v2v permits real-time transmission of data, i.e., compression and transmission can begin as soon as data become available.
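To illustrate why a frequency-sorted vocabulary makes dense coding straightforward, the sketch below maps a symbol's rank in the vocabulary to a byte-oriented codeword. It assumes the End-Tagged Dense Code variant, in which only the last byte of a codeword has its high bit set; the text above says only "dense codes", so the choice of variant, the 0-based ranks, and the function names `etdc_encode` and `etdc_decode` are assumptions made for illustration and not necessarily what D-v2v does.

```python
def etdc_encode(rank: int) -> bytes:
    """Map a 0-based vocabulary rank to an End-Tagged Dense Code codeword.

    Illustrative sketch only: bytes below 128 are continuation bytes,
    and the final byte is flagged by having its high bit set.
    """
    if rank < 0:
        raise ValueError("rank must be non-negative")
    out = [128 + rank % 128]          # last byte carries the end tag
    rank = rank // 128 - 1
    while rank >= 0:
        out.insert(0, rank % 128)     # continuation bytes, most significant first
        rank = rank // 128 - 1
    return bytes(out)


def etdc_decode(code: bytes) -> int:
    """Recover the 0-based rank from a single ETDC codeword."""
    rank = 0
    for b in code[:-1]:
        rank = (rank + b + 1) * 128
    return rank + code[-1] - 128


if __name__ == "__main__":
    for r in (0, 127, 128, 16511, 16512):
        print(r, list(etdc_encode(r)))
```

Under this scheme the 128 most frequent symbols get one-byte codewords and the next 128 × 128 get two-byte codewords, so the codeword depends only on the rank and no code tree has to be stored or transmitted. In a dynamic setting, the sender and receiver would each update the same ranking after every symbol.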
Our experiments show that D-v2v improves on the compression ratios of v2vDC, the state-of-the-art semi-static variable-to-variable compressor, and comes close to those of p7zip. It also achieves competitive performance in both compression and decompression.