In this paper, we present a new semistatic compressor, called Pair-Based End-Tagged Dense Code (PETDC). PETDC compresses English texts to 27-28%, overcoming the optimal 0-order prefix-free semistatic compressor (Plain Huffman) in more than 3 percentage points. Moreover, PETDC permits also random decompression, and direct searches using fast Boyer-Moore algorithms.
PETDC builds a vocabulary with both words and pairs of words. Since each symbol in the vocabulary is given a codeword, compression is improved by replacing two words of the source text by a unique codeword.