A Grammar Compression Algorithm based on Induced Suffix Sorting
Daniel Saad, Felipe Louza, Simon Gog, Mauricio Ayala-Rincón, and
Gonzalo Navarro
We introduce GCIS, a grammar compression algorithm based on the induced suffix
sorting
algorithm SAIS, presented by Nong et al. in 2009. Our solution builds on the
factorization
performed by SAIS during suffix sorting. We construct a context-free grammar
on the input
string which can be further reduced into a shorter string by substituting each
substring by
its corresponding factor. The resulting grammar is encoded by exploring some
redundancies,
such as common prefixes between suffix rules, which are sorted according to
SAIS framework.
When compared to well-known compression tools such as Re-Pair and 7-zip under
repetitive
sequences, our algorithm is faster at compressing and achieves compression
ratio close to
that of Re-Pair, at the cost of being the slowest at decompressing.