The Wavelet Matrix: An Efficient Wavelet Tree for Large Alphabets
Francisco Claude, Gonzalo Navarro, and Alberto Ordóñez
The wavelet tree is a flexible data structure that represents
sequences S[1,n] of symbols over an alphabet of size s in
compressed space while supporting a wide range of operations on S. When
s is significant compared to n, current wavelet tree representations
incur noticeable space or time overheads. In this article we introduce
the wavelet matrix, an alternative representation for large alphabets
that retains all the properties of wavelet trees but is significantly
faster. We also show how the wavelet matrix can be compressed down to the
zero-order entropy of the sequence without sacrificing, and in fact while
improving, its time performance.
Our experimental results show that the wavelet matrix outperforms all the
wavelet tree variants along the space/time tradeoff map.
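
To make the structure concrete, the following is a minimal, uncompressed sketch of a wavelet matrix in Python. It is an illustration under stated assumptions, not the implementation evaluated in the article: the class name WaveletMatrix, the 0-based indexing, and the use of plain prefix-sum arrays in place of succinct rank-capable bitvectors are all choices made here for brevity. Each level stores one bit of every symbol (most significant bit first), and between levels the sequence is stably partitioned so that symbols with a 0 bit precede those with a 1 bit.

```python
class WaveletMatrix:
    """Illustrative, uncompressed wavelet matrix over non-negative integers.

    Assumption: each level keeps a prefix-sum array of its bits, which gives
    constant-time rank but uses far more space than a succinct bitvector.
    """

    def __init__(self, seq, bits):
        self.n = len(seq)
        self.bits = bits      # number of levels, i.e. bits per symbol
        self.prefix = []      # prefix[l][i] = number of 1 bits in level l before position i
        self.zeros = []       # zeros[l] = total number of 0 bits in level l
        cur = list(seq)
        for level in range(bits):
            shift = bits - 1 - level
            row = [(x >> shift) & 1 for x in cur]
            pre = [0]
            for b in row:
                pre.append(pre[-1] + b)
            self.prefix.append(pre)
            self.zeros.append(self.n - pre[-1])
            # Stable partition: symbols whose bit is 0 go first, then those with 1.
            cur = ([x for x, b in zip(cur, row) if b == 0]
                   + [x for x, b in zip(cur, row) if b == 1])

    def _step(self, level, i, bit):
        """Map position i of this level to its position in the next level."""
        ones_before = self.prefix[level][i]
        if bit:
            return self.zeros[level] + ones_before
        return i - ones_before

    def access(self, i):
        """Return the symbol at 0-based position i of the original sequence."""
        sym = 0
        for level in range(self.bits):
            pre = self.prefix[level]
            bit = pre[i + 1] - pre[i]
            sym = (sym << 1) | bit
            i = self._step(level, i, bit)
        return sym

    def rank(self, c, i):
        """Count occurrences of symbol c among the first i symbols (seq[0:i])."""
        lo, hi = 0, i
        for level in range(self.bits):
            bit = (c >> (self.bits - 1 - level)) & 1
            lo = self._step(level, lo, bit)
            hi = self._step(level, hi, bit)
        return hi - lo


seq = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
wm = WaveletMatrix(seq, bits=4)
recovered = [wm.access(i) for i in range(len(seq))]
```

In this sketch both access and rank perform one position mapping per level, so their cost is proportional to the number of bits per symbol; a practical version would replace the prefix-sum arrays with bitvectors that support rank in little extra space.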