###
Succinct Suffix Arrays based on Run-Length Encoding.

####
Veli Mäkinen and Gonzalo Navarro

A succinct full-text self-index is a data structure built on a text
*T=t_1 t_2 ... t_n*, which takes little space (ideally close to that of the
compressed text), permits efficient search for the occurrences of a pattern
*P=p_1 p_2 ... p_m* in *T*, and is able to reproduce any text substring, so
the self-index replaces the text.
Several remarkable self-indexes have been developed in recent years. They
usually take space proportional to *n H_0* or *n H_k* bits, where *H_k* is the
*k*th order empirical entropy of *T*. The time to count how many times does
*P* occur in *T* ranges from *O(m)* to *O(m log n)*.
In this paper we present a new self-index, called RLFM index for "run-length
FM-index", that counts the occurrences of *P* in *T* in *O(m)* time when the
alphabet size is *sigma=O(polylog(n))*. The RLFM index requires
*n H_k log_2(sigma)+O(n)* bits of space for small *k*.
We then show how to implement the RLFM index in practice, and obtain in passing
another implementation with different space-time tradeoffs. We empirically
compare ours against the best existing implementations of other indexes and
show that ours are fastest among indexes taking less space than the text.