Range Majorities and Minorities in Arrays
Djamal Belazzougui, Travis Gagie, J. Ian Munro, Gonzalo Navarro, and
Yakov Nekrich
The problem of parameterized range
majority asks us to preprocess a string of length n such that, given
the endpoints of a range, one can quickly find all the distinct elements whose
relative frequencies in that range are more than a threshold t. This is
a more tractable version of the classical problem of finding the range mode,
which is unlikely to be solvable in polylogarithmic time and linear space.
In this paper we give the first linear-space solution with optimal O(1 /
t) query time, even when t can be specified with the query.
We then consider data structures whose space is bounded by the
entropy of the distribution of the symbols in the sequence.
For the case when the alphabet size s is polynomial on the computer word
size, we retain the optimal time within optimally compressed space (i.e., with
sublinear redundancy). Otherwise, either the compressed space is increased by
an arbitrarily small constant factor or the time rises to any function in
(1/t) * omega(1). We obtain the same results on the complementary problem of parameterized range minority.