Compressed Dynamic Range Majority and Minority Data Structures
Travis Gagie, Meng He, and Gonzalo Navarro
In the range a-majority query problem, we are given a sequence
S[1..n] and a fixed threshold a in (0,1), and are asked to
preprocess S such that, given a query range [i..j], we can
efficiently report the symbols that occur more than a (j-i+1) times in
S[i..j], which are called the range a-majorities. In this article
we describe the first compressed dynamic data structure for range
a-majority queries. It represents S in compressed space ---
n H_k + o(n lg s) bits for any k = o(lg_s n), where s is
the alphabet size and H_k ≤ H_0 ≤ lg s is the kth order
empirical entropy of S --- and answers queries in O((lg n) / (a
lg lg n)) time while supporting insertions and deletions in S in
O((lg n) / a) amortized time. We then show how to modify our data
structure to receive some b ≥ a at query time and report the range
b-majorities in O((lg n) / (b lg lg n)) time, without increasing
the asymptotic space or update-time bounds. The best previous dynamic solution
has the same query and update times as ours, but it occupies O(n) words
and cannot take advantage of being given a larger threshold b at query
time. We also design the first dynamic data structure for range
a-minority --- i.e., find a non-a-majority that occurs in a
range --- and obtain space and time bounds similar to those for
a-majorities. We extend the structure to find Theta(1/a)
a-minorities at the same space and time cost. By giving up updates, we
obtain static data structures with query time O((1/a) lg lg_w s)
for both problems, on a RAM with word size w = Omega(lg n) bits, without
increasing our space bound. Static alternatives reach time O(1/a),
but they compress S only to zeroth order entropy (H_0) or they
only handle small values of a, that is, lg(1/a) = o(lg s).