Compressed Dynamic Range Majority Data Structures

Travis Gagie, Meng He, and Gonzalo Navarro

In the range a-majority query problem, we preprocess a given sequence S[1..n] for a fixed threshold a in (0, 1], such that given a query range [i..j], the symbols that occur more than a (j-i+1) times in S[i..j] can be reported efficiently. We design the first compressed solution to this problem in dynamic settings. Our data structure represents S using n H_k+ o(n lg s) bits for any k = o(log_s n), where s is the alphabet size and H_k is the k-th order empirical entropy of S. It answers range a-majority queries in O((lg n)/(a lg lg n)) time, and supports insertions and deletions in O((lg n)/a) amortized time. The best previous solution has the same query and update times, but uses O(n) words.