Asymptotically Optimal Encodings of Range Data Structures for Selection and Top-k Queries

Roberto Grossi, John Iacono, Gonzalo Navarro, Rajeev Raman, and S. Rao Satti.

Given an array A[1,n] of elements with a total order, we consider the problem of building a data structure that solves two queries: (a) selection queries receive a range [i,j] and an integer k and return the position of the kth largest element in A[i,j]; (b) top-k queries receive [i,j] and k and return the positions of the k largest elements in A[i,j]. These problems can be solved in optimal time, O(1 + log k / log log n) and O(k), respectively, using linear-space data structures. We provide the first study of the encoding data structures for the above problems, where A cannot be accessed at query time. Several applications are interested in the relative order of the entries of A, and their positions, rather their actual values, and thus we do not need to keep A at query time. In those cases, encodings save storage space: we first show that any encoding answering such queries requires n lg k - O(n + k lg k) bits of space; then, we design encodings using O(n log k) bits, that is, asymptotically optimal up to constant factors, while preserving optimal query time.