We describe a data structure for the top-k document retrieval problem that uses O(log log n) bits per symbol on top of any compressed suffix array (CSA) of the document collection D, and supports queries in essentially optimal time, in the following sense. Given a CSA that uses |CSA| bits of space, finds the suffix array range of a query string P in time t_count, and accesses a suffix array entry in time t_SA, listing any k occurrences of P takes time O(t_count + k t_SA). Our top-k data structure uses |CSA| + O(n log log n) bits and reports the k most relevant documents containing P in time O(t_count + k (t_SA + log log n)). On every known CSA using O(n log s) bits, where s is the alphabet size, t_SA is Omega(log log n) in virtually all cases, so our time is O(t_count + k t_SA) in most situations.
When the query string P is sufficiently long, some CSAs reach time O(t_count + k) to list any k occurrences of P. Our structure also improves in this case, obtaining time O(t_count + t_sort(k,n)) on every known CSA, where t_sort(k,n) is the time needed to sort k integers in [1,n]. This is already O(t_count + k) in the typical regimes, k = O(polylog n) and k = Omega(n^eps) for any constant eps > 0. If the results may be delivered in arbitrary order of relevance, then the time for long patterns is always O(t_count + k), which is optimal with respect to the CSA, and reaches the RAM-optimal time O(|P| / log_s n + k) on a particular CSA. No previous top-k solution using o(n log d) bits of space, where d is the number of documents, had achieved this.