Tree Path Majority Data Structures
Travis Gagie, Meng He, Gonzalo Navarro, and Carlos Ochoa
We present the first solution to finding t-majorities on tree paths. Given a
tree of n nodes, each with a label from [1..s], and a fixed threshold
0<t<1, such a query gives two nodes u and v and asks for all the
labels that appear more than t |P_uv| times in the path
P_uv from u to v, where |P_uv| denotes the number of nodes in P_uv.
Note that the answer to any query is of size up to 1/t.
On a w-bit RAM, we obtain a linear-space data structure
with O((1/t) log log_w s) query time, which is
worst-case optimal for polylogarithmic-sized alphabets.
We also describe two succinct-space solutions with query time
O((1/t) log* n log log_w s).
One uses 2nH + 4n + o(n)(H+1) bits,
where H ≤ lg s is the entropy of the label distribution; the
other uses nH + O(n) + o(nH) bits.
By using just o(n log s) extra bits, our succinct structures allow
t to be specified at query time.
We obtain analogous results to find a t-minority, that is, an element
that appears between 1 and t |P_uv| times in P_uv.