Probabilistic Proximity Searching Algorithms Based on Compact Partitions
Benjamin Bustos and Gonzalo Navarro
The main bottleneck in metric space searching is the so-called
curse of dimensionality, which makes searching some metric spaces
intrinsically difficult, whatever algorithm is used. A recent trend to break
this bottleneck resorts to probabilistic algorithms, which have been shown
to find 99% of the relevant elements at a fraction of the cost of the exact
algorithm. Such algorithms are acceptable in most applications because resorting
to metric space searching already involves some fuzziness in the retrieval
requirements. In this paper we push
further in this direction by developing probabilistic algorithms on data
structures whose exact versions perform best in high dimensions. As a result,
we obtain probabilistic algorithms that outperform the previous ones. We
give new insights on the problem and propose a novel view based on
time-bounded searching. We also propose an experimental framework for
probabilistic algorithms that allows them to be compared offline.