Probabilistic Proximity Searching Algorithms Based on Compact Partitions

Benjamin Bustos and Gonzalo Navarro

The main bottleneck of research in metric space searching is the so-called curse of dimensionality, which makes searching some metric spaces intrinsically difficult, whatever algorithm is used. A recent trend to break this bottleneck resorts to probabilistic algorithms, for which it has been shown that one can find 99% of the elements at a fraction of the cost of the exact algorithm. These algorithms are welcome in most applications because resorting to metric space searching already involves some fuzziness in the retrieval requirements. In this paper we push further in this direction by developing probabilistic algorithms on data structures whose exact versions are the best for high dimensions. As a result, we obtain probabilistic algorithms that are better than the previous ones. We give new insights into the problem and propose a novel view based on time-bounded searching. We also propose an experimental framework for probabilistic algorithms that permits comparing them in offline mode.