Overcoming the Curse of Dimensionality

Edgar Chávez, José Luis Marroquín and Gonzalo Navarro

We study the behavior of pivot-based algorithms for similarity searching in metric spaces. We show that they are effective tools for intrinsically high-dimensional spaces, and that their performance is basically dependent on the number of pivots used and the precision used to store the distances. Our goal is to give a simple yet effective recipe for practitioners seeking for a black-box method to plug in their applications. In passing, we introduce a new method to reduce the overall CPU search time, independently of the number of distance evaluations.