Probabilistic Proximity Search: Fighting the Curse of Dimensionality in Metric Spaces.

Edgar Chávez and Gonzalo Navarro

Proximity searches become very difficult on "high dimensional" metric spaces, that is, those whose histogram of distances has a large mean and/or a small variance. This so-called "curse of dimensionality", well known in vector spaces, is also observed in metric spaces. The search complexity grows sharply with the dimension and with the search radius. We present a general probabilistic framework applicable to any search algorithm and whose net effect is to reduce the search radius. The higher the dimension, the more effective the technique. We illustrate empirically its practical performance on a particular class of algorithms, where large improvements in the search time are obtained at the cost of a very small error probability.