Using the k-Nearest Neighbor Graph for Proximity Searching in Metric Spaces

Rodrigo Paredes and Edgar Chávez

Proximity searching consists of retrieving from a database the objects that are close to a query. The most general model for this type of problem is the metric space, where proximity is defined in terms of a distance function. One solution is to build an index offline so that online queries can be answered quickly. The ultimate goal is to use as few distance computations as possible to answer queries, since the distance is considered expensive to compute. Proximity searching is central to several applications, ranging from multimedia indexing and querying to data compression and clustering.
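
To make the cost model concrete, the following minimal sketch shows the two query types answered by an index-free linear scan, with a wrapper that counts distance evaluations. It is only a baseline for illustration, not one of the algorithms of this paper; the names `CountingDistance`, `range_query` and `knn_query` are hypothetical, and the one-dimensional data with absolute difference as the distance is an assumed toy example.

```python
import heapq


class CountingDistance:
    """Wraps a distance function and counts how often it is evaluated."""

    def __init__(self, dist):
        self.dist = dist
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        return self.dist(a, b)


def range_query(db, q, r, dist):
    """Return every object of db within distance r of the query q."""
    return [u for u in db if dist(q, u) <= r]


def knn_query(db, q, k, dist):
    """Return the k objects of db closest to q, nearest first."""
    # One distance evaluation per database object (linear scan).
    pairs = [(dist(q, u), i) for i, u in enumerate(db)]
    return [db[i] for _, i in heapq.nsmallest(k, pairs)]


if __name__ == "__main__":
    # Toy metric space (assumption): real numbers under |a - b|.
    d = CountingDistance(lambda a, b: abs(a - b))
    db = [0.0, 1.0, 2.0, 5.0, 9.0]
    print(range_query(db, 1.5, 1.0, d))
    print(knn_query(db, 4.0, 2, d))
    print("distance evaluations:", d.calls)
```

Without an index, each query costs exactly one distance evaluation per database object; an index is useful to the extent that it lowers this count.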

In this paper we present a new approach to solve the proximity searching problem. Our solution is based on indexing the database with the k-nearest neighbor graph (kNNG), which is a directed graph connecting each element to its k closest neighbors.
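
As an illustration of the definition above, the sketch below builds a kNNG by brute force, comparing every pair of elements. This is an assumed, naive construction chosen for clarity, not necessarily the one used in this work; the function name `build_knng`, the adjacency representation and the toy data are hypothetical.

```python
import heapq


def build_knng(db, k, dist):
    """Brute-force k-nearest neighbor graph.

    Returns a dict mapping each element index to a list of
    (distance, neighbor_index) pairs, nearest first. Edges are
    directed: u pointing to v does not imply v pointing to u.
    """
    graph = {}
    for i, u in enumerate(db):
        # Distances from u to every other element (n - 1 evaluations).
        candidates = [(dist(u, v), j) for j, v in enumerate(db) if j != i]
        graph[i] = heapq.nsmallest(k, candidates)
    return graph


if __name__ == "__main__":
    # Toy metric space (assumption): real numbers under |a - b|.
    db = [0.0, 1.0, 3.0, 7.0]
    knng = build_knng(db, 2, lambda a, b: abs(a - b))
    for node, edges in knng.items():
        print(node, "->", edges)
```

In this toy example the element 7.0 points to 3.0, but 3.0 does not point back to 7.0, which shows why the graph is directed. The brute-force construction needs n(n - 1) distance evaluations for n elements, and the graph stores k edges per element.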

We present two search algorithms for both range and nearest neighbor queries, which use navigational and metric features of the kNNG. We show that our approach is competitive with current ones. For instance, in the document metric space our nearest neighbor search algorithms perform 30% more distance evaluations than AESA while using only 0.25% of its space requirement. In the same space, the pivot-based technique is completely useless.