Approximate direct and reverse nearest neighbor queries, and the k-nearest neighbor graph
Karina Figueroa and Rodrigo Paredes
Retrieving the k-nearest neighbors of a query object
is a basic primitive in similarity searching. A related, far less
explored primitive is to obtain the dataset elements which would have
the query object within their own k-nearest neighbors, known as the
reverse k-nearest neighbor query. We already have indices and
algorithms to solve k-nearest neighbors queries in general metric
spaces; yet, in many cases of practical interest they degenerate to sequential
scanning. The naive algorithm for reverse k-nearest neighbor
queries has quadratic complexity, because k-nearest neighbors of
all the dataset objects must be found; this is too expensive.
Hence, when solving these primitives we can tolerate
trading correctness in the solution for searching time.
In this paper we propose an efficient approximate approach to solve these
similarity queries with high retrieval rate.
Then, we show how to use our approximate k-nearest neighbor
queries to construct (an approximation of) the k-nearest
neighbor graph when we have a fixed dataset. Finally, combining both
primitives we show how to dynamically maintain the approximate
k-nearest neighbor graph of the objects currently stored within the
metric dataset, that is, considering both object insertions and deletions.