Pivot Selection Techniques for Proximity Searching in Metric Spaces

Benjamin Bustos, Gonzalo Navarro and Edgar Chávez.

With a few exceptions, proximity search algorithms in metric spaces based on the use of pivots select them at random among the elements of the metric space. However, it is well known that the way in which the pivots are selected can affect the performance of the algorithm. Between two sets of pivots of the same size, better chosen pivots can reduce the search time. Alternatively, a better chosen small set of pivots (requiring less space) can yield the same efficiency as a larger, randomly chosen, set. We propose an efficiency measure to compare two pivot sets, combined with an optimization technique that allows selecting good sets of pivots. We obtain abundant empirical evidence showing that our technique is effective. We also show that good pivots are outliers, but selecting outliers does not ensure that good pivots are selected.