List of Clustered Permutations for Proximity Searching
Karina Figueroa and Rodrigo Paredes
The permutation based algorithm has been proved unbeatable
in high dimensional spaces, requiring O(|P|) distance
evaluations when solving similarity queries (where P is the
set of permutants); but needs n evaluations of the permutant
distance to compute the order to review the metric dataset, requires
O(n|P|) space, and does not take much benefit from low
dimensionality. There have been several proposals to avoid the
n computations of the permutant distance, however all of them
lose precision. Inspired in the list of cluster, in this paper
we group the permutations and establish a criterion to discard whole
clusters according the permutation of their centers. As a consequence
of our proposal, we now reduce not only the space of the index and the
number of distance evaluations but also the cpu time required when
comparing the permutations themselves. Also, we can use the permutations
in low dimensions.