###
New Models and Algorithms for Multidimensional Approximate Pattern Matching

####
Ricardo Baeza-Yates and Gonzalo Navarro

We focus on how to compute the edit distance (or similarity)
between two images and the problem of approximate string matching in two
dimensions, that is, to find a pattern of size *m x m* in a text of size
*n x n* with at most *k* errors (character substitutions, insertions and
deletions). Pattern and text are matrices over an alphabet of size *sigma*.
We present new models and give the first sublinear time search algorithms
for the new and the existing models.
The only existing model just considers errors along one dimension. The
associated approximate search algorithms use dynamic programming and are
relatively expensive (*O(m^2*n^2)* or *O(k^2*n^2)*). For this model we present
a filtering algorithm which avoids verifying most of the text with dynamic
programming. This filter is based on one-dimensional multipattern
approximate searching. The average complexity of our resulting algorithm is
*O(n^2*k*log_sigma(m)/m^2)* for *k < m*(m+1)/(5*log_sigma(m))*, which is
optimal and matches the best previous result that allows only substitutions.
We present other slower filtration algorithms that however work for higher
error levels.

We then consider more general models to compare images. We present new
similarity measures and the algorithms to compute them. We then focus on
one of the models, which allows the errors to occur along any dimension,
and extend it to the general case where pattern and text are *d*-dimensional.
This edit distance can be computed in *O(d!m^(2d))* time and *O(d!m^(2d-1))*
space. We also present the first sublinear-time (on average) searching
algorithm (i.e. not all text cells are inspected), which is
*O(k*n^d/m^(d-1))* time for *k < (m/(d(log_sigma(m/d))))^(d-1)*.