Multiple Approximate String Matching by Counting
Gonzalo Navarro
We present a very simple and efficient algorithm for online multiple approximate
string matching.
It uses a previously known counting-based filter that searches
for a single pattern by quickly discarding uninteresting parts of the text.
Our multi-pattern algorithm is based on the simulation of many parallel filters
using bits of the computer word.
Our average complexity to search r patterns of length m is
O (rnlog m / log n), being n is the text size.
We can search patterns of different length, each one with a different number
of errors.
We show experimentally that our algorithm is competitive with the fastest known
algorithms, being the fastest for a wide range of intermediate error ratios.
We give the first average-case analysis of the filtering efficiency of the
counting method, applicable also to single patterns.