New and Faster Filters for Multiple Approximate String Matching
Ricardo Baeza-Yates and Gonzalo Navarro
We present three new algorithms for on-line multiple string matching allowing
errors.
These are extensions of previous algorithms that search for a single pattern.
The average running time achieved is in all cases linear in the text size
for moderate error level, pattern length and number of patterns.
They also adapt to the other cases, at a higher cost.
However, the algorithms differ in speed and thresholds of usefulness.
We analyze theoretically when each algorithm should be used, and evaluate
their performance experimentally.
The only previous solution for this problem allows only one error.
Our algorithms are the first to allow more than one error, and are faster
than previous work for a moderate number of patterns (e.g., fewer than 50-100 on
English text, depending on the pattern length).
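
To make the problem statement concrete, the following is a minimal sketch of a naive baseline, not one of the three algorithms presented here. It assumes that "errors" means edit distance (insertions, deletions and substitutions) and runs the classic column-wise dynamic programming search once per pattern, so its cost grows with both the number of patterns and the pattern length. The function names are hypothetical and chosen for illustration only.

```python
def approx_end_positions(pattern, text, k):
    """Return the 0-based text positions where an occurrence of `pattern`
    with at most k edit errors ends (illustrative baseline only)."""
    m = len(pattern)
    # col[i] = minimum edit distance between pattern[:i] and some
    # substring of the text ending at the current position.
    col = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text):
        prev_diag = col[0]  # old value of col[i-1]; col[0] stays 0
        for i in range(1, m + 1):
            old = col[i]
            cost = 0 if pattern[i - 1] == ch else 1
            col[i] = min(prev_diag + cost,  # match or substitution
                         old + 1,           # extra character in the text
                         col[i - 1] + 1)    # missing pattern character
            prev_diag = old
        if col[m] <= k:
            ends.append(j)
    return ends


def multi_approx_search(patterns, text, k):
    """Search each pattern independently; assumed baseline for comparison."""
    return {p: approx_end_positions(p, text, k) for p in patterns}
```

For example, `multi_approx_search(["abc", "xyz"], "xabdx", 1)` reports matches for "abc" ending at positions 2 and 3 and none for "xyz". A filter for multiple patterns aims to avoid repeating this per-pattern work over the whole text.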