Average-Optimal Multiple Approximate String Matching
Kimmo Fredriksson and Gonzalo Navarro
We present a new algorithm for multiple approximate string matching,
based on an extension of the optimal (on average) single-pattern
approximate string matching algorithm of Chang and Marr. Our algorithm
inherits the optimality and is also competitive in practice. We
present a second algorithm that is linear time and handles higher
difference ratios. We show experimentally that our algorithms are the
fastest for intermediate difference ratios, an area where the only
existing algorithms permitted simultaneous search for just a few
patterns. Our algorithm is also resistant to the number of patterns,
being effective for hundreds of patterns. Hence we fill an important
gap in approximate string matching techniques, since no effective
algorithms existed to search for many patterns with an intermediate
difference ratio.