Average Complexity of Exact and Approximate Multiple String Matching
Gonzalo Navarro and Kimmo Fredriksson
We show that the average number of characters examined to search for r
random patterns of length m in a text of length n over a uniformly
distributed alphabet of size s cannot be less than
Omega(n*log_s(rm)/m). When we permit up to k
insertions, deletions, and/or substitutions of characters in the occurrences
of the patterns, the lower bound becomes Omega(n(k+log_s(rm))/m).
This generalizes previous single-pattern lower bounds of Yao (for exact
matching) and of Chang and Marr (for approximate matching), and proves the
optimality of several existing multipattern search algorithms.