A Partial Deterministic Automaton for Approximate String Matching

Gonzalo Navarro

One of the simplest approaches to approximate string matching is to consider the associated non-deterministic finite automaton and make it deterministic. Besides automaton generation, the search time is O (n) in the worst case, where n is the text size. This solution is mentioned in the classical literature but has not been further pursued, due to the large number of automaton states that may be generated.

We study the idea of generating the deterministic automaton on the fly. That is, we only generate the states that are actually reached when the text is traversed. We show that this limits drastically the number of states actually generated. Moreover, the algorithm is competitive, being the fastest one for intermediate error ratios and pattern lengths.