Approximate Regular Expression Searching with Arbitrary Integer Weights
Gonzalo Navarro
We present a bit-parallel technique to search a text of length n for a
regular expression of m symbols permitting k differences in worst case
time O(mn/log_k(s)), where s is the amount of main memory that can be
allocated. The algorithm permits arbitrary integer weights and matches the
complexity of the best previous techniques, but it is simpler and faster
in practice. In our way, we define a new recurrence for approximate searching
where the current values depend only on previous values. Interestingly, our
algorithm turns out to be a relevant option also for simple approximate string
matching with arbitrary integer weights.