NR-grep: a Fast and Flexible Pattern Matching Tool.
Gonzalo Navarro
We present nrgrep ("nondeterministic reverse grep"), a new
pattern matching tool designed for efficient search of complex patterns.
Unlike previous tools of the grep family, such as agrep and
Gnu grep, nrgrep is based on a single and uniform concept:
the bit-parallel simulation of a nondeterministic suffix automaton.
As a result, nrgrep can search from simple patterns to regular
expressions, exactly or allowing errors in the matches, with an efficiency
that degrades smoothly as the complexity of the search pattern increases.
Another concept fully integrated into nrgrep and that contributes to
this smoothness is the selection of adequate subpatterns for fast scanning,
which is also absent in many current tools. We show that the efficiency of
nrgrep is similar to that of the fastest existing string matching
tools for the simplest patterns, and by far unpaired for more complex patterns.