###
Increased Bit-Parallelism for Approximate and Multiple String Matching

####
Heikki Hyyrö, Kimmo Fredriksson, and Gonzalo Navarro

Bit-parallelism permits executing several operations simultaneously over a set
of bits or numbers stored in a single computer word. This technique permits
searching for the approximate occurrences of a pattern of length *m* in a text
of length *n* in time *O(ceil(m/w) n)*, where *w* is the number of bits
in the computer word. Although this is asymptotically the optimal bit-parallel
speedup over the basic *O(mn)* time algorithm, it wastes bit-parallelism's
power in the common case where *m* is much smaller than *w*, since *w-m* bits
in the computer words get unused.
In this paper we explore different ways to increase the bit-parallelism when
the search pattern is short. First, we show how multiple patterns can be packed
into a single computer word so as to search for all them simultaneously. Instead
of spending *O(rn)* time to search for *r* patterns of length *m<=w/2*, we
need *O(ceil(rm/w) n)* time. Second, we show how the mechanism permits
boosting the search for a single pattern of length *m<=w/2*, which can be
searched for in *O(ceil(n/floor(w/m)))* bit-parallel steps
instead of *O(n)*. Third, we show how to extend these algorithms so that the
time bounds essentially depend on *k* instead of *m*, where *k* is the maximum
number of differences permitted. Finally, we show how the ideas can be applied
to other problems such as multiple exact string matching and one-against-all
computation of edit distance and longest common subsequences.

Our experimental results show that the new algorithms work well in practice,
obtaining significant speedups over the best existing alternatives especially
on short patterns and moderate number of differences allowed. This work fills
an important gap in the field, where little work has focused on very short
patterns.