Parameterized Matching on Non-linear Structures

Amihood Amir and Gonzalo Navarro

The classical pattern matching paradigm is that of seeking occurrences of one string in another, where both strings are drawn from an alphabet set S. In the parameterized pattern matching model, a consistent renaming of symbols from S is allowed in a match. The parameterized matching paradigm has proven useful in problems in software engineering, computer vision, and other applications.

In classical pattern matching, both the text and pattern are strings. Applications such as searching in XML or searching in hypertext require searching strings in non-linear structures such as trees or graphs.

There has been work in the literature on exact and approximate parameterized matching, as well as work on exact and approximate string matching on non-linear structures. In this paper we explore parameterized matching in non-linear structures. We prove that exact parameterized matching on trees can be computed in linear time for alphabets in an O(n)-size integer range, and in time O(n log m) in general, where n is the tree size and m the pattern length. These bounds are optimal in the comparison model. We also show that exact parameterized matching on directed acyclic graphs (DAGs) is NP-complete.