Proximal Nodes: A Model to Query Document Databases by Contents and Structure
Gonzalo Navarro and Ricardo Baeza-Yates
A model to query document databases by both their content and structure
is presented. The goal is to obtain a query language
that is expressive in practice yet efficiently implementable,
two properties that previous work has not offered together.
The key ideas of the model are a set-oriented query language based on
operations on nearby structure elements of one or more hierarchies,
together with content and structural indexing and bottom-up evaluation.
The model is evaluated with respect to expressiveness and efficiency, showing
that it provides a good trade-off between the two goals. Finally, it is shown
how media other than text can be included in the model.
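
To make the key ideas concrete, the following is a minimal illustrative sketch, not the algorithm of the paper. It assumes that each structure element is a node covering a segment of text positions, that a content index returns the positions where a word occurs, and that a query is evaluated bottom-up by computing the node sets of its leaves first. The names Node, contained_in and containing_match, as well as the sample data, are hypothetical, and the set operations are written naively rather than efficiently.

```python
from typing import NamedTuple


class Node(NamedTuple):
    start: int  # first text position covered by the element
    end: int    # last text position covered (inclusive)
    name: str   # structure element type, e.g. "chapter"


def contained_in(inner, outer):
    """Return the nodes of `inner` whose segment lies inside some node of `outer`."""
    return [n for n in inner
            if any(o.start <= n.start and n.end <= o.end for o in outer)]


def containing_match(nodes, positions):
    """Return the nodes whose segment covers at least one matched position."""
    return [n for n in nodes
            if any(n.start <= p <= n.end for p in positions)]


# Hypothetical contents of a structural index (two element types).
chapters = [Node(0, 99, "chapter"), Node(100, 199, "chapter")]
sections = [Node(0, 49, "section"), Node(50, 99, "section"),
            Node(100, 199, "section")]

# Hypothetical answer of a content index: positions where a query word occurs.
word_hits = [120]

# Bottom-up evaluation of "sections inside chapters that contain the word":
# the leaf sets are computed first, then combined by the set operations.
chapters_with_word = containing_match(chapters, word_hits)
result = contained_in(sections, chapters_with_word)
```

Each operation takes sets of nodes and returns a set of nodes, so operations compose freely into a query tree. This sketch compares every pair of nodes; it does not attempt to reproduce the efficiency that the model aims for.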