A Prototype for Querying over LZCS Transformed Documents
Joaquín Adiego, Gonzalo Navarro, and Pablo de la Fuente.
We present novel query algorithms that
efficiently support some popular XPath operations over
LZCS-transformed documents. The LZCS transformation
losslessly compresses redundant XML collections. The
main idea of LZCS, inspired by Lempel-Ziv compression,
is to replace whole substructures with references to their
previous occurrences, and our algorithms reuse, where possible,
the work done over those repeated substructures. The algorithms are
implemented in a prototype called lzcs-grep. The main
advantage of lzcs-grep is that it processes the documents in
transformed form, obtaining very fast response times in
combination with low memory requirements. Our
experimental results show that lzcs-grep is competitive
with other XPath processors even over untransformed
documents, and that it outperforms them by a wide margin
when it can operate over their LZCS-transformed versions.
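
The idea of reusing work over repeated substructures can be illustrated with a small sketch. This is not the LZCS format or the lzcs-grep implementation: it is a toy model, assuming that each distinct subtree is stored once and later occurrences point back to it, and that a query result for a subtree can be cached. The names toy_transform and count_tag are hypothetical.

```python
import xml.etree.ElementTree as ET


def toy_transform(xml_text):
    """Toy stand-in for a structure-sharing transform (not real LZCS).

    Each distinct subtree is stored once; a repeated subtree is
    represented by the id of its first occurrence.
    """
    table = {}   # (tag, text, child ids) -> node id
    nodes = []   # node id -> (tag, text, child ids)

    def visit(elem):
        children = tuple(visit(child) for child in elem)
        key = (elem.tag, (elem.text or "").strip(), children)
        if key not in table:       # first occurrence: store it
            table[key] = len(nodes)
            nodes.append(key)
        return table[key]          # repeat: reuse the earlier id

    root_id = visit(ET.fromstring(xml_text))
    return nodes, root_id


def count_tag(nodes, root_id, tag):
    """Count elements named `tag`, evaluating each distinct subtree once."""
    memo = {}  # node id -> number of matching elements in that subtree

    def count(node_id):
        if node_id not in memo:
            name, _text, children = nodes[node_id]
            memo[node_id] = (name == tag) + sum(count(c) for c in children)
        return memo[node_id]

    return count(root_id)


doc = (
    "<lib>"
    "<book><title>A</title><year>2005</year></book>"
    "<book><title>A</title><year>2005</year></book>"
    "<book><title>B</title><year>2005</year></book>"
    "</lib>"
)
nodes, root_id = toy_transform(doc)
print(len(nodes))                          # distinct subtrees stored
print(count_tag(nodes, root_id, "title"))  # elements matching //title
```

In this sketch the second book subtree is never traversed again by the query: its count is read from the cache built for the first occurrence, which is the kind of saving the abstract attributes to operating directly on the transformed form.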