Efficient Indexing and Representation of Web Access Logs
Francisco Claude, Roberto Konow, and Gonzalo Navarro
We present a space-efficient data structure, based on the Burrows-Wheeler
Transform, specifically designed to handle web sequence logs, which are
needed by web usage mining processes. Our index processes a set of
operations efficiently while maintaining the original
information in compressed form. Results show that web access logs can be
represented using 0.85 to 1.03 times their original (plain) size,
while executing most of the operations within a few tens of microseconds.
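
To illustrate the general idea of indexing access sequences with the Burrows-Wheeler Transform, the following is a minimal sketch and not the data structure evaluated in this work. It assumes a log is a single list of integer page identifiers, builds the transform by naive suffix sorting, and counts occurrences of an access pattern by backward search. The names `bwt` and `count_occurrences`, the terminator value -1, and the linear-scan rank are all illustrative assumptions; a practical index would replace the scans with compressed rank structures.

```python
def bwt(seq):
    """Return the Burrows-Wheeler Transform of a list of non-negative page ids.

    A terminator (-1, assumed smaller than every page id) is appended first.
    Suffixes are sorted naively, which is fine for a small illustration only.
    """
    s = list(seq) + [-1]
    suffix_array = sorted(range(len(s)), key=lambda i: s[i:])
    # The BWT symbol for each suffix is the symbol that precedes it in the text.
    return [s[i - 1] for i in suffix_array]


def count_occurrences(last_column, pattern):
    """Count how many times `pattern` occurs in the original sequence,
    using backward search over the BWT (`last_column`)."""
    # first_pos[c] = number of symbols in the text strictly smaller than c.
    first_pos = {}
    for i, c in enumerate(sorted(last_column)):
        first_pos.setdefault(c, i)

    lo, hi = 0, len(last_column)  # current half-open suffix-array range
    for c in reversed(pattern):
        if c not in first_pos:
            return 0
        # Rank by linear scan (illustrative; a real index uses rank structures).
        lo = first_pos[c] + last_column[:lo].count(c)
        hi = first_pos[c] + last_column[:hi].count(c)
        if lo >= hi:
            return 0
    return hi - lo


# Hypothetical access sequence: each integer stands for one visited page.
log = [1, 2, 3, 1, 2, 4, 1, 2, 3]
transformed = bwt(log)
visits_1_then_2 = count_occurrences(transformed, [1, 2])
```

In this toy log, `visits_1_then_2` counts how often page 1 is immediately followed by page 2, which is the kind of sequence query that web usage mining relies on.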