Compact Representation of Web Graphs with Extended Functionality
Nieves Brisaboa, Susana Ladra, and Gonzalo Navarro
The representation of large subsets of the World Wide Web in the form of a
directed graph has been extensively used to analyze structure, behavior, and
evolution of those so-called Web graphs. However, interesting Web graphs are
very large and their classical representations do not fit into the main memory
of typical computers, whereas the required graph
algorithms perform inefficiently on secondary memory. Compressed graph
representations drastically reduce their space requirements while allowing
their efficient navigation in compressed form. While the most basic navigation
operation is to retrieve the successors of a node, several important Web graph
algorithms require support for extended queries, such as finding the
predecessors of a node, checking the presence of a link, or retrieving links
between ranges of nodes. Those are seldom supported by compressed graph
representations.
This article presents the k^2-tree, a novel Web graph representation based
on a compact tree structure that takes advantage of large empty areas of the
adjacency matrix of the graph. The representation not only retrieves
successors
and predecessors in symmetric fashion, but also it is particularly efficient
to
check for specific links between nodes, or between ranges of nodes, or
to list the links between ranges. Compared to the best representations in the
literature supporting successor and predecessor queries, our technique offers
the least space usage (1-3 bits per link) while supporting fast navigation
to predecessors and successors (2-8 microseconds per neighbor retrieved) and
sharply outperforming the others on the extended queries.
The representation is also of general interest and can be used
to compress other kinds of graphs and data structures.