Trees are ubiquitous in Computer Science. They have applications in every aspect of computing from XML/HTML processing to abstract syntax trees (AST) in compilers, phylogenetic trees in computational genomics or shortest path trees in path planning. The ever increasing amounts of structured, hierarchical data processed in many applications have turned the processing of the corresponding large tree structures into a bottleneck, particularly when they do not fit in memory. Succinct tree representations store trees using as few bits as possible and thereby significantly increase the size of trees that fit in memory while still supporting important primitive operations in constant time. There exist such representations that use only $2n+o\left(n\right)$ bits to store the topology of a tree with n nodes, which is close to the information-theoretic lower bound and much less than the space used by traditional pointer-based representations.

The construction of succinct trees is quite slow compared to the construction of pointer-based representations. Multicore parallelism offers one possible tool to speed up the construction of succinct trees, but little work has been done in this direction so far. The only results we are aware of focus on the construction of Wavelet trees, which are used in representations of text indexes.

In this work, we provide a parallel algorithm that constructs the *Range Min-Max Tree*
representation of [1] in $O(\frac{n}{\mathrm{p}}+\mathrm{lg}p)$ time using $\mathrm{p}$ cores. This structure uses
$2n+o\left(n\right)$ bits to store an ordinal tree with n nodes and supports a rich set of basic
operations on these trees in $\mathrm{lg}n$ time. While this query time is theoretically
suboptimal, the *Range Min-Max Tree* structure is simple enough to be practical and has been
verified experimentally to be very small and support fast queries in practice [2].
Combined with the fast parallel construction algorithm presented in this paper, it
provides an excellent tool for manipulating very large trees in many applications.
We implemented and tested our algorithm on a number of real-world input
trees having billions of nodes. Our experiments show that our algorithm run on
a single core is competitive with state-of-the-art sequential constructions of the
*Range Min-Max Tree* structure and achieves good speed-up on up to $64$ cores and likely beyond.

An author's version of our work can be downloaded here. The link to the official version is here.

The code for this work is available at github.

- wiki: The Wikipedia dump (January 12, 2015). This input has 498,753,914 parentheses.
- prot: Suffix tree of the protein data from the Pizza&Chili corpus. The suffix tree was constructed using this code. This input has 670,721,006 parentheses.
- dna: Suffix tree of the DNA data from the Pizza&Chili corpus. The suffix tree was constructed using this code. This input has 1,154,482,174 parentheses.
- ctree: Complete binary tree of depth 30. This input has 2,147,483,644 parentheses.
- osm: The OpenStreetMap dump (January 10, 2015). This input has 4,675,776,358 parentheses.

p |
wiki |
prot |
dna |
ctree |
osm |
---|---|---|---|---|---|

libcds |
33.16 | 44.24 | 75.87 | 140.41 | 339.21 |

sdsl |
1.93 | 2.66 | 4.57 | 8.35 | 18.10 |

1 | 2.89 | 4.22 | 7.21 | 12.16 | 30.60 |

2 | 1.44 | 2.13 | 3.64 | 6.15 | 15.43 |

4 | 0.73 | 1.10 | 1.87 | 3.18 | 7.98 |

8 | 0.37 | 0.57 | 0.98 | 1.59 | 4.14 |

12 | 0.31 | 0.44 | 0.72 | 1.15 | 3.07 |

16 | 0.25 | 0.35 | 0.58 | 0.86 | 2.21 |

20 | 0.23 | 0.32 | 0.55 | 0.76 | 1.83 |

24 | 0.21 | 0.28 | 0.46 | 0.69 | 1.72 |

28 | 0.18 | 0.28 | 0.45 | 0.63 | 1.55 |

32 | 0.18 | 0.25 | 0.39 | 0.63 | 1.33 |

36 | 0.16 | 0.25 | 0.39 | 0.55 | 1.33 |

40 | 0.18 | 0.26 | 0.35 | 0.51 | 1.27 |

44 | 0.17 | 0.27 | 0.35 | 0.49 | 1.14 |

48 | 0.22 | 0.28 | 0.38 | 0.53 | 1.21 |

52 | 0.24 | 0.26 | 0.33 | 0.46 | 1.09 |

56 | 0.21 | 0.28 | 0.39 | 0.49 | 1.02 |

60 | 0.28 | 0.29 | 0.39 | 0.46 | 1.04 |

64 | 0.27 | 0.29 | 0.39 | 0.48 | 1.01 |

- [1] Navarro, G., Sadakane, K.: Fully functional static and dynamic succinct trees. ACM Trans. Algorithms 10(3), 16:1–16:39 (May 2014)
- [2] Arroyuelo, D., Cánovas, R., Navarro, G., Sadakane, K.: Succinct trees in practice. In: ALENEX. pp. 84–97. SIAM Press, Austin, Texas, USA (2010)

This work was supported the doctoral scholarship of CONICYT (21120974) and in part by the Emerging Leaders in the Americas scholarship programme

Contact us at: jfuentess@udec.cl

We would like to thank to
Free DNS and
HighCharts for their services