Clustering-based Compression for Raster Time Series

Martita Muñoz, José Fuentes-Sepúlveda, Cecilia Hernández, Gonzalo Navarro, Diego Seco, and Fernando Silva-Coira.

A raster time series is a sequence of independent rasters, arranged chronologically and covering the same geographical area. Such series are commonly used to depict the temporal evolution of the represented variables. The T-k^2-raster is a compact data structure that performs very well in practice for representing raster time series. This structure classifies each raster as either a snapshot or a log and encodes each log relative to its reference snapshot, which is the most recent preceding raster selected as a snapshot. An enhanced version of the T-k^2-raster, called Heuristic T-k^2-raster, incorporates a heuristic that automates the selection of snapshots.
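The snapshot/log organization can be illustrated with a minimal sketch. It is not the T-k^2-raster itself: plain cell-wise differences stand in for the actual compact encoding of logs, and the names `encode_series`, `decode_series`, and the `is_snapshot` flags are hypothetical, introduced only for illustration.

```python
from typing import List, Tuple

Raster = List[List[int]]


def encode_series(rasters: List[Raster], is_snapshot: List[bool]) -> List[Tuple[str, Raster]]:
    """Store snapshots verbatim and logs as differences from the latest snapshot.

    Assumption: a log is the cell-wise difference with its reference snapshot;
    the real structure uses a compact tree-based representation instead.
    """
    if rasters and not is_snapshot[0]:
        raise ValueError("the first raster must be a snapshot")
    encoded = []
    reference = None
    for raster, snap in zip(rasters, is_snapshot):
        if snap:
            reference = raster  # becomes the reference for the following logs
            encoded.append(("snapshot", raster))
        else:
            delta = [[c - r for c, r in zip(row, ref_row)]
                     for row, ref_row in zip(raster, reference)]
            encoded.append(("log", delta))
    return encoded


def decode_series(encoded: List[Tuple[str, Raster]]) -> List[Raster]:
    """Rebuild the original rasters by adding each log back to its reference."""
    rasters = []
    reference = None
    for kind, data in encoded:
        if kind == "snapshot":
            reference = data
            rasters.append(data)
        else:
            rasters.append([[d + r for d, r in zip(row, ref_row)]
                            for row, ref_row in zip(data, reference)])
    return rasters
```

When consecutive rasters are similar, the logs contain mostly zeros or small values, which is what makes them cheaper to store than full snapshots.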

In this study, we investigate the optimality of the heuristic employed in Heuristic T-k^2-raster by comparing it with a dynamic programming approach. Our experimental evaluation demonstrates that Heuristic T-k^2-raster is a near-optimal solution, achieving compression performance almost identical to the dynamic programming method. These results indicate that variations of the structure that maintain the temporal order of the rasters are unlikely to significantly improve compression. Consequently, we explore an alternative approach based on clustering, where rasters are grouped according to their similarity, regardless of their temporal order. Our experimental evaluation reveals that this clustering-based strategy can enhance compression in scenarios characterized by cyclic behavior.
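To make the comparison baseline concrete, the following is a minimal sketch of how an order-preserving snapshot selection could be optimized by dynamic programming. It is an illustration under assumptions, not the formulation evaluated in the study: the cost functions `snapshot_cost` and `log_cost` are hypothetical placeholders for the encoded size of a raster stored as a snapshot or as a log.

```python
from typing import Callable, List, Tuple


def optimal_snapshots(
    n: int,
    snapshot_cost: Callable[[int], float],
    log_cost: Callable[[int, int], float],
) -> Tuple[float, List[int]]:
    """Choose snapshot positions minimizing total cost for rasters 0..n-1.

    Assumptions (hypothetical cost model):
      snapshot_cost(i): cost of storing raster i as a snapshot
      log_cost(i, t):   cost of storing raster t as a log of snapshot i, i < t
    Every log refers to the closest preceding snapshot, so the series splits
    into consecutive segments that each start with a snapshot.
    """
    inf = float("inf")
    best = [0.0] + [inf] * n   # best[j]: minimum cost of the first j rasters
    start = [-1] * (n + 1)     # start[j]: snapshot opening the last segment
    for i in range(n):
        segment = snapshot_cost(i)
        for j in range(i + 1, n + 1):
            # The segment covers rasters i..j-1; raster j-1 is a log when j-1 > i.
            if j - 1 > i:
                segment += log_cost(i, j - 1)
            if best[i] + segment < best[j]:
                best[j] = best[i] + segment
                start[j] = i
    # Walk back through the segment starts to recover the chosen snapshots.
    snapshots = []
    j = n
    while j > 0:
        snapshots.append(start[j])
        j = start[j]
    snapshots.reverse()
    return best[n], snapshots
```

The sketch evaluates O(n^2) segment candidates, which is affordable as an offline baseline. A clustering-based alternative would drop the constraint that each log follows its snapshot in time, which this formulation does not capture.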