TUTORIAL
Presenters: Franck Cappello, Peter Lindstrom
Time: Monday, November 12th, 1:30pm – 5pm
Location: C141
Description: Large-scale numerical simulations and experiments generate very large datasets that are difficult to analyze, store, and transfer. This problem will only worsen for future generations of systems. Data reduction is becoming a necessity in order to minimize the time spent transferring and storing data. Data compression is an attractive and efficient reduction technique that is largely agnostic to the application. This tutorial will introduce motivating examples; cover basic compression techniques and state-of-the-art data transformation, prediction, quantization, and coding techniques; discuss in detail the SZ and ZFP lossy compressors (including their latest developments); introduce compression error assessment metrics; and show how lossy compression impacts visualization and other data analytics. The tutorial will also cover Z-checker, a tool for characterizing datasets and assessing compression error. Examples of real-world compressors and datasets from simulations and instruments will illustrate the different compression techniques, their performance, and their impact. From a user perspective, the tutorial will detail how to use ZFP, SZ, and Z-checker as standalone software/libraries and as modules integrated into parallel I/O libraries (ADIOS, HDF5).
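
To give a flavor of the prediction, quantization, and coding techniques the tutorial covers, here is a minimal toy sketch of error-bounded, prediction-based lossy compression in the spirit of SZ. It is NOT the SZ or ZFP algorithm; the function and variable names are illustrative. Each value is predicted from the previously reconstructed value, and the prediction error is quantized to an integer multiple of twice the error bound, guaranteeing a pointwise absolute error no larger than the bound. The resulting integer codes are highly repetitive for smooth data and would then be passed to an entropy coder in a real compressor.

```python
def lossy_compress(data, tol):
    """Toy error-bounded compressor (illustrative only, not SZ).

    Predicts each value from the previous *reconstructed* value and
    quantizes the prediction error to multiples of 2*tol, so every
    reconstructed value is within tol of the original.
    """
    codes = []   # integer quantization codes (input to an entropy coder)
    recon = []   # decoder-visible reconstruction
    prev = 0.0   # predictor state: last reconstructed value
    for x in data:
        err = x - prev                 # prediction error
        q = round(err / (2 * tol))     # quantize to nearest integer code
        prev = prev + q * 2 * tol      # update using the *reconstructed* value
        codes.append(q)
        recon.append(prev)
    return codes, recon

# Smooth synthetic data: a ramp of 101 samples in [0, 1].
data = [i / 100 for i in range(101)]
codes, recon = lossy_compress(data, tol=1e-3)
```

Note the key design choice, shared by real predictive compressors: the predictor uses the reconstructed (decoder-visible) value, not the original, so quantization errors do not accumulate and the pointwise error bound holds for every sample.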