Configuring filters that process data during I/O operations (H5Z)

HDF5 supports a filter pipeline that provides the capability for standard and customized raw data processing during I/O operations. HDF5 is distributed with a small set of standard filters for compression (gzip, SZIP, N-bit, and scale-offset), byte shuffling, and error checking (Fletcher32 checksum). For further flexibility, the library allows a user application to extend the pipeline through the creation and registration of customized filters.

Whether it is a standard filter or one defined by a user application, a filter

  • is associated with a dataset when the dataset is created,
  • can be used only with chunked data 
    (i.e., datasets stored in the H5D_CHUNKED storage layout), and
  • is applied independently to each chunk of the dataset.

The HDF5 library does not support filters for contiguous datasets because of the difficulty of implementing random access for partial I/O. Filters are not supported for compact datasets because filtering such small datasets would not produce significant results.

Filter identifiers for the filters distributed with the HDF5 Library are as follows:

H5Z_FILTER_DEFLATE      The gzip compression, or deflation, filter
H5Z_FILTER_SZIP         The SZIP compression filter
H5Z_FILTER_NBIT         The N-bit compression filter
H5Z_FILTER_SCALEOFFSET  The scale-offset compression filter
H5Z_FILTER_SHUFFLE      The shuffle algorithm filter
H5Z_FILTER_FLETCHER32   The Fletcher32 checksum, or error checking, filter

Custom filters that have been registered with the library will have additional unique identifiers.

See HDF5 Dynamically Loaded Filters for more information on how an HDF5 application can apply a filter that is not registered with the HDF5 library.

--- Last Modified: August 23, 2019 | 10:05 AM