When a non-empty filter pipeline is used with a group creation property list, the group will be created with the new group file format (see Group Implementations in HDF5). The filters will come into play only when dense storage is used (see H5P_SET_LINK_PHASE_CHANGE) and will be applied to the group’s fractal heap. The fractal heap will contain most of the group’s link metadata, including link names.

When working with group creation property lists, if you are adding a filter that is not in HDF5’s set of predefined filters, i.e., a user-defined or third-party filter, you must first determine that the filter will work for a group. See the discussion of the set local and can apply callback functions in H5Z_REGISTER.

If multiple filters are set for a property list, they will be applied to each chunk of raw data for datasets or each block of the fractal heap for groups in the order in which they were set.

Filters can be applied only to chunked datasets; they cannot be used with other dataset storage methods, such as contiguous, compact, or external datasets.

Dataset elements of variable-length and dataset region reference datatypes are stored in separate structures in the file called heaps. Filters cannot currently be applied to these heaps.

Filter Behavior in HDF5:

Filters can be inserted into the HDF5 pipeline to perform functions such as compression and conversion. As such, they are a very flexible aspect of HDF5; for example, a user-defined filter could provide encryption for an HDF5 dataset.

A filter can be declared as either required or optional. Required is the default status; optional status must be explicitly declared.

A required filter that fails or is not defined causes an entire output operation to fail. If such a filter was applied when the data was written, it will likewise cause an input operation to fail when it fails or is not defined.

The following table summarizes required filter behavior.

| | Required FILTER_X not available | FILTER_X available |
|---|---|---|
| H5Pset_<FILTER_X> | Will fail. | Will succeed. |
| H5Dwrite with FILTER_X set | Will fail. | Will succeed; FILTER_X will be applied to the data. |
| H5Dread with FILTER_X set | Will fail. | Will succeed. |
An optional filter can be set for an HDF5 dataset even when the filter is not available. Such a filter can then be applied to the dataset when it becomes available on the original system or when the file containing the dataset is processed on a system on which it is available. A filter can be declared as optional through the use of the H5Z_FLAG_OPTIONAL flag with H5P_SET_FILTER.

Consider a situation where one is creating files that will normally be used only on systems where the optional (and fictional) filter FILTER_Z is routinely available. One can create those files on system A, which lacks FILTER_Z, create chunked datasets in the files with FILTER_Z defined in the dataset creation property list, and even write data to those datasets. The dataset object header will indicate that FILTER_Z has been associated with this dataset. But since system A does not have FILTER_Z, dataset chunks will be written without it being applied.

HDF5 has a mechanism for determining whether chunks are actually written with the filters specified in the object header, so while the filter remains unavailable, system A will be able to read the data. Once the file is moved to system B, where FILTER_Z is available, HDF5 will apply FILTER_Z to any data rewritten or new data written in these datasets. Dataset chunks that have been written on system B will then be unreadable on system A; chunks that have not been re-written since being written on system A will remain readable on system A. All chunks will be readable on system B.

The following table summarizes optional filter behavior.

| | FILTER_Z not available | FILTER_Z available with encode and decode | FILTER_Z available decode only |
|---|---|---|---|
| H5Pset_<FILTER_Z> | Will succeed. | Will succeed. | Will succeed. |
| H5Dwrite with FILTER_Z set | Will succeed; FILTER_Z will not be applied to the data. | Will succeed; FILTER_Z will be applied to the data. | Will succeed; FILTER_Z will not be applied to the data. |
| H5Dread with FILTER_Z set | Will succeed if FILTER_Z has not actually been applied to data. | Will succeed. | Will succeed. |
The above principles apply generally in the use of HDF5 optional filters insofar as HDF5 does as much as possible to complete an operation when an optional filter is unavailable. (The SZIP filter is an exception to this rule; see H5P_SET_SZIP for details.)

Also see Data Flow Pipeline for H5Dread.