HDF5 Release 1.8.0 represents a major update to the HDF5 Library, utilities, and file format. The HDF5 development team has attempted to provide new capabilities and improve performance while retaining compatibility with previous releases.
The new features are briefly described below, but first a few words regarding the compatibility solutions.
When new features and optimizations are introduced, as is certainly the case in this release, there is always the risk of creating compatibility problems. These problems can arise either with an application that must be ported to the new release (or cannot be ported, for any of a number of reasons), with applications based on a prior release that must read files created by the new release, or with files created by an older release that must work with an application based on the new release. The HDF5 team has made a concerted effort to provide a full range of compatibility solutions, hopefully addressing all of the situations a user or application is likely to encounter.
Interface — Backward and Forward API Compatibility:
This release contains many new features and related API routines, but at the same time attempts to provide stability for applications by continuing to make existing API routines available and by operating in a backwardly compatible manner, whenever possible.
API Compatibility Macros in HDF5 discusses the specifics of API compatibility and configuration options with respect to new features.
Format — Backward and Forward Format Compatibility:
The HDF5 Library Release 1.8.0 reads all existing HDF5 files, from this or any prior release. Although this release contains features that require additions and/or changes to the HDF5 file format, by default this release will write out files that conform to a “maximum compatibility” principle. That is, files are written with the earliest version of the file format that describes the information, rather than always using the latest version possible. This provides the best forward compatibility by allowing the maximum number of older versions of the library to read files produced with this release.
If library features are used that require new file format features, or if the application requests that the library write out only the latest version of the file format, the files produced with this version of the library may not be readable by older versions of the HDF5 library.
New Features in HDF5 Release 1.8.0 and Backward/Forward Format Compatibility Issues discusses the new features in the release from the point of view of their impact on format compatibility.
New features are briefly described in this section. Further, instructional example codes for several of these features are provided here:
While all new APIs are documented in the HDF5 Reference Manual, there has not been time yet to describe all of them in the HDF5 User’s Guide.
Tunable properties enable the creation of files that are selectively compatible with older HDF5 applications and libraries. This feature enables the library, and thus an application, to create files that can be read by specific older HDF5 libraries and tools and by applications that use those same libraries.
This is accomplished with the function H5Pset_libver_bounds, which sets the lower and upper bounds on allowable formats. The lower bound is determined by specifying the earliest library whose format may be used for an object; the upper bound is determined by specifying the latest library whose format may be used for an object.
The function H5Pget_libver_bounds can be used to retrieve the current settings.
Compact link storage allows groups containing only a few links to take up much less space in the file.
On the other hand, an improved implementation of indexed link storage provides a faster and more scalable method for storing and working with large groups containing many links.
The threshold for switching between the compact and indexed storage formats is configurable, according to an application’s or a user community’s expected use cases, using the function H5Pset_link_phase_change.
The function H5Pget_link_phase_change can be used to retrieve the current settings.
External links allow a group to include objects in another HDF5 file and enable the library to access those objects as if they are in the current file. In this manner, a group may appear to directly contain datasets, named datatypes, and even groups that are actually in a different file. This feature is implemented via a suite of functions that create and manage the links, define and retrieve paths to external objects, and interpret link names: H5Lcreate_external
The user-defined link feature enables the definition of customized types of links that meet specific community or application needs. This feature is implemented via a suite of functions that define, create, register and unregister the link types: H5Lcreate_ud
Links in a group can now be explicitly tracked and definitively indexed by the order in which they are created, enabling systematic iteration and lookup of links by creation order. This complements the already-existing alphanumeric-by-name capability. See H5Pset_link_creation_order.
New link APIs enable greater flexibility in the creation and management of links in an HDF5 file. The H5L routines allow links to be managed and manipulated more like objects in the HDF5 data model and provide detailed control of linking behavior. See H5L: Link interface.
The Attribute interface (H5A) includes several new functions for attribute management. When large numbers of attributes are attached to a single object, new functionality enables faster access and allows those attributes to be stored in much less space in the file. For new attribute management functions:
Attributes can now be tracked and indexed on the order in which they are created, enabling iteration and lookup of attributes by creation order as well as alphanumeric order by name. See H5Pset_attr_creation_order.
To conserve space in an HDF5 file, large header messages that are used repeatedly in the file can be designated as shared.
A shared object header message (SOHM) is written only once in a file; a pointer is then inserted in place of the message itself on each object to which the header message would otherwise be attached. This can be particularly valuable when, for instance, an identical attribute is applied to tens of thousands of objects. (Note that there will be no advantage if the attribute itself is smaller than the pointer would be.)
This feature is implemented via a suite of functions that set up SOHM tracking and indexing and manage the thresholds for switching between shared and non-shared messages: H5Pset_shared_mesg_nindexes
UTF-8 Unicode encoding is supported for strings in datasets, the names of links, and the names of attributes.
Metadata caching enhancements boost performance with certain types of files and enable configurable metadata cache management and monitoring.
A suite of functions is provided to set and review the metadata cache configurations, to review and reset hit rate statistics, and to retrieve the current cache size: H5Fget_mdc_config
See “Metadata Caching in HDF5” in the HDF5 User’s Guide for further information.
Rather than having to step through a hierarchy creating groups one at a time, intermediate groups that do not yet exist can now be created when creating or copying an object in a file.
See Creating Missing Groups (PDF) for further information.
With this feature, an object in an HDF5 file can easily be copied to a new location within the current file or to a specified location in another HDF5 file. This is accomplished at a low level in the HDF5 file, allowing entire group hierarchies to be copied quickly and compressed datasets to be copied without going through a decompression/compression cycle.
A suite of functions is provided to manage copy properties and to perform the copying operation: H5Ocopy
A command-line tool, h5copy, is also provided to enable copying objects without having to create an application.
Three new functions have been added to enhance the object information that can be retrieved. H5Lget_info retrieves information regarding a link.
In each case, the function returns object information in a customized struct. For example, H5Lget_info returns the link type while H5Gget_info returns the number of links in the group.
Anonymous object creation enables the creation and management of objects in a file independently of the links that integrate those objects into the file structure. See H5Dcreate_anon.
The above routines are used in conjunction with the Link and Object interfaces discussed elsewhere (H5L and H5O, respectively). See H5L: Link interface.
A new object API enables greater flexibility in the creation and linking of objects in an HDF5 file. See H5O: Object interface.
See H5Tconvert in the HDF5 Reference Manual.
H5LTtext_to_dtype creates an HDF5 datatype based on the text description and returns the datatype identifier. Given a datatype identifier, H5LTdtype_to_text creates a DDL description of the datatype.
Also see “Conversion Between Text and Datatype.”
o N-Bit Filter – This filter compresses data which uses N-bit datatypes. See H5Pset_nbit in the HDF5 Reference Manual and the section “Using Filters / N-bit” in the “Datasets” chapter of the HDF5 User’s Guide.
o Scale+Offset Filter – This filter compresses scalar (integer and floating-point) datatypes which stay within a range. See H5Pset_scaleoffset in the HDF5 Reference Manual and the section “Using Filters / Scale-Offset” in the “Datasets” chapter of the HDF5 User’s Guide.
The Packet Table API (H5PT) is designed to allow variable-length records to be added to tables easily.
The Dimension Scale API (H5DS) allows dimension scales to be created in HDF5 and attached to HDF5 datasets. Also see “HDF5 Dimension Scale Specification and Design Notes” (PDF).
o h5mkgrp is a new command-line tool that creates a new group in an HDF5 file.
o h5stat (PDF) enables the analysis of an HDF5 file in various ways to determine useful statistics regarding the objects in the file, such as the numbers of objects per group, the sizes of datasets, the amount of free space in the file, etc.
o h5copy makes a complete copy of an object in an HDF5 file as a new object in that HDF5 file or as a new object in a different HDF5 file.
o Improved speed of h5dump – Performance improvements have been made to h5dump to speed it up when dealing with files that have large numbers of objects.
See H5Screate in the HDF5 Reference Manual.
In the HDF5 Reference Manual, see the error stack APIs. Also see the supporting document “Unified Error Reporting for HDF5 and Client Libraries.”