This page is under construction and subject to change.
HDF5 1.12 introduces several new features in the HDF5 library.
Avoid Truncation at File Close (RFC)
The HDF5 library builds mechanisms for detecting file truncation into its file format. These mechanisms are implemented in a way that is slow, particularly for parallel I/O. The "avoid truncate" feature achieves the same result in a way that is faster and that will not create large forward compatibility problems. This is described in more detail below.
The HDF5 library tracks two pieces of information about the size of an HDF5 file in memory. The "end of allocation" (EOA) value indicates how much of the file has been allocated for use by some piece of the HDF5 file format. The "end of file" (EOF) value indicates the location of the highest byte actually written in the file by the HDF5 library. These two values are frequently not the same during normal operation of the library.
At file close, the file is truncated to match the EOA value, the EOF value is set to the EOA value, and the EOF is stored in the file superblock. When re-opening the file, the size of the file is checked against the stored EOF value and the library will issue an error that the file was truncated if the size is smaller than the EOF value.
The "avoid truncate" feature allows the library to store the EOA in a superblock extension message and use it for file truncation detection, rather than truncating the file during file close.
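The difference between the two close strategies can be illustrated with a toy model. This is a minimal sketch in plain Python, not HDF5 library code: the function names are hypothetical, and it assumes that the open-time check under "avoid truncate" compares the file size against the stored EOA with the same "smaller than" rule described above for the stored EOF.

```python
def close_with_truncate(size_on_disk, eoa):
    """Default close: resize the file to EOA and record EOF = EOA in the superblock."""
    new_size = eoa        # the truncate call that the feature avoids
    stored_eof = eoa
    return new_size, stored_eof


def close_avoid_truncate(size_on_disk, eoa):
    """'Avoid truncate' close: leave the file size alone and record the EOA instead."""
    return size_on_disk, eoa


def looks_truncated(size_on_disk, stored_marker):
    """Open-time check (assumed rule): truncated if the file is smaller than the stored value."""
    return size_on_disk < stored_marker


# A file whose highest written byte is at 4096 while only 3072 bytes are allocated.
size_a, marker_a = close_with_truncate(4096, 3072)
size_b, marker_b = close_avoid_truncate(4096, 3072)
print(size_a, marker_a, looks_truncated(size_a, marker_a))
print(size_b, marker_b, looks_truncated(size_b, marker_b))

# A later copy that lost its tail is flagged under both schemes.
print(looks_truncated(2048, marker_a), looks_truncated(2048, marker_b))
```

In this model both schemes detect the damaged copy, but only the first one has to change the file size at close time.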
Dynamically Loadable Virtual File Drivers – VFDs (RFC)
Dynamically loadable Virtual File Drivers (VFDs) enable virtual file drivers to be included with HDF5, without requiring a rebuild of the HDF5 library. This greatly simplifies the release of VFDs, especially proprietary VFDs where the source code should not be publicly visible. This feature is critical for distribution of the S3 and HDFS drivers, as well as drivers created by the community.
|H5P_GET_FAPL_HDFS||Gets the information of the given Read-Only HDFS virtual file driver|
|H5P_GET_FAPL_ROS3||Gets the information of the given Read-Only S3 virtual file driver|
H5Fdelete and Changes to the Virtual File Layer (VFL) (RFC)
With the new virtual object layer (VOL), HDF5 "files" can map to arbitrary storage schemes such as object stores and relational database tables. The data created by these implementations may be inconvenient for a user to remove without a detailed knowledge of the storage scheme. The H5Fdelete() API was introduced to give VOL connector authors the ability to add connector-specific delete code to their connectors so that users can remove these "files" without detailed knowledge of the storage scheme.
|H5F_DELETE||Deletes an HDF5 file|
Since HDF5 storage can differ among the virtual file drivers, changes had to be made so that each Virtual File Driver (VFD) could have its own driver-specific cleanup code.
H5Sencode performance was improved by encoding the start/stride/count/block in most cases when the selection is regular
The previous format for H5Sencode enumerated every block in the selection. For example, if the selection was a checkerboard, the serialized selection described the location of every block independently, instead of specifying the pattern of the blocks. This created performance and file space issues if the selection used a large number of blocks.
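The space saving can be sketched by counting how many integers each description needs. The snippet below is an illustration in plain Python, not the actual H5Sencode serialization format: the helper `enumerate_blocks` is hypothetical, and the counts are numbers of coordinates rather than encoded bytes.

```python
from itertools import product


def enumerate_blocks(start, stride, count, block):
    """List (first_corner, last_corner) for every block of a regular selection."""
    per_dim = [
        [(s + i * st, s + i * st + b - 1) for i in range(c)]
        for s, st, c, b in zip(start, stride, count, block)
    ]
    return [
        (tuple(lo for lo, _ in combo), tuple(hi for _, hi in combo))
        for combo in product(*per_dim)
    ]


# A 2-D regular grid: 4 x 4 single-element blocks, every second row and column.
start, stride, count, block = (0, 0), (2, 2), (4, 4), (1, 1)
blocks = enumerate_blocks(start, stride, count, block)

rank = len(start)
enumerated_values = len(blocks) * 2 * rank  # two corners per block, one coordinate per dimension
regular_values = 4 * rank                   # start, stride, count, block per dimension

print(len(blocks), enumerated_values, regular_values)
```

The enumerated description grows with the number of blocks, while the start/stride/count/block description depends only on the rank of the dataspace.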
VDS and Hyperslab Performance Improvements
Virtual Dataset (VDS) and hyperslab selection performance has been improved dramatically.
References API (RFC)
The HDF5 Reference API in previous releases had some limitations: users could not create region references when a file was opened with read-only permissions, and external references could not be created. This release addresses these limitations by introducing a single abstract H5R_ref_t type, as well as previously missing reference types such as attribute references and external references (references to objects in an external file).
Virtual Object Layer (VOL)
The Virtual Object Layer (VOL) is an abstraction layer within the HDF5 library that enables different methods for accessing data and objects that conform to the HDF5 data model. The VOL intercepts all HDF5 API calls that potentially modify data on disk and forwards those calls to a plugin "object driver". The data on disk can be in a format other than the HDF5 format.
The plugins can store the objects in a variety of ways. A plugin could, for example, distribute objects remotely over different platforms, provide a raw mapping of the model to the file system, or even store the data in other file formats (such as native netCDF or HDF4 format). The user still gets the same data model, where access is done to a single HDF5 “container”; however, the plugin object driver translates from what the user sees to how the data is actually stored. This abstraction layer maintains the object model of HDF5 and allows better use of new object storage file systems that are targeted for Exascale systems.
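The forwarding idea can be sketched with a small model in plain Python. This is a conceptual illustration only, not the HDF5 VOL interface: the classes `Container`, `DictConnector`, and `JsonConnector` and their methods are hypothetical stand-ins for the public API and for two plugins that store the same data differently.

```python
import json


class DictConnector:
    """Hypothetical connector: keeps each dataset as a Python list in memory."""

    def __init__(self):
        self.storage = {}

    def dataset_write(self, name, values):
        self.storage[name] = list(values)

    def dataset_read(self, name):
        return list(self.storage[name])


class JsonConnector:
    """Hypothetical connector: keeps each dataset as a JSON string (a different 'format')."""

    def __init__(self):
        self.storage = {}

    def dataset_write(self, name, values):
        self.storage[name] = json.dumps(list(values))

    def dataset_read(self, name):
        return json.loads(self.storage[name])


class Container:
    """Stand-in for the public API: each call is forwarded to the active connector."""

    def __init__(self, connector):
        self._connector = connector

    def write(self, name, values):
        self._connector.dataset_write(name, values)

    def read(self, name):
        return self._connector.dataset_read(name)


def user_code(container):
    """Application logic that never looks at how the data is stored."""
    container.write("temperature", [20, 21, 23])
    return container.read("temperature")


dict_backend, json_backend = DictConnector(), JsonConnector()
result_dict = user_code(Container(dict_backend))
result_json = user_code(Container(json_backend))

print(result_dict == result_json)
print(type(dict_backend.storage["temperature"]).__name__,
      type(json_backend.storage["temperature"]).__name__)
```

The application code is identical for both backends; only the connector decides how the values are laid out in storage.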
The following APIs were introduced to support this feature:
|H5VL_CLOSE||Closes a VOL connector identifier|
|H5VL_GET_CONNECTOR_ID||Retrieves the identifier for a registered VOL connector|
|H5VL_GET_CONNECTOR_NAME||Retrieves the connector name for the VOL associated with the object or file identifier|
|H5VL_IS_CONNECTOR_REGISTERED||Tests whether a VOL class has been registered or not|
|H5VL_REGISTER_CONNECTOR||Registers a new VOL connector|
|H5VL_REGISTER_CONNECTOR_BY_NAME||Registers a new VOL connector by name|
|H5VL_REGISTER_CONNECTOR_BY_VALUE||Registers a new VOL connector by connector value|
|H5VL_UNREGISTER_CONNECTOR||Removes a VOL connector identifier from the library|