This page is under construction and subject to change.

 HDF5 1.12 introduces several new features in the HDF5 library.

Avoid Truncation at File Close (RFC)

The HDF5 library builds mechanisms for detecting file truncation into its file format. These mechanisms are implemented in a way that is slow, particularly for parallel I/O. The "avoid truncate" feature achieves the same result in a way that is faster and that will not create large forward-compatibility problems. This is described in more detail below.

The HDF5 library tracks two pieces of information about the size of an HDF5 file in memory. The "end of allocation" (EOA) value indicates how much of the file has been allocated for use by some piece of the HDF5 file format. The "end of file" (EOF) value indicates the location of the highest byte actually written in the file by the HDF5 library. These two values are frequently not the same during normal operation of the library.

At file close, the file is truncated to match the EOA value, the EOF value is set to the EOA value, and the EOF is stored in the file superblock. When the file is re-opened, its size is checked against the stored EOF value, and the library reports a truncation error if the file is smaller than that value.

The "avoid truncate" feature allows the library to store the EOA in a superblock extension message and use it for file truncation detection, rather than truncating the file during file close.
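The difference between the two close strategies can be illustrated with a toy model. This is a sketch in plain Python, not HDF5 code: the function names, the dictionary standing in for the superblock, and the byte counts are all assumptions made for illustration. It shows only the bookkeeping described above: the classic path resizes the file to the EOA and records the EOF, while the avoid-truncate path leaves the file length alone and records the EOA.

```python
import os
import tempfile


def close_with_truncate(path, eoa):
    """Classic close: resize the file to EOA, then record EOF (now equal to EOA)."""
    os.truncate(path, eoa)
    return {"stored_eof": eoa}


def close_avoid_truncate(path, eoa):
    """Avoid-truncate close: leave the file length alone and record the EOA instead."""
    return {"stored_eoa": eoa}


def looks_truncated(path, metadata):
    """On reopen, compare the real file size with whichever value was recorded."""
    expected = metadata.get("stored_eoa", metadata.get("stored_eof"))
    return os.path.getsize(path) < expected


def demo():
    """Run both close strategies on a 100-byte file whose EOA is 64."""
    results = {}
    closers = {"classic": close_with_truncate, "avoid": close_avoid_truncate}
    with tempfile.TemporaryDirectory() as tmp:
        for label, closer in closers.items():
            path = os.path.join(tmp, label + ".bin")
            with open(path, "wb") as fh:
                fh.write(b"\0" * 100)       # 100 bytes physically on disk
            metadata = closer(path, 64)      # only 64 bytes are allocated
            size_after_close = os.path.getsize(path)
            flagged_when_intact = looks_truncated(path, metadata)
            os.truncate(path, 32)            # simulate damage after close
            flagged_when_damaged = looks_truncated(path, metadata)
            results[label] = (size_after_close, flagged_when_intact, flagged_when_damaged)
    return results


if __name__ == "__main__":
    print(demo())
```

In this model both strategies flag the damaged file and accept the intact one; only the classic strategy pays for a resize at close time, which is the step the feature is meant to avoid.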

Dynamically Loadable Virtual File Drivers – VFDs (RFC)

Dynamically loadable Virtual File Drivers (VFDs) allow virtual file drivers to be used with HDF5 without rebuilding the HDF5 library. This greatly simplifies the release of VFDs, especially proprietary VFDs whose source code should not be publicly visible. This feature is critical for distributing the S3 and HDFS drivers, as well as drivers created by the community.

H5Fdelete and Changes to the Virtual File Layer (VFL) (RFC)

With the new virtual object layer (VOL), HDF5 "files" can map to arbitrary storage schemes such as object stores and relational database tables. The data created by these implementations may be inconvenient for a user to remove without detailed knowledge of the storage scheme. The proposed H5Fdelete() call lets VOL connector authors add connector-specific delete code to their connectors, so that users can remove these "files" without knowing how they are stored.

Since HDF5 storage can differ among the virtual file drivers, changes had to be made so that each Virtual File Driver (VFD) could have its own driver-specific cleanup code.
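The idea of storage-specific cleanup behind a single delete call can be sketched as follows. This is a toy model in plain Python, not the HDF5 API: `DirectoryConnector`, `SingleFileConnector`, and `delete_container` are hypothetical names chosen for illustration, and `delete_container` merely plays the role that H5Fdelete() is proposed to play.

```python
import os
import tempfile


class DirectoryConnector:
    """Hypothetical connector that stores one logical 'file' as a directory of objects."""

    def create(self, name, objects):
        os.makedirs(name)
        for key, payload in objects.items():
            with open(os.path.join(name, key + ".obj"), "wb") as fh:
                fh.write(payload)

    def delete(self, name):
        # Connector-specific cleanup: only the connector knows its own layout.
        for entry in os.listdir(name):
            os.remove(os.path.join(name, entry))
        os.rmdir(name)


class SingleFileConnector:
    """Hypothetical connector that stores everything in one ordinary file."""

    def create(self, name, objects):
        with open(name, "wb") as fh:
            for payload in objects.values():
                fh.write(payload)

    def delete(self, name):
        os.remove(name)


def delete_container(connector, name):
    """Generic entry point: the caller never needs to know the storage scheme."""
    connector.delete(name)


def demo():
    """Create and delete one container per connector; report (existed, exists_after)."""
    outcome = {}
    objects = {"group": b"g", "dataset": b"d"}
    with tempfile.TemporaryDirectory() as tmp:
        for connector in (DirectoryConnector(), SingleFileConnector()):
            label = type(connector).__name__
            name = os.path.join(tmp, label)
            connector.create(name, objects)
            existed = os.path.exists(name)
            delete_container(connector, name)
            outcome[label] = (existed, os.path.exists(name))
    return outcome


if __name__ == "__main__":
    print(demo())
```

The caller issues the same call in both cases; the per-backend `delete` method is where the storage-specific knowledge lives.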

H5Sencode Changes

Performance Improvements

References API (RFC)

The HDF5 Reference API in previous releases had some limitations: users could not create region references when a file was opened with read-only permissions, and external references could not be created. This release addresses these limitations by introducing a single abstract H5R_ref_t type as well as missing reference types such as attribute references and external references (references to objects in an external file).

Virtual Object Layer (RFC)

The Virtual Object Layer (VOL) is an abstraction layer within the HDF5 library that enables different methods for accessing data and objects that conform to the HDF5 data model. The VOL intercepts all HDF5 API calls that potentially modify data on disk and forwards those calls to a plugin "object driver". The data on disk can be in a format other than the native HDF5 format.

The plugins can store the objects in a variety of ways. A plugin could, for example, distribute objects remotely over different platforms, provide a raw mapping of the model to the file system, or even store the data in other file formats (such as native netCDF or HDF4). The user still gets the same data model, in which access is made to a single HDF5 “container”; however, the plugin object driver translates between what the user sees and how the data is actually stored. This abstraction layer maintains the object model of HDF5 and allows better use of the new object storage file systems that are targeted for Exascale systems.

The following APIs were introduced to support this feature:

Function                            Description
H5VL_CLOSE                          Closes a VOL connector identifier
H5VL_GET_CONNECTOR_ID               Retrieves the identifier for a registered VOL connector
H5VL_GET_CONNECTOR_NAME             Retrieves the connector name for the VOL associated with the object or file identifier
H5VL_IS_CONNECTOR_REGISTERED        Tests whether a VOL class has been registered or not
H5VL_REGISTER_CONNECTOR             Registers a new VOL connector
H5VL_REGISTER_CONNECTOR_BY_NAME     Registers a new VOL connector by name
H5VL_REGISTER_CONNECTOR_BY_VALUE    Registers a new VOL connector by connector value
H5VL_UNREGISTER_CONNECTOR           Removes a VOL connector identifier from the library
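
The forwarding behaviour and the connector bookkeeping listed above can be modelled in a few lines. This is a conceptual sketch in plain Python, not the HDF5 API: every class and method name here (`ConnectorRegistry`, `Container`, `DictConnector`, `JsonConnector`) is hypothetical, and the registry methods are only loose analogues of the register, lookup, and unregister operations in the table.

```python
import json


class DictConnector:
    """Hypothetical connector that keeps values as Python lists in memory."""
    name = "toy_dict"

    def __init__(self):
        self.storage = {}

    def dataset_write(self, path, values):
        self.storage[path] = list(values)

    def dataset_read(self, path):
        return list(self.storage[path])


class JsonConnector:
    """Hypothetical connector that keeps values as JSON text, a different 'format'."""
    name = "toy_json"

    def __init__(self):
        self.storage = {}

    def dataset_write(self, path, values):
        self.storage[path] = json.dumps(list(values))

    def dataset_read(self, path):
        return json.loads(self.storage[path])


class ConnectorRegistry:
    """Toy analogue of connector registration and lookup."""

    def __init__(self):
        self._by_id = {}
        self._next_id = 1

    def register(self, connector):
        connector_id = self._next_id
        self._next_id += 1
        self._by_id[connector_id] = connector
        return connector_id

    def get_id(self, name):
        for connector_id, connector in self._by_id.items():
            if connector.name == name:
                return connector_id
        return None

    def is_registered(self, name):
        return self.get_id(name) is not None

    def get(self, connector_id):
        return self._by_id[connector_id]

    def unregister(self, connector_id):
        del self._by_id[connector_id]


class Container:
    """What the user sees: one container API, whatever the backend stores."""

    def __init__(self, connector):
        self._connector = connector

    def write(self, path, values):
        self._connector.dataset_write(path, values)   # forwarded to the plugin

    def read(self, path):
        return self._connector.dataset_read(path)     # forwarded to the plugin
```

A caller would register both connectors, wrap either one in a `Container`, and read back the same values from each, even though one backend holds a list and the other holds JSON text.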

--- Last Modified: May 21, 2019 | 02:31 PM