How can I read/write a dataset greater than 2GB?

If you use the default (serial) file access property list for HDF5, you can read or write a dataset greater than 2 GB with a single call.

However, if you set the file access property list (FAPL) or data transfer property list (DXPL) to use the MPI-I/O file driver, you will not be able to do this. The problem is that the MPI standard specifies the 'count' parameter passed to MPI read and write operations as a C int, which is a 32-bit integer on most platforms.

HDF5 can work around this limitation in the standard by concatenating several derived datatypes, which reduces the count to a value that fits in a 32-bit integer. However, this approach breaks ROMIO (the MPI-I/O implementation used by almost all MPI libraries). This is a known limitation of ROMIO: the most I/O it can do in a single operation is 2 GB. That is not the same problem as the 'count' parameter being 32 bits, but rather a limit in ROMIO itself. So unless a fix is implemented in the ROMIO library, the workaround for the MPI standard (mentioned above) will not work.

The solution for now is to do multiple reads/writes as necessary, so that the total amount of data read or written per call is less than 2 GB. We have a Parallel HDF5 Tutorial here:

Introduction to Parallel HDF5

See the hyperslab selection examples in the tutorial for how to select a subset of a dataset:

Writing and Reading Hyperslabs
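
The splitting described above can be sketched in plain C. This is only an illustration, not code from the HDF5 library or the tutorial: it assumes the dataset is split along its first dimension into whole rows, and the names IO_LIMIT_BYTES, rows_per_call, calls_needed and slab_for_call are hypothetical helpers made up for this example. For each call, the resulting start and count would be used as the first-dimension entries of the start[] and count[] arrays passed to H5Sselect_hyperslab before the corresponding H5Dread or H5Dwrite.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed per-call limit of 2 GB (2^31 bytes); each call stays strictly below it. */
#define IO_LIMIT_BYTES ((uint64_t)1 << 31)

/* Largest number of whole rows whose combined size is strictly under the limit.
 * Returns 0 if a single row is already too large to transfer in one call. */
static uint64_t rows_per_call(uint64_t row_bytes)
{
    assert(row_bytes > 0);
    return (IO_LIMIT_BYTES - 1) / row_bytes;
}

/* Number of read/write calls needed to cover total_rows rows. */
static uint64_t calls_needed(uint64_t total_rows, uint64_t row_bytes)
{
    uint64_t per_call = rows_per_call(row_bytes);
    assert(per_call > 0);
    return (total_rows + per_call - 1) / per_call; /* ceiling division */
}

/* First row (start) and number of rows (count) covered by the k-th call,
 * with k counted from 0. The last call may cover fewer rows than the others. */
static void slab_for_call(uint64_t k, uint64_t total_rows, uint64_t row_bytes,
                          uint64_t *start, uint64_t *count)
{
    uint64_t per_call = rows_per_call(row_bytes);
    assert(per_call > 0);
    assert(k * per_call < total_rows);

    *start = k * per_call;
    uint64_t remaining = total_rows - *start;
    *count = remaining < per_call ? remaining : per_call;
}
```

For example, with rows of 8 MiB each (1,048,576 doubles) and 1000 rows in total, each call covers at most 255 rows, so four calls are needed and the last one covers the remaining 235 rows. In a parallel program each MPI rank would apply the same split to its own portion of the dataset.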