Tests Parallel HDF5 performance.


h5perf [-h | --help]
h5perf [options]

h5perf is a tool for testing the performance of the Parallel HDF5 Library. The tool can perform testing with 1-dimensional and 2-dimensional buffers and datasets. For details regarding data organization and access, see the “h5perf User Guide.”

The following environment variables affect h5perf behavior:

HDF5_NOCLEANUP  If set, h5perf does not remove data files. 
(Default: Data files are removed.)
HDF5_MPI_INFO  Must be set to a string containing a list of semicolon-separated key=value pairs for the MPI INFO object. 
HDF5_PARAPREFIX  Sets the prefix for parallel output data files.
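
To illustrate the HDF5_MPI_INFO format, the following Python sketch splits such a string into key/value pairs. The function name parse_mpi_info and the keys in the usage line are hypothetical and chosen only for illustration; this is not necessarily how h5perf itself reads the variable.

```python
def parse_mpi_info(value):
    """Split a 'key=value;key=value' string into a dict (illustrative only)."""
    pairs = {}
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate empty segments such as a trailing semicolon
        key, sep, val = item.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"expected key=value, got {item!r}")
        pairs[key.strip()] = val.strip()
    return pairs


# Hypothetical keys, used only to show the format.
print(parse_mpi_info("key1=value1;key2=value2"))
```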

Options and Parameters:
These terms are used as follows in this section:
file  A filename
size  A size specifier, expressed as an integer greater than or equal to 0 (zero) followed by a size indicator: 
     K for kilobytes (1024 bytes) 
     M for megabytes (1048576 bytes) 
     G for gigabytes (1073741824 bytes) 
Example: 37M specifies 37 megabytes or 38797312 bytes.
N  An integer greater than or equal to 0 (zero)
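
The size specifier described above can be converted to a byte count as in the following Python sketch. The name parse_size is hypothetical, and treating a bare integer with no indicator as a byte count is an assumption that this page does not confirm.

```python
SIZE_MULTIPLIERS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(spec):
    """Convert a size specifier such as '37M' to a number of bytes."""
    spec = spec.strip()
    if spec and spec[-1] in SIZE_MULTIPLIERS:
        number, multiplier = spec[:-1], SIZE_MULTIPLIERS[spec[-1]]
    else:
        # Assumption: an integer without a size indicator means bytes.
        number, multiplier = spec, 1
    if not (number.isascii() and number.isdigit()):
        raise ValueError(f"invalid size specifier: {spec!r}")
    return int(number) * multiplier


print(parse_size("37M"))  # 38797312, matching the example above
```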
-h, --help
        Prints a usage message and exits.
-a size, --align=size
        Specifies the alignment of objects in the HDF5 file. 
(Default: 1)
-A api_list, --api=api_list

Specifies which APIs to test. api_list is a comma-separated list with the following valid values:

    phdf5  Parallel HDF5

(Default: All APIs)

Example: --api=mpiio,phdf5 specifies that the MPI I/O and Parallel HDF5 APIs are to be tested.

-B size, --block-size=size
        Controls the block size within the transfer buffer. 
(Default: Half the number of bytes per process per dataset)

Block size versus transfer buffer size: 
The transfer buffer size is the size of a buffer in memory. The data in that buffer is broken into block size pieces and written to the file.

Transfer buffer size is discussed below with the -x (or --min-xfer-size) and -X (or --max-xfer-size) options.

The pattern in which the blocks are written to the file is described in the discussion of the -I (or --interleaved) option.

-c, --chunk
        Creates HDF5 datasets in chunked layout. 
(Default: Off)
-C, --collective
        Uses collective I/O for the MPI I/O and Parallel HDF5 APIs. 
(Default: Off, i.e., independent I/O)

If this option is set and the MPI-I/O and PHDF5 APIs are in use, all the blocks of every process will be written at once with an MPI derived type.

-d N, --num-dsets=N
        Sets the number of datasets per file. 
(Default: 1)
-D debug_flags, --debug=debug_flags

Sets the debugging level. debug_flags is a comma-separated list of debugging flags with the following valid values:

    1  Minimal debugging
    2  Moderate debugging (“not quite everything”)
    3  Extensive debugging (“everything”)
    4  All possible debugging (“the kitchen sink”)
    r  Raw data I/O throughput information
    t  Times, in addition to throughputs
    v  Verify data correctness

(Default: No debugging)

Example: --debug=2,r,t specifies to run a moderate level of debugging while collecting raw data I/O throughput information and reporting times in addition to throughputs.

Throughput values are computed by dividing the total amount of transferred data (excluding metadata) by the time spent by the slowest process. Several time counters are defined to measure the data transfer time and the total elapsed time; the latter includes the time spent during file open and close operations. A number of iterations can be specified with the option -i (or --num-iterations) to create the desired population of measurements from which maximum, minimum, and average values can be obtained. The timing scheme is the following:

    for each iteration
        initialize elapsed time counter
        initialize data transfer time counter
        for each file
            start and accumulate elapsed time counter
                file open
                start and accumulate data transfer time counter
                    access entire file
                stop data transfer time counter
                file close
            stop elapsed time counter
        end file
        save elapsed time counter
        save data transfer time counter
    end iteration

The reported write throughput is based on the accumulated data transfer time, while the write open-close throughput uses the accumulated elapsed time.
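
The timing scheme and the throughput calculation can be sketched in Python as follows. This is a single-process illustration under stated assumptions, not the h5perf implementation: open_file, access_file, and close_file are hypothetical callables supplied by the caller, and the per-process times passed to throughput stand in for values that a parallel run would gather from all MPI processes.

```python
import time


def run_iterations(num_iterations, files, open_file, access_file, close_file):
    """Return one (elapsed_seconds, transfer_seconds) pair per iteration."""
    results = []
    for _ in range(num_iterations):
        elapsed = 0.0   # includes file open and close
        transfer = 0.0  # data access only
        for name in files:
            t_open = time.perf_counter()
            handle = open_file(name)
            t_xfer = time.perf_counter()
            access_file(handle)
            transfer += time.perf_counter() - t_xfer
            close_file(handle)
            elapsed += time.perf_counter() - t_open
        results.append((elapsed, transfer))
    return results


def throughput(total_bytes, per_process_seconds):
    """Bytes per second, based on the time of the slowest process."""
    return total_bytes / max(per_process_seconds)


def summarize(values):
    """Maximum, minimum, and average over the iterations."""
    return max(values), min(values), sum(values) / len(values)
```

For example, throughput(1000, [1.0, 2.0, 4.0]) divides by 4.0, the time of the slowest process, and yields 250.0 bytes per second.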

-e size, --num-bytes=size
        Specifies the number of bytes per process per dataset. 
(Default: 256K for 1D, 8K for 2D)

Depending on the selected geometry, each test dataset can be a linear array of size bytes-per-process * num-processes or a square array of size (bytes-per-process * num-processes) × (bytes-per-process * num-processes). The number of processes is set by the -p (or --min-num-processes) and -P (or --max-num-processes) options.
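
The dataset dimensions described above can be worked out as in this short Python sketch. The function name dataset_shape and its parameters are hypothetical, and the example uses four processes with the default sizes of 256K (1D) and 8K (2D).

```python
def dataset_shape(bytes_per_process, num_processes, two_d=False):
    """Return the dataset dimensions for the selected geometry."""
    side = bytes_per_process * num_processes
    # 1D: a linear array; 2D: a square array with the same side length.
    return (side, side) if two_d else (side,)


KB = 1024
print(dataset_shape(256 * KB, 4))             # 1D: (1048576,)
print(dataset_shape(8 * KB, 4, two_d=True))   # 2D: (32768, 32768)
```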

-F N, --num-files=N
        Specifies the number of files. 
(Default: 1)
-g, --geometry
        Selects 2D geometry for testing. 
(Default: Off, i.e., 1D geometry)
-i N, --num-iterations=N
        Sets the number of iterations to perform. 
(Default: 1)
-I, --interleaved
        Sets interleaved block I/O. 
(Default: Contiguous block I/O)

Interleaved and contiguous patterns in 1D geometry: 
When a contiguous access pattern is chosen, the dataset is evenly divided into num-processes regions and each process writes data to its assigned region. When interleaved blocks are written to a dataset, space for the first block of the first process is allocated in the dataset, then space is allocated for the first block of the second process, etc., until space is allocated for the first block of each process, then space is allocated for the second block of the first process, the second block of the second process, etc.

For example, with a three process run, 512KB bytes-per-process, 256KB transfer buffer size, and 64KB block size, each process must issue two transfer requests to complete access to the dataset.

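Contiguous blocks of the first transfer request are written as follows (each character is one 64KB block of the dataset, a digit is the number of the process that writes the block, and - is a block not written by this request): 

    1111----2222----3333----

Interleaved blocks of the first transfer request are written as follows: 

    123123123123------------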

The actual number of I/O operations involved in a transfer request depends on the access pattern and communication mode. When using independent I/O with an interleaved access pattern, each process performs four small non-contiguous I/O operations per transfer request. If collective I/O is turned on, the combined content of the buffers of the three processes will be written using one collective I/O operation per transfer request.
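
The block placement for the example run can be derived with a small Python sketch. It prints one character per 64KB block of the dataset: a digit for the process that writes the block in its first transfer request, and - for a block left untouched by that request. The function name first_request_layout is hypothetical, and the single-digit labels assume fewer than ten processes.

```python
def first_request_layout(num_processes, bytes_per_process, xfer_size,
                         block_size, interleaved):
    """Show which 1D dataset blocks each process writes in its first request."""
    blocks_per_process = bytes_per_process // block_size
    blocks_per_request = xfer_size // block_size
    layout = ["-"] * (blocks_per_process * num_processes)
    for rank in range(num_processes):
        for j in range(blocks_per_request):
            if interleaved:
                # Block j of every process is placed before block j + 1 of any.
                pos = j * num_processes + rank
            else:
                # Each process owns one contiguous region of the dataset.
                pos = rank * blocks_per_process + j
            layout[pos] = str(rank + 1)
    return "".join(layout)


KB = 1024
print(first_request_layout(3, 512 * KB, 256 * KB, 64 * KB, interleaved=False))
print(first_request_layout(3, 512 * KB, 256 * KB, 64 * KB, interleaved=True))
```

In the interleaved case each process touches four separate blocks per request, which is why independent I/O needs four small operations per transfer request.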

For details regarding the impact of performance and access patterns in 2D, see the “h5perf User Guide.”

-m, --mpi-posix
        This option is no longer available.
-n, --no-fill
        Specifies to not write fill values to HDF5 datasets. This option is supported only in HDF5 Release v1.6 or later. 
(Default: Off, i.e., write fill values)
-o file, --output=file
        Sets the output file for raw data to file. 
(Default: None)
-p N, --min-num-processes=N
        Sets the minimum number of processes to be used. 
(Default: 1)
-P N, --max-num-processes=N
        Sets the maximum number of processes to be used. 
(Default: All MPI_COMM_WORLD processes)
-T size, --threshold=size
        Sets the threshold for alignment of objects in the HDF5 file. 
(Default: 1)
-w, --write-only
        Performs only write tests, not read tests. 
(Default: Read and write tests)
-x size, --min-xfer-size=size
        Sets the minimum transfer buffer size. 
(Default: Half the number of bytes per process per dataset)

This option and the -X size option (or --max-xfer-size=size) control transfer-buffer-size, the size of the transfer buffer in memory. In 1D geometry, the transfer buffer is a linear array of size transfer-buffer-size. In 2D geometry, the transfer buffer is a rectangular array of size block-size × transfer-buffer-size, or transfer-buffer-size × block-size if the interleaved access pattern is selected.
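
The transfer buffer dimensions described above can be summarized in a short Python sketch. The function name transfer_buffer_shape and its parameters are hypothetical; the sizes in the example are arbitrary values chosen for illustration.

```python
def transfer_buffer_shape(xfer_size, block_size, two_d=False, interleaved=False):
    """Return the in-memory transfer buffer dimensions."""
    if not two_d:
        return (xfer_size,)            # 1D: a linear array
    if interleaved:
        return (xfer_size, block_size)  # 2D with interleaved access
    return (block_size, xfer_size)      # 2D with contiguous access


KB = 1024
print(transfer_buffer_shape(256 * KB, 64 * KB))
print(transfer_buffer_shape(4 * KB, 1 * KB, two_d=True))
print(transfer_buffer_shape(4 * KB, 1 * KB, two_d=True, interleaved=True))
```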

-X size, --max-xfer-size=size
        Sets the maximum transfer buffer size. 
(Default: The number of bytes per process per dataset)

Exit Status:
> 0    An error occurred.

Release              Change
1.6.0                Tool introduced in this release.
1.6.8 and 1.8.0      Option -g, --geometry introduced in this release.

--- Last Modified: November 17, 2017 | 11:28 AM