What is a Datatype?

A datatype is a collection of datatype properties that together provide the complete information needed to convert data to or from that datatype.

Datatypes in HDF5 can be grouped as follows:

  • Pre-Defined Datatypes:   These are datatypes that are created by HDF5. They are actually opened (and closed) by HDF5, and can have a different value from one HDF5 session to the next.
  • Derived Datatypes:   These are datatypes that are created or derived from the pre-defined datatypes. Although created from pre-defined types, they represent a category unto themselves. An example of a commonly used derived datatype is a string of more than one character.

Pre-Defined Datatypes

The properties of pre-defined datatypes are:

  • Pre-defined datatypes are opened and closed by HDF5.
  • A pre-defined datatype is a handle and is NOT PERSISTENT. Its value can be different from one HDF5 session to the next.
  • Pre-defined datatypes are Read-Only.
  • As mentioned, other datatypes can be derived from pre-defined datatypes.

There are two types of pre-defined datatypes, standard (file) and native:

  • STANDARD

    A standard (or file) datatype can be:

    • Atomic: A datatype which cannot be decomposed into smaller datatype units at the API level.
      The atomic datatypes are:   integer, float, string, date and time, bitfield, reference, opaque

       

    • Composite: An aggregation of one or more datatypes.
      Composite datatypes include:   array, variable length, enumeration, compound datatypes

      Array, variable length, and enumeration datatypes are defined in terms of a single atomic datatype, whereas a compound datatype is a datatype composed of a sequence of datatypes.

    Notes:
    •  Standard pre-defined datatypes are the SAME on all platforms.
    •  They are the datatypes that you see in an HDF5 file.
    •  They are typically used when creating a dataset.
  • NATIVE

    Native pre-defined datatypes are used for memory operations, such as reading and writing. They are NOT THE SAME on different platforms. They are similar to C type names, and are aliased to the appropriate HDF5 standard pre-defined datatype for a given platform.

    For example, when on an Intel based PC, H5T_NATIVE_INT is aliased to the standard pre-defined type, H5T_STD_I32LE. On a MIPS machine, it is aliased to H5T_STD_I32BE.

    Notes:
    •  Native datatypes are NOT THE SAME on all platforms.
    •  Native datatypes simplify memory operations (read/write). The HDF5 library automatically converts as needed.
    •  Native datatypes are NOT in an HDF5 File. The standard pre-defined datatype that a native datatype corresponds to is what you will see in the file. 

The following table shows the native types and the standard pre-defined datatypes they correspond to. (Keep in mind that HDF5 can convert between datatypes, so you can specify a buffer of a larger type for a dataset of a given type. For example, you can read a dataset that has a short datatype into a long integer buffer.)

Fig. 1   Some HDF5 pre-defined native datatypes and corresponding standard (file) type

  C Type              HDF5 Memory Type    HDF5 File Type *
  Integer:
  int                 H5T_NATIVE_INT      H5T_STD_I32BE or H5T_STD_I32LE
  short               H5T_NATIVE_SHORT    H5T_STD_I16BE or H5T_STD_I16LE
  long                H5T_NATIVE_LONG     H5T_STD_I32BE, H5T_STD_I32LE, H5T_STD_I64BE or H5T_STD_I64LE
  long long           H5T_NATIVE_LLONG    H5T_STD_I64BE or H5T_STD_I64LE
  unsigned int        H5T_NATIVE_UINT     H5T_STD_U32BE or H5T_STD_U32LE
  unsigned short      H5T_NATIVE_USHORT   H5T_STD_U16BE or H5T_STD_U16LE
  unsigned long       H5T_NATIVE_ULONG    H5T_STD_U32BE, H5T_STD_U32LE, H5T_STD_U64BE or H5T_STD_U64LE
  unsigned long long  H5T_NATIVE_ULLONG   H5T_STD_U64BE or H5T_STD_U64LE
  Float:
  float               H5T_NATIVE_FLOAT    H5T_IEEE_F32BE or H5T_IEEE_F32LE
  double              H5T_NATIVE_DOUBLE   H5T_IEEE_F64BE or H5T_IEEE_F64LE

  F90 Type            HDF5 Memory Type    HDF5 File Type *
  integer             H5T_NATIVE_INTEGER  H5T_STD_I32(8,16)BE or H5T_STD_I32(8,16)LE
  real                H5T_NATIVE_REAL     H5T_IEEE_F32BE or H5T_IEEE_F32LE
  double precision    H5T_NATIVE_DOUBLE   H5T_IEEE_F64BE or H5T_IEEE_F64LE

* Note that the HDF5 File Types listed are those that are most commonly created.
  The file type created depends on the compiler switches and platforms being
  used. For example, on the Cray an integer is 64-bit, and using H5T_NATIVE_INT (C)
  or H5T_NATIVE_INTEGER (F90) would result in an H5T_STD_I64BE file type.

The following code is an example of when you would use standard pre-defined datatypes vs. native types:

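   #include "hdf5.h"

   int main(void) {

      hid_t       file_id, dataset_id, dataspace_id;
      herr_t      status;
      hsize_t     dims[2] = {4, 6};
      int         i, j, dset_data[4][6];

      /* Fill the memory buffer with native ints. */
      for (i = 0; i < 4; i++)
         for (j = 0; j < 6; j++)
            dset_data[i][j] = i * 6 + j + 1;

      file_id = H5Fcreate ("dtypes.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

      dataspace_id = H5Screate_simple (2, dims, NULL);

      /* Standard (file) type: the dataset is stored as 32-bit big-endian
         integers. The three property list arguments assume the HDF5 1.8
         or later form of H5Dcreate (H5Dcreate2). */
      dataset_id = H5Dcreate (file_id, "/dset", H5T_STD_I32BE, dataspace_id,
                              H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

      /* Native type: describes dset_data as it is laid out in memory. */
      status = H5Dwrite (dataset_id, H5T_NATIVE_INT, H5S_ALL, H5S_ALL,
                         H5P_DEFAULT, dset_data);

      status = H5Dclose (dataset_id);
      status = H5Sclose (dataspace_id);
      status = H5Fclose (file_id);

      return (status < 0);
   }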

By using the native types when reading and writing, the code that reads from or writes to a dataset can be the same for different platforms.

Can native types also be used when creating a dataset? Yes. However, be aware that the resulting datatype in the file will be one of the standard pre-defined types and may differ from what you expect.

What happens if you do not use the correct native datatype for a standard (file) datatype? The library converts according to the datatypes you name, so your data may be incorrect or not what you expect.

Derived Datatypes

ANY pre-defined datatype can be used to derive user-defined datatypes.

To create a datatype derived from a pre-defined type:

  • Make a copy of the pre-defined datatype:
        tid = H5Tcopy (H5T_STD_I32BE);
    
  • Change the datatype.

    There are numerous datatype functions that allow a user to alter a pre-defined datatype. Refer to the Datatype Interface in the HDF5 Reference Manual. Example functions are H5Tset_size and H5Tset_precision.

Character Strings:

A simple example of creating a derived datatype is using the string datatype, H5T_C_S1 (H5T_FORTRAN_S1 in Fortran), to create strings of more than one character:

hid_t strtype;                     /* Datatype ID */
herr_t status;

strtype = H5Tcopy (H5T_C_S1);
status = H5Tset_size (strtype, 5); /* create string of length 5 */

The ability to derive datatypes from pre-defined types allows users to create any number of datatypes, from simple to very complex.

--- Last Modified: December 06, 2017 | 08:05 AM