These tools copy all the data in the file(s) sequentially to new offsets. For a large file, this copy will take a long time. The most efficient way to create a user block is to create the file with a user block (see H5Pset_userblock) and write the user block data into that space from a program.

The user block is completely opaque to the HDF5 library and to the h5jam and h5unjam tools. It is read or written in a single block as a string of bytes; it can contain text or any kind of binary data; and it is up to the user to know what the user block content means and how to process it. When the user block is extracted, its entire contents are written as a single block of output, including any padding or uninitialized data.

These tools move the HDF5 portion of the file through byte copies; i.e., they do not read or interpret the HDF5 objects.

h5jam and h5unjam are not necessarily transitive: Note that h5jam and h5unjam are not necessarily transitive operations; that is, extracting a user block with h5unjam does not necessarily return exactly the data that was inserted with h5jam. Any amount of data can be inserted into a user block, but an HDF5 user block itself has specific size requirements. The minimum size is 512 bytes; beyond that, the user block can be 512 bytes times any positive power of 2. That is, a user block’s size will be one of the following: 512 bytes, 1024 bytes, 2048 bytes, 4096 bytes, et cetera.
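The size rule above can be sketched in a few lines of C. This is only an illustration of the arithmetic, not code taken from h5jam: the helper name `ub_size_for` and the constant `UB_MIN_SIZE` are hypothetical, and the sketch assumes the tool picks the smallest valid size that holds the input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UB_MIN_SIZE ((size_t)512) /* minimum user block size stated in the text */

/* Hypothetical helper (not an HDF5 API): return the smallest valid user
 * block size that can hold payload_len bytes, or 0 if no such size fits
 * in a size_t. */
size_t ub_size_for(size_t payload_len)
{
    size_t size = UB_MIN_SIZE;

    while (size < payload_len) {
        if (size > SIZE_MAX / 2)
            return 0; /* doubling again would overflow */
        size *= 2;
    }

    /* Sanity check: 512 times a power of 2 has exactly one bit set. */
    assert(size >= UB_MIN_SIZE && (size & (size - 1)) == 0);
    return size;
}
```

Under these assumptions, a 700-byte input maps to a 1024-byte user block, and an input of exactly 512 bytes fits in the minimum 512-byte block with no padding.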
If h5jam is used to insert a 700-byte file into the user block, h5jam will create a user block of 1024 bytes and insert the user’s file as the first 700 bytes of that block. The remaining 324 bytes will be undefined. If, for instance, the remaining bytes must have a particular fill value, the user must pad the input file to exactly 1024 bytes with the required fill value before inserting it with h5jam.

When h5unjam is asked to return the above user block, the last 324 bytes will contain the padding if the user defined it, or undefined data if the user took no action to insert padding. If the extracted file must be cleaned up for use, that is the responsibility of the user or the user’s application.

If a community of users employs user block data that must be cleaned up after the use of h5unjam, the community should establish a protocol for that process so that every community member knows what is required. The community may prefer to create and provide a tool to perform standard cleanup. A simple protocol might be for a user community to declare that the first N bytes of the user block will always contain the length or size of the valid user block content, much as a Pascal string starts with the length of the string data.

Also see the HDF5 source code for examples of examining or reading the user block without modifying the file in any way. The relevant source files are the test programs tools/h5jam/tellub.c and tools/h5jam/getub.c.
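As an illustration of the length-prefix protocol suggested above, the following C sketch prepares a padded block before insertion and recovers the valid content after extraction. Everything here is an assumption made for the example: the names `ub_pack` and `ub_unpack` are hypothetical, N is taken to be 8, and the length is stored little-endian. None of it is part of HDF5 or of the h5jam and h5unjam tools.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UB_PREFIX_LEN 8 /* assumed N: an 8-byte little-endian length field */

/* Lay out block[0..block_size) as: length field, payload, fill bytes.
 * Returns 0 on success, -1 if the payload does not fit in the block. */
int ub_pack(unsigned char *block, size_t block_size,
            const unsigned char *payload, size_t payload_len,
            unsigned char fill)
{
    assert(block != NULL);
    if (block_size < UB_PREFIX_LEN || payload_len > block_size - UB_PREFIX_LEN)
        return -1;

    uint64_t n = payload_len;
    for (int i = 0; i < UB_PREFIX_LEN; i++)
        block[i] = (unsigned char)(n >> (8 * i)); /* low byte first */

    if (payload_len > 0)
        memcpy(block + UB_PREFIX_LEN, payload, payload_len);

    /* Give every remaining byte a defined value. */
    memset(block + UB_PREFIX_LEN + payload_len, fill,
           block_size - UB_PREFIX_LEN - payload_len);
    return 0;
}

/* Locate the valid content inside an extracted block. Returns a pointer to
 * the payload and stores its length in *payload_len, or returns NULL if the
 * length field is inconsistent with the block size. */
const unsigned char *ub_unpack(const unsigned char *block, size_t block_size,
                               size_t *payload_len)
{
    assert(block != NULL && payload_len != NULL);
    if (block_size < UB_PREFIX_LEN)
        return NULL;

    uint64_t n = 0;
    for (int i = 0; i < UB_PREFIX_LEN; i++)
        n |= (uint64_t)block[i] << (8 * i);

    if (n > block_size - UB_PREFIX_LEN)
        return NULL;

    *payload_len = (size_t)n;
    return block + UB_PREFIX_LEN;
}
```

A program following this convention would write the packed block to a file whose size is already a valid user block size, insert that file with h5jam, and later pass the output of h5unjam to `ub_unpack` to discard the padding. Note that the length field takes up part of the block, so the usable payload is N bytes smaller than the block size.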