About the Project
For NOAA and NASA, the central data challenge of the Joint Polar Satellite System (JPSS) is how best to handle the large-volume data stream from the five instruments on each satellite. JPSS is a new generation of low-Earth-orbiting satellites that monitor environmental conditions and provide data for long-range weather and climate forecasts.
How might data be extracted in a timely manner from such a large dataset? A data granule holds the data from a short period of observation by an instrument. Data is stored by granule in HDF5 files. With a custom tool built by The HDF Group, data granules can be aggregated and extracted. This means that only the data for a chosen period of time and a limited geographic area need be retrieved for study, which is much more efficient than opening an entire file to see any amount of data. Since the archived data is not changed, data granules can be extracted repeatedly, and the data files themselves only need to be downloaded once.
The HDF Group
The software that The HDF Group has developed for the JPSS project is described below.
The HDF Group developers created and currently support the following tools for the JPSS project:
This prototype tool allows users to access their data files using different parameters such as chunk sizes, compression methods, access patterns, and chunk cache settings. The tool provides performance statistics to help users find the optimum parameters for creating and accessing their HDF5 files.
JPSS data is distributed in HDF5 files containing raw data and indexing metadata that allows fast access to the raw data. The HDF Group continues to develop software libraries and tools to improve access to this data. As part of this effort, The HDF Group has created a library of C and Fortran routines to access and manipulate data referenced by object and region references and to access and manipulate data packed into integer values. We continue to seek feedback from JPSS application developers and users, as well as from the wider HDF5 community, and will improve this library as requested.
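To make "data packed into integer values" concrete, here is a minimal Python sketch of pulling individual bit fields out of a packed integer. It is plain arithmetic and does not call the library described above. The function names `unpack_bitfield` and `unpack_flags` and the field layout are hypothetical, chosen only for illustration; they are not an actual JPSS flag layout.

```python
def unpack_bitfield(value, offset, length):
    """Return the `length`-bit field that starts `offset` bits above the least significant bit."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    mask = (1 << length) - 1          # e.g. length=3 -> 0b111
    return (value >> offset) & mask   # shift the field down, then mask it off


# Hypothetical layout of one packed quality byte (an assumption, not a JPSS layout):
#   bits 0-1: overall quality, bit 2: day/night flag, bits 3-5: surface type
LAYOUT = {"quality": (0, 2), "day_night": (2, 1), "surface": (3, 3)}


def unpack_flags(value):
    """Split one packed integer into named fields according to LAYOUT."""
    return {name: unpack_bitfield(value, off, n) for name, (off, n) in LAYOUT.items()}


flags = unpack_flags(0b00101110)
print(flags)
```

Storing several small flags in one integer keeps the files compact; the cost is that every reader needs the layout to interpret the values, which is the kind of work an access library can take over.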
The HDF Group is maintaining HDF5 software on the following systems used by JPSS:
- Linux 32 and 64-bit
- AIX 5.3 and 6.1
- Windows 32 and 64-bit
- Mac Intel OS X 10.5 and later
The latest versions of documentation for software developed for the JPSS project are available for download. See the list below.
High-level library for handling HDF5 object and region references
- Reference Manual (pdf)
- User's Guide (pdf)
- Definition of the h5edit Command Language (pdf)
- H5edit BackupVFD Atomicity Performance Study (pdf) (docx)
- Previous Study: H5edit Atomicity Performance Study
The latest versions of the software developed for the JPSS project are available for download. See the list below.
High-level Library for handling HDF5 object and region references Release 1.1.5, July 20, 2016
The library contains C and Fortran APIs to:
|Linux 2.6 CentOS 6 x86_64||gcc, gfortran 4.4.7 (w/Fortran)|
|Windows (64-bit)||CMake VS 2013 C, gfortran|
|Windows (64-bit)||CMake VS 2015 C, gfortran|
nagg Release 1.6.2, July 22, 2016
nagg is a tool for aggregating JPSS data granules from existing files into new files with a different number of granules per file or different combinations of compatible products than in the original files. The tool was created to provide individual users the ability to rearrange NPP product data granules from downloaded files into new files with aggregations or packaging that are better suited as input for a particular application.
(For earlier versions, see: All Releases)
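The regrouping that nagg performs can be pictured with a small Python sketch. It does not use nagg or HDF5: granules are modeled as dictionaries, and the function `reaggregate`, the granule ids, and the integer `start` times are all hypothetical stand-ins for illustration.

```python
def reaggregate(files, granules_per_file):
    """Regroup granules from the input files into new groups of a fixed size.

    files: list of files, each a list of granule records with a "start" time.
    Returns a list of groups; the last group may hold fewer granules.
    """
    if granules_per_file < 1:
        raise ValueError("granules_per_file must be at least 1")
    # Flatten every input file into one sequence ordered by observation start time.
    granules = sorted((g for f in files for g in f), key=lambda g: g["start"])
    # Cut the ordered sequence into consecutive groups of the requested size.
    return [
        granules[i:i + granules_per_file]
        for i in range(0, len(granules), granules_per_file)
    ]


# Hypothetical input: three downloaded files holding four granules each.
downloaded = [
    [{"id": f"g{4 * f + k}", "start": 4 * f + k} for k in range(4)]
    for f in range(3)
]

repacked = reaggregate(downloaded, 6)
print([len(group) for group in repacked])  # prints [6, 6]
```

The same twelve granules that arrived four per file now sit six per group, which mirrors the idea of repackaging downloaded files to suit a particular application's input.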
h5edit Release 1.3.1, November 17, 2014
The h5edit tool is a command-line tool to edit HDF5 files. The current version is limited to operations on attributes only. It supports:
See the Release Notes for complete details on this release.
h5augjpss Release 1.0.0, August 15, 2011
The h5augjpss tool is designed to modify a JPSS HDF5 product file to be accessible by the netCDF-4 version 4.1.3 library. The tool:
See the README.txt file in the source code for information on building and running h5augjpss.
Chunking and Compression Performance Tool prototype, July 17, 2016
The Chunking and Compression Performance (CCP) tool is a prototype designed to help assess the effect of using various file parameters and access patterns on performance and storage.
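One reason chunk shape matters is that HDF5 reads a chunked dataset in whole chunks, so a selection that cuts across many chunks touches far more storage than one that lines up with a few. The Python sketch below is plain arithmetic and is not part of the tool; the function `chunks_touched` and the dataset dimensions are hypothetical. It counts how many chunks a rectangular selection intersects for two candidate chunk shapes.

```python
from math import prod


def chunks_touched(start, count, chunk_shape):
    """Count the chunks that a rectangular selection intersects.

    start, count: per-dimension offset and extent of the selection, in elements.
    chunk_shape: per-dimension chunk size, in elements.
    """
    if not (len(start) == len(count) == len(chunk_shape)):
        raise ValueError("start, count, and chunk_shape must have the same rank")
    per_dim = []
    for s, c, ch in zip(start, count, chunk_shape):
        if s < 0 or c < 1 or ch < 1:
            raise ValueError("start must be >= 0; count and chunk size must be >= 1")
        first = s // ch              # index of the chunk holding the first element
        last = (s + c - 1) // ch     # index of the chunk holding the last element
        per_dim.append(last - first + 1)
    return prod(per_dim)


# Hypothetical access pattern: read one full row of 3200 elements at row 100.
row_start, row_count = (100, 0), (1, 3200)

print(chunks_touched(row_start, row_count, (1, 3200)))  # row-shaped chunks: prints 1
print(chunks_touched(row_start, row_count, (64, 64)))   # square chunks: prints 50
```

With square 64 x 64 chunks, the row read pulls in 50 chunks of 4096 elements each to deliver 3200 elements, while row-shaped chunks serve it from a single chunk. A different access pattern, such as reading a column, would reverse the ranking, which is why measuring against real access patterns is useful.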