
How do you build HDF5 on BlueGene/L?

Here are a few steps to get you started on building HDF5 on BlueGene/L:

These are the compilers that we have used to build HDF5 on this platform:

C:  mpxlc
C++: mpxlC
F77: mpxlf
F90: mpxlf90

Unless you need them, first try building without C++ and Fortran support.
If these compilers are not available to you, ask your sysadmin which MPI
compilers are installed on the system.


Building:

You should run the ./configure script through yodconfigure like so:

./bin/yodconfigure configure

Then, set the RUNSERIAL environment variable to the command that launches a
single process on a batch node.  For example, its value could be:

mpirun -np 1 -cwd /g/g23/xxxx/hdf5
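A minimal sketch of setting the variable, assuming a POSIX-style shell (sh,
ksh, bash) on the login node and that it is run from the directory where you
are building HDF5.  BUILD_DIR is a hypothetical helper name used only here,
not something HDF5 defines:

```shell
# Assumption: run this from the top of the HDF5 build directory.
# BUILD_DIR is a hypothetical helper variable used only in this sketch.
BUILD_DIR=$(pwd)

# Launch command for one process on a batch node, as in the example above.
RUNSERIAL="mpirun -np 1 -cwd $BUILD_DIR"
export RUNSERIAL

echo "RUNSERIAL=$RUNSERIAL"
```

Since the variable is used as a command prefix, a build path containing
spaces would probably be split into several arguments, so a path without
spaces is the safer choice.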

Note that the -cwd parameter should be set to whatever directory you're
building HDF5 in.  The configure script builds several dozen small test
binaries and executes them to see how they work, in order to deduce the
characteristics of the computing environment.  The RUNSERIAL environment
variable tells the configure script to execute all of these binaries on the
batch nodes rather than on the login nodes.  You ***MUST*** have configure
execute its test binaries on the batch nodes, because these have different
characteristics from the login nodes.  A binary compiled to work on the
login nodes will NOT work on the batch nodes.

Also note that this is slightly different from running the entire configure
script as a batch job.  Although that might work (and would certainly achieve
the same effect), on most systems the batch nodes have a very minimal
computing environment and sometimes do not even have a shell installed (let
alone compilers), so the configure script wouldn't run on them.  If you're
feeling adventurous, you can try just submitting the entire configure script
as a batch job on a single node.  Don't be surprised if that fails, though.

Lastly, use the following line to configure:

./configure --enable-parallel --enable-fortran --disable-cxx --disable-stream-vfd

You may want to disable Fortran if you don't need it.  The stream VFD doesn't
compile on BlueGene/L, so you'll want to turn that off as well.  As you've
probably guessed, forcing the configure script to run each individual binary
as a separate process on the batch node makes the configure script take a
VERY long time to execute (~45 minutes for me).

As one final note, if you're using Fortran, you may need to set the following
environment variable:

export FFLAGS="-qsuffix=f=f90"
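
The build steps above can be pulled together as a dry-run sketch.  It
assumes a POSIX-style shell and that it is run from the top of the HDF5
source tree.  The run helper and the DRY_RUN variable are hypothetical names
introduced only for this sketch, and selecting the compiler through CC is
the usual autoconf convention rather than something stated above.  With
DRY_RUN left at 1, the commands are only printed:

```shell
# DRY_RUN and run() are hypothetical helpers for this sketch only.
# With DRY_RUN=1 (the default here) each step is printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Assumption: the C compiler is chosen through the standard autoconf
# CC variable; mpxlc is the C compiler listed earlier in this page.
CC=mpxlc
export CC

# Step 1: rewrite configure so serial test runs get the RUNSERIAL prefix.
run ./bin/yodconfigure configure

# Step 2: launch command for one process on a batch node.
RUNSERIAL="mpirun -np 1 -cwd $(pwd)"
export RUNSERIAL

# Step 3 (Fortran builds only, and only if needed):
# FFLAGS="-qsuffix=f=f90"; export FFLAGS

# Step 4: configure with the options given above.
run ./configure --enable-parallel --enable-fortran --disable-cxx --disable-stream-vfd
```

Setting DRY_RUN=0 before running the sketch would execute the steps for
real; expect the last one to take a long time, as noted above.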

More Information:
-----------------

yodconfigure simply rewrites part of the configure script so that every
serial execution is prefixed with the contents of the RUNSERIAL variable.
You still have to run configure after using yodconfigure.  So, for example,
instead of executing:

./a.out

after being run through yodconfigure, the configure script will execute:

$RUNSERIAL ./a.out

If RUNSERIAL is empty, this is equivalent to the original configure, BUT if
RUNSERIAL is set to something like mpirun -np 1, the command becomes:

mpirun -np 1 ./a.out

which ensures that the binary is run on a compute node rather than the login 
node.
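
The effect of the prefix can be tried out in any POSIX-style shell without a
BlueGene/L at hand.  In the sketch below, conftest and fake_launcher are
hypothetical stand-ins (shell functions) for a configure test binary and for
mpirun.  It is only an illustration of how an unquoted variable prefix
behaves, not a copy of what yodconfigure writes into configure:

```shell
# Hypothetical stand-in for a configure test binary: prints a line, exits 3.
conftest() { echo "conftest ran"; return 3; }

# Hypothetical stand-in for the real launcher (mpirun on BlueGene/L):
# shows what it was asked to run, runs it, and passes back its exit status.
fake_launcher() { echo "launcher got: $*"; "$@"; }

# Case 1: RUNSERIAL is empty, so the prefix expands to nothing.
RUNSERIAL=""
rc_direct=0
out_direct=$($RUNSERIAL conftest) || rc_direct=$?

# Case 2: RUNSERIAL is set, so the same line goes through the launcher.
RUNSERIAL="fake_launcher"
rc_launched=0
out_launched=$($RUNSERIAL conftest) || rc_launched=$?

echo "direct:   status=$rc_direct"
echo "$out_direct"
echo "launched: status=$rc_launched"
echo "$out_launched"
```

In both cases the exit status of the test program reaches the caller, which
is what configure relies on when it interprets its test results.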

If you find this isn't working, one quick way to test is to write a hello 
world MPI program and try to run it on a compute node like this:

mpirun -np 1 -cwd /g/g23/xxxx ./test.out

If that doesn't work, either the command for launching compute node jobs is
different (which you'd have to look up for your system) or the system does
not allow arbitrary launching of compute node jobs (i.e., everything has to
be scheduled).  In the latter case none of this will work for you, and your
best shot would be to try to run the entire configure script as a batch job.
We have never tried that, though, and don't know if it would work.  The
yodconfigure technique requires the ability to launch jobs on compute nodes
with a single command that returns when the binary has finished executing
(and passes back the test binary's return value).
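
The launcher test described above can be wrapped in a small helper.  This is
a sketch for a POSIX-style shell; check_launcher is a hypothetical name, and
the demonstration at the end uses env, true and false as stand-ins for the
launcher and the test program so that it runs anywhere:

```shell
# check_launcher LAUNCH_CMD PROGRAM   (hypothetical helper for this sketch)
# LAUNCH_CMD: candidate value for RUNSERIAL (word-split on purpose).
# PROGRAM:    a program built for the compute nodes, e.g. an MPI hello world.
check_launcher() {
    cl_launch=$1
    cl_prog=$2
    if cl_out=$($cl_launch "$cl_prog" 2>&1); then
        echo "OK: launcher returned status 0"
        [ -n "$cl_out" ] && echo "output: $cl_out"
        return 0
    else
        cl_rc=$?
        echo "FAILED: launcher returned status $cl_rc"
        [ -n "$cl_out" ] && echo "output: $cl_out"
        return 1
    fi
}

# Demonstration with stand-ins: env as the launcher, true/false as programs.
check_launcher "env" true
check_launcher "env" false || echo "second check failed, as expected"
```

On the real system the first argument would be your candidate launch command
and the second your compiled hello world program.  A check that hangs or
reports a failure suggests the launcher cannot be used as a RUNSERIAL value.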

Cross-compiling systems are a huge pain to deal with.