h5part AT lists.psi.ch

Subject: H5Part development and discussion

List archive

[H5part] H5Part & Performance

From: Andreas Adelmann <andreas.adelmann AT psi.ch>
To: h5part AT lists.psi.ch
Cc: mike folk <mfolk AT ncsa.uiuc.edu>, Elena Pourmal <epourmal AT hdfgroup.org>
Subject: [H5part] H5Part & Performance
Date: Thu, 10 Apr 2008 13:08:14 +0200
List-archive: <https://lists.web.psi.ch/pipermail/h5part/>
List-id: H5Part development and discussion <h5part.lists.psi.ch>

Dear colleagues, please check out the interesting
observations made by John Biddiscombe.

We should discuss this issue! John agreed to talk at our BD-Palaver
sometime early in summer.

Best Andreas

=================== START MAIL OF John Biddiscombe ===================

Further to my earlier message I forgot to write about one additional consideration that came up whilst working at EDF.

The particles are divided into a number of categories.
Boundary Particles - static
Boundary particles which can move (and generally impart a force onto fluid particles)
Fluid particles
Other particles (such as air in multi-material simulations)
And Fluid2, fluid3, fluid4 depending on other categorizations...

The division of types is arbitrary, but EDF, Manchester, Nantes, Plymouth and others that I can't recall right now all store a flag (int) which indicates particle type. For 1E6 particles, at N time steps, this extra flag contributes IO and disk space and serves little purpose - especially as, when reading the data, we have to scan the particles, extract all the type1, type2, type3 etc and create separate datasets from them when we do certain renderings. Much better to store the particles in groups, much like a multi-block dataset.
Using the compound type we can easily create a group or a file representing a time step and write multiple datasets - one for each block.
Within H5Part one could do the same by using extra sub-groups within the timestep group (though it might complicate the reading a bit).
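
A minimal sketch of the point above, in plain Python with in-memory lists standing in for HDF5 datasets. The category constants and the helper name `split_by_type` are illustrative assumptions, not part of H5Part or of any of the codes mentioned. It shows the scan a reader has to perform when only a per-particle flag is stored, and the per-category blocks that could instead be written directly as separate datasets or sub-groups within a time step.

```python
from collections import defaultdict

# Hypothetical category labels; the real codes use an arbitrary int flag.
STATIC_BOUNDARY, MOVING_BOUNDARY, FLUID = 0, 1, 2


def split_by_type(positions, flags):
    """Scan flagged particles once and return {type: [positions]} blocks."""
    blocks = defaultdict(list)
    for pos, flag in zip(positions, flags):
        blocks[flag].append(pos)
    return dict(blocks)


# Flagged layout: one flag per particle, stored again at every time step.
positions = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.5), (0.5, 0.6), (2.0, 0.0)]
flags = [STATIC_BOUNDARY, STATIC_BOUNDARY, FLUID, FLUID, MOVING_BOUNDARY]

# Grouped layout: what a reader would get without any scan if each
# category had been written as its own block in the first place.
blocks = split_by_type(positions, flags)
for ptype in sorted(blocks):
    print(ptype, len(blocks[ptype]))
```

With the grouped layout the flag array disappears from the file and the split is done once by the writer rather than on every read.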

Anyway, since I've written so much already, I thought I'd add this extra observation.

JB

-----------------------------

Jean, Andreas, Achim, please forward to others if necessary. (I have not
posted this to the H5Part list, but please do so if you think it is of
interest to other readers).

In response to a query from Jean, and in time to provoke discussion at
the CSCS User assembly on Friday where I believe you will meet, here's a
brief synopsis of my H5Part related experiences, both recently and
further back.

I have been using H5Part for some time now and have developed a reader
and writer class for vtk/paraview which I use on a daily basis. I have
also developed converters which allow me to convert virtually any ASCII
file (which is common for the SPH community since mostly they are in
their relative infancy and still do tests on small numbers of particles
with ASCII IO) into H5Part. Full details can be found on the pv-meshless
wiki at https://twiki.cscs.ch/twiki/bin/view/ParaViewMeshless in the
section on Data formats.

On the whole I have had no problems with H5Part and find it a convenient
library to use. I really only use the open file and open time step
(group) functions within the library as I have implemented most of the
hyperslab selection stuff myself within vtk reader/writer classes. I
have not yet looked at H5Block - and I do not see myself doing so as I
have managed to store my volume data using hdf5/Xdmf calls and then
using the Xdmf Readers within vtk/paraview to read the data (subject to
some fixes I made and am in the process of extending).

My most recent tests were at EDF in France, where I had access to the
BlueGene machine. They converted their IO to dump data in HDF5 files,
but did not use H5Part; instead they used an H5T_COMPOUND structure to write
all data out in a single call. I modified their IO to use an H5Part form
(for my convenience/compatibility), but we quickly found that
performance was very poor in comparison. After some further testing last
week, I discovered that my stripped down test was setting the collective
IO flag always false, so the expected speed up never happened. Having
fixed this, the speed difference between compound and H5Part dropped to
a factor of about 2. I set a number of tests running on BG using
1,2,3....20 scalar arrays, on
1000,5000,10000,50000,100000,500000,1E6,5E6,1E7,5E7 particles using
collective/independent and running on 32,64,128,256,512,1024,2048
processors - giving a total of 10*7*20*2 combinations of timings (might
have got these numbers wrong as I'm writing from memory, and some of the
larger particle write tests failed and were skipped). The timing results
are sitting on BG at Montpellier and I'm awaiting an IBM engineer to
send them to me as I cannot access the machine from here. I expect the
results to maintain a 2:1 difference (or thereabouts), but I'll compile
a full document when I have the data. If the results prove interesting
enough and worth discussing, then I will try writing a short paper to
submit somewhere with observations about efficient data IO and
strategies etc. I will rerun tests with different cache options and
other configuration tweaks as and when I can.
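
As a quick check of the 10*7*20*2 figure quoted above (which was written from memory), the test matrix can be enumerated as below. This is only a sketch of the bookkeeping: the variable names are made up, and it does not reproduce the actual benchmark driver or account for the larger runs that failed and were skipped.

```python
from itertools import product

array_counts = range(1, 21)  # 1..20 scalar arrays
particle_counts = [1_000, 5_000, 10_000, 50_000, 100_000,
                   500_000, 1_000_000, 5_000_000, 10_000_000, 50_000_000]
io_modes = ["collective", "independent"]
processor_counts = [32, 64, 128, 256, 512, 1024, 2048]

# One entry per (particles, processors, arrays, IO mode) combination.
runs = list(product(particle_counts, processor_counts, array_counts, io_modes))
print(len(runs))  # 2800
```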

Faced with the choice of implementing IO using H5Part or using the
H5Compound type with a factor of 2 speed difference, and based on my
estimate of a few days' work to implement a reader for paraview to
support the new type, we decided to use the new format instead of H5Part
for further IO (writes). This decision can be easily changed by simply
swapping the IO calls in their SPH code to use the H5Part compatible
version should we find that timings do in fact come out in favour of
H5Part. Typically they anticipate using anything from 1000 to 100,000
particles per processor, but using very large numbers of processors -
the tests I performed were designed to exercise this pattern - which I
suspect differs somewhat from the anticipated use cases of PSI and their
colleagues. Now that I have a working reader for the new H5 Compound
data format (which I refer to as H5SPH), I will put together a set of
code snippets that other SPH users can use for their IO.
In fact, an H5Part style interface (OpenFile, SetTimeStep,
SetNumberOfParticles, SetView) is basically the same for H5Part and
H5SPH, and these 4 functions themselves are not much work to implement,
so whilst it's a shame to redo this work for an alternative
format/library, it isn't actually much work.
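
To illustrate the paragraph above, here is a hedged in-memory sketch, not the H5Part or H5SPH API. The class and method names are hypothetical, `set_view` is simplified to a half-open slice of a local array, and byte strings stand in for HDF5 writes. The four front-end calls are shared, and only the write step differs: one write per scalar array in the H5Part-style layout, one write of interleaved records in the compound-style layout.

```python
import struct


class ParticleFile:
    """Hypothetical in-memory stand-in for the shared four-call front end."""

    def __init__(self, name):          # plays the role of OpenFile
        self.name = name
        self.steps = {}                # step index -> list of payloads written
        self.step = None
        self.view = (0, 0)

    def set_time_step(self, step):
        self.step = step
        self.steps.setdefault(step, [])

    def set_number_of_particles(self, count):
        self.view = (0, count)         # default view covers every particle

    def set_view(self, start, stop):
        self.view = (start, stop)      # half-open range (a simplification)

    def _selected(self, values):
        start, stop = self.view
        return values[start:stop]

    def _emit(self, payload):
        self.steps[self.step].append(payload)


class SeparateArrays(ParticleFile):
    """H5Part-style layout: one write per scalar array."""

    def write(self, arrays):
        for name in sorted(arrays):
            values = self._selected(arrays[name])
            self._emit(struct.pack(f"<{len(values)}d", *values))


class CompoundRecords(ParticleFile):
    """Compound-style layout: fields interleaved per particle, one write."""

    def write(self, arrays):
        names = sorted(arrays)
        record = struct.Struct("<" + "d" * len(names))
        columns = [self._selected(arrays[n]) for n in names]
        self._emit(b"".join(record.pack(*row) for row in zip(*columns)))


arrays = {"x": [0.0, 1.0, 2.0], "y": [0.5, 1.5, 2.5], "p": [9.0, 8.0, 7.0]}
outputs = {}
for cls in (SeparateArrays, CompoundRecords):
    f = cls("demo")
    f.set_time_step(0)
    f.set_number_of_particles(3)
    f.write(arrays)
    outputs[cls.__name__] = [len(p) for p in f.steps[0]]
print(outputs)
```

Both layouts carry the same number of bytes; the difference is the number of write calls per time step. The sketch says nothing about which is faster on a real parallel file system.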

As mentioned, I have also been using Xdmf format to store volume blocks
of data. Xdmf is simply an hdf5 file for heavy data, and xml for light
data. This has proven to be very flexible, and I am considering adopting
it as a wrapper around all my hdf5 based data (including H5Part) -
implying that I could probably read H5Part formatted particle data using
the XdmfReader (though I have not actually tried this - it would require
the generation of an xml wrapper for H5Part files, but then the
vtkH5PartReader would not be needed and all maintenance could be shifted
to one place). I'm not sure at the moment if Xdmf is capable of wrapping
Compound data types as used by the new format, so I still have too many
readers and too many formats; I will be looking into this during the
coming weeks.
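
A rough sketch of what generating such an xml wrapper might look like, using only the Python standard library. It is untested against any Xdmf reader. The file name, the dataset names, the `Step#0` group name and the 8-byte float precision are all assumptions for illustration, and the helper `xdmf_for_step` is hypothetical.

```python
import xml.etree.ElementTree as ET


def xdmf_for_step(h5_file, step_group, n_particles, coords, scalars):
    """Build light-data XML pointing at per-array datasets of one time step."""
    xdmf = ET.Element("Xdmf", Version="2.0")
    domain = ET.SubElement(xdmf, "Domain")
    grid = ET.SubElement(domain, "Grid", Name="particles", GridType="Uniform")
    ET.SubElement(grid, "Topology", TopologyType="Polyvertex",
                  NumberOfElements=str(n_particles))
    geometry = ET.SubElement(grid, "Geometry", GeometryType="X_Y_Z")

    def data_item(parent, dataset):
        # Heavy data stays in the HDF5 file; the XML only references it.
        item = ET.SubElement(parent, "DataItem", Format="HDF",
                             Dimensions=str(n_particles),
                             NumberType="Float", Precision="8")
        item.text = f"{h5_file}:/{step_group}/{dataset}"

    for name in coords:
        data_item(geometry, name)
    for name in scalars:
        attr = ET.SubElement(grid, "Attribute", Name=name,
                             AttributeType="Scalar", Center="Node")
        data_item(attr, name)
    return ET.tostring(xdmf, encoding="unicode")


xml_text = xdmf_for_step("particles.h5part", "Step#0", 1000,
                         coords=("x", "y", "z"), scalars=("pressure",))
print(xml_text)
```

Whether a compound dataset can be referenced in the same way is exactly the open question raised above; this sketch only covers the one-dataset-per-array case.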
One reason for liking Xdmf is that within Xdmf it is simple to store N
timesteps of data in one hdf5 file, the next N in another, etc etc. One
issue I've had with H5Part is that file sizes keep growing, and limiting
individual files to around 50GB is convenient for us. So (say) 50 time
steps in one file, 50 in the next, etc etc is useful. It would of course
be quite simple to add this functionality to the H5Part libraries (and
perhaps is already in there), but since I am already a heavy
vtk/paraview and now xdmf user with commit privileges to these
repositories, it makes my life easier to focus now on Xdmf. I therefore
see myself moving towards Xdmf in the longer term as it allows a greater
variety of storage forms and flexibility. I will not stop using H5Part
for existing data and will continue to follow the developments of the
extra features that keep going in...but for now, it already does all
that I need and needs no real improvement. For our partners who are now
switching to much bigger simulations, I keenly await the timing results
from BG and the opportunity to run more tests which should also include
timings for reading data back into visualization or other post
processing software, which will usually occur on fewer processors.
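
The "N time steps per file" layout described earlier in this paragraph comes down to a small index calculation. The sketch below assumes 50 steps per file, as in the example given, and uses a made-up file naming pattern; `locate_step` is a hypothetical helper, not an H5Part or Xdmf function.

```python
def locate_step(step, steps_per_file=50, pattern="run_{index:04d}.h5"):
    """Map a global time step to (file name, step index within that file)."""
    if step < 0:
        raise ValueError("step must be non-negative")
    file_index, local_step = divmod(step, steps_per_file)
    return pattern.format(index=file_index), local_step


for step in (0, 49, 50, 123):
    print(step, locate_step(step))
```

A reader would call this before opening a file, so that no single file has to hold the whole run.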

-------------

To summarize : H5part works very well for all the data I already have
and any new stuff that comes my way. H5SPH may be used by future big
data generators and I will gradually shift to using this myself if I get
more H5SPH data than H5Part. For other hdf data types, I will focus on
Xdmf so that as much code as possible can be unified into a single
package with hopefully less maintenance.

Disclaimer : I have not explored some of the recent 'features' or
developments for extracting subsets of data in H5Part, however, I will
look into this as and when users have requests that could benefit from
them.

JB

--
John Biddiscombe, email:biddisco @ cscs.ch
http://www.cscs.ch/about/BJohn.php
CSCS, Swiss National Supercomputing Centre | Tel: +41 (91) 610.82.07
Via Cantonale, 6928 Manno, Switzerland | Fax: +41 (91) 610.82.82

=================== END MAIL OF John Biddiscombe ===================

-- 
Dr. sc. math. Andreas (Andy) Adelmann
Staff Scientist
Paul Scherrer Institut WLGB/132 CH-5232 Villigen PSI
Phone Office: xx41 56 310 42 33 Fax: xx41 56 310 31 91
Phone Home: xx41 62 891 91 44
-------------------------------------------------------
Wednesday: ETH CAB F 10.1  xx41 44 632 82 76
=======================================================
"The more exotic, the more abstract the knowledge,
the more profound will be its consequences."
Leon Lederman 
=======================================================
