h5part - Re: [H5part] [Fwd: Re: [Fwd: Re: H5Part]]

h5part AT lists.psi.ch

Subject: H5Part development and discussion

Re: [H5part] [Fwd: Re: [Fwd: Re: H5Part]]


  • From: John Biddiscombe <biddisco AT cscs.ch>
  • To: Andreas Adelmann <andreas.adelmann AT psi.ch>
  • Cc: h5part AT lists.psi.ch
  • Subject: Re: [H5part] [Fwd: Re: [Fwd: Re: H5Part]]
  • Date: Fri, 11 Apr 2008 20:52:33 +0200
  • List-archive: <https://lists.web.psi.ch/pipermail/h5part/>
  • List-id: H5Part development and discussion <h5part.lists.psi.ch>

Some replies for further discussion

If you store all of your fields in the COMPOUND structure, then you are forced to read them all back in when analyzing the data, even if you only want one of the fields.
No. At least not in HDF5 1.6 onwards. You can create a compound structure with 20 fields in it, but only read back N of them. It might be the case that internally HDF5 is actually reading the lot, but the user doesn't see it; if that is the case then I agree, it's terrible. So far, reading the data back has not been a problem because
a) I haven't got a lot of the compound data yet
b) our post processing is not so critical when loading into paraview or other tool.
c) I usually read it all back anyway, so I wouldn't notice if I was wasting memory. (Note to self: further tests required.)

However, I would like to add that I may decide not to support this new compound type if further tests show it to be memory-wasteful and slow when we use it a lot. I'm not happy if the simulation guys can save 3 seconds per IO write but it costs us 5 per read, or worse!
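For reference, the partial read I'm relying on is the standard HDF5 trick of handing H5Dread a memory compound type that contains only the members you want. A minimal sketch (1.6-era API, dataset path and field names invented, error checking omitted):

    #include <hdf5.h>

    /* Read only the "x" member of a compound dataset whose file type
       also contains y, z, vx, vy, vz, ... */
    herr_t read_x_only(hid_t file_id, double *x)
    {
        hid_t dset = H5Dopen(file_id, "/Step#0/particles");

        /* memory type with a single member: the library converts just
           this field into the user buffer */
        hid_t mtype = H5Tcreate(H5T_COMPOUND, sizeof(double));
        H5Tinsert(mtype, "x", 0, H5T_NATIVE_DOUBLE);

        herr_t status = H5Dread(dset, mtype, H5S_ALL, H5S_ALL, H5P_DEFAULT, x);

        H5Tclose(mtype);
        H5Dclose(dset);
        return status;
    }

Whether the library still touches the other members on disk is exactly the open question above.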

Secondly, we chose not to use the COMPOUND structure because we wanted to develop a file format that is useful across more than one simulation code.  COMPOUND implies a particular ordering of data fields in memory that can be very specific to one code, and would make it very difficult for codes that have different numbers of fields or different orderings of fields in memory.
In my reader, I simply open the compound type, query the fields, and stuff them into the GUI. When loading H5Part, I query the datasets and stuff them into the GUI. The actual difference is pretty small really. When reading the arrays the code is different, but not so much so (notwithstanding your earlier observation of memory wastage etc.).
Consider the type-management headaches if every code has to define its own H5T_COMPOUND structure to treat its own native set of fields and data structures. It sounded unmanageable, so we chose to lose some performance for the sake of portability and long-term data provenance.
It isn't that hard. Every user can create a compound type and the code will just list the fields in the same way as datasets. No problem.
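To illustrate, the field discovery in my reader amounts to something like the following (a sketch from memory; 1.6-era API, error checking omitted, dataset name invented):

    #include <hdf5.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* List the member names of a compound dataset, exactly as we would
       list the datasets inside an H5Part step group. */
    void list_compound_fields(hid_t file_id, const char *dset_name)
    {
        hid_t dset  = H5Dopen(file_id, dset_name);
        hid_t ftype = H5Dget_type(dset);

        int n = H5Tget_nmembers(ftype);
        for (int i = 0; i < n; i++) {
            char *name = H5Tget_member_name(ftype, i);
            printf("field %d : %s\n", i, name);   /* -> populate the GUI */
            free(name);
        }

        H5Tclose(ftype);
        H5Dclose(dset);
    }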
I would like to understand the motivation for storing metadata in an external XML format (e.g. Xdmf, which stores heavy data in one file and light data in another???). This doesn't make as much sense to me.
Because I've now got 3 different HDF5-based readers, for different applications. Jean has 3 or 4 more. We have to maintain 7 readers for a file type that is supposed to be standard. (Not all our users are creating particle data.)
But they do all create big datasets - they just group them in different combinations and subgroups.
What Xdmf allows me to do is simply list the heavy data in an XML structure and have a single reader read all the different flavours of HDF5 (OK, not really, but you get the idea).
If a user comes along and says "I've got arrays stored like this, with vectors like that, and others like this", we should be able to create a template XML description and use the same reader to read it. The logic of what is where is in the XML decoding, and is done once. Instead of 7 HDF5 readers, each with logic about which data structures are stored relative to others, we can do it in one reader.
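As a rough illustration (Xdmf syntax from memory, file and array names invented), the light-data description for one particle dump is little more than:

    <Xdmf Version="2.0">
      <Domain>
        <Grid Name="particles" GridType="Uniform">
          <Topology TopologyType="Polyvertex" NumberOfElements="100000"/>
          <Geometry GeometryType="X_Y_Z">
            <DataItem Format="HDF" NumberType="Float" Precision="8" Dimensions="100000">
              dump_0000.h5:/Step#0/x
            </DataItem>
            <DataItem Format="HDF" NumberType="Float" Precision="8" Dimensions="100000">
              dump_0000.h5:/Step#0/y
            </DataItem>
            <DataItem Format="HDF" NumberType="Float" Precision="8" Dimensions="100000">
              dump_0000.h5:/Step#0/z
            </DataItem>
          </Geometry>
          <Attribute Name="density" Center="Node">
            <DataItem Format="HDF" NumberType="Float" Precision="8" Dimensions="100000">
              dump_0000.h5:/Step#0/density
            </DataItem>
          </Attribute>
        </Grid>
      </Domain>
    </Xdmf>

Change the paths and dimensions in the template and the same reader copes with a different code's layout.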

In truth this is not happening - we still have many readers... but I'm hoping that the Xdmf approach will save me time in the long run. I'm still not completely sold on it, but I hope I've described why I'm looking into it.

1) You are replacing a compact, hierarchical object database representation of the metadata with what amounts to an ASCII flat file. This seems to be backward.

Only the locations and structure of the file. The real data is still the heavy stuff. Instead of coding the geometry extraction in the reader, you code it in the XML and then let the reader decode it - but with the flexibility to handle many different types.

2)  It is much better for data provenance issues to have a single definitive source for your metadata. There is the danger that the metadata and data files can get mis-matched or misplaced. You can always generate an XML dump of the HDF metadata using h5dump -xml.
True. So far I'm getting one XML file per dataset and one HDF file per timestep, so it isn't yet a problem.

3) ASCII text in XML is an extraordinarily inaccurate method for representing floating-point data, and is very slow to parse.
The overhead of reading the geometry description hasn't yet caused trouble. Today I made an animation of 1139 time steps of volumetric data, from 1 XML file and 1139 HDF files. The parsing of the XML is not noticeable and actually saves some time, because all the 'information' about spacing, origin and dimensions is in the text file, and no H5 calls were made until the heavy data was actually needed.
I think using the XML description is actually faster than opening the HDF file and finding array dimensions etc.


There are a number of archival systems, such as SRM, that benefit from having a replicated copy of the metadata to facilitate searching of data where the bulk of the data is on tape.  However, it is bad practice to allow the possibility of inconsistency in the representation of the metadata.
OK, I can't say much about this. It is true that a mismatch would be a problem.

So far, my experimentation with Xdmf has been
a) very painful - many troubles fixing bugs in vtkXdmfReader etc. and adding time support
b) still in its infancy - I may yet decide it's more effort than it's worth
c) but... showing some promise. Having just one reader instead of N is looking attractive to me, even if the one reader is N times more complicated!

JB.




-john

On Apr 2, 2008, at 8:47 AM, Andreas Adelmann wrote:
FYI, Andreas

-- 
Dr. sc. math. Andreas (Andy) Adelmann
Staff Scientist
Paul Scherrer Institut WLGB/132 CH-5232 Villigen PSI
Phone Office: xx41 56 310 42 33 Fax: xx41 56 310 31 91
Phone Home: xx41 62 891 91 44
-------------------------------------------------------
Wednesday: ETH CAB F 10.1  xx41 44 632 82 76
=======================================================
"The more exotic, the more abstract the knowledge, the more profound will be its consequences."
- Leon Lederman
=======================================================


From: John Biddiscombe <biddisco AT cscs.ch>
Date: April 2, 2008 7:43:14 AM PDT
To: Jean Favre <jfavre AT cscs.ch>
Cc: Achim Gsell <achim.gsell AT psi.ch>, Andreas Adelmann <andreas.adelmann AT psi.ch>
Subject: Re: H5Part


Jean, Andreas, Achim, please forward to others if necessary. (I have not posted this to the H5Part list, but please do so if you think it is of interest to other readers).

In response to a query from Jean, and in time to provoke discussion at the CSCS User assembly on Friday where I believe you will meet, here's a brief synopsis of my H5Part related experiences, both recently and further back.

I have been using H5Part for some time now and have developed a reader and writer class for vtk/paraview which I use on a daily basis. I have also developed converters which allow me to convert virtually any ASCII file (which is common for the SPH community, since they are mostly in their relative infancy and still do tests on small numbers of particles with ASCII IO) into H5Part. Full details can be found on the pv-meshless wiki at https://twiki.cscs.ch/twiki/bin/view/ParaViewMeshless in the section on data formats.

On the whole I have had no problems with H5Part and find it a convenient library to use. I really only use the open-file and open-time-step (group) functions within the library, as I have implemented most of the hyperslab selection stuff myself within the vtk reader/writer classes. I have not yet looked at H5Block - and I do not see myself doing so, as I have managed to store my volume data using hdf5/Xdmf calls and then use the Xdmf readers within vtk/paraview to read the data (subject to some fixes I made and am in the process of extending).
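For what it's worth, the hyperslab selection inside the vtk classes boils down to the standard pattern (a sketch only, 1.6-era API, error checking omitted):

    #include <hdf5.h>

    /* Read a contiguous range [start, start+count) of particles from a
       1D dataset such as /Step#0/x, as each reader pipeline piece does. */
    void read_particle_range(hid_t file_id, const char *name,
                             hsize_t start, hsize_t count, double *buf)
    {
        hid_t dset      = H5Dopen(file_id, name);
        hid_t filespace = H5Dget_space(dset);

        /* slab of particles this piece is responsible for */
        H5Sselect_hyperslab(filespace, H5S_SELECT_SET, &start, NULL, &count, NULL);

        /* matching memory dataspace of the same size */
        hid_t memspace = H5Screate_simple(1, &count, NULL);

        H5Dread(dset, H5T_NATIVE_DOUBLE, memspace, filespace, H5P_DEFAULT, buf);

        H5Sclose(memspace);
        H5Sclose(filespace);
        H5Dclose(dset);
    }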

My most recent tests were at EDF in France, where I had access to the BlueGene machine. They converted their IO to dump data in HDF5 files, but did not use H5Part; instead they used an H5T_COMPOUND structure to write all data out in a single call. I modified their IO to use an H5Part form (for my convenience/compatibility), but we quickly found that performance was very poor in comparison. After some further testing last week, I discovered that my stripped-down test was setting the collective IO flag always false, so the expected speed-up never happened. Having fixed this, the speed difference between compound and H5Part dropped to a factor of about 2.

I set a number of tests running on BG using 1,2,3,...,20 scalar arrays, on 1000, 5000, 10000, 50000, 100000, 500000, 1E6, 5E6, 1E7, 5E7 particles, using collective/independent IO and running on 32, 64, 128, 256, 512, 1024, 2048 processors - giving a total of 10*7*20*2 combinations of timings (I might have got these numbers wrong as I'm writing from memory, and some of the larger particle write tests failed and were skipped). The timing results are sitting on the BG at Montpellier and I'm awaiting an IBM engineer to send them to me as I cannot access the machine from here. I expect the results to maintain a 2:1 difference (or thereabouts), but I'll compile a full document when I have the data. If the results prove interesting enough and worth discussing, then I will try writing a short paper to submit somewhere, with observations about efficient data IO and strategies etc. I will rerun tests with different cache options and other configuration tweaks as and when I can.
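The flag in question is just the MPI-IO transfer property passed to the write; shown here from memory (in my stripped-down test the collective branch was effectively never taken):

    #include <hdf5.h>

    /* Dataset transfer property list: independent vs collective MPI-IO. */
    hid_t make_xfer_plist(int use_collective)
    {
        hid_t plist = H5Pcreate(H5P_DATASET_XFER);
        H5Pset_dxpl_mpio(plist, use_collective ? H5FD_MPIO_COLLECTIVE
                                               : H5FD_MPIO_INDEPENDENT);
        return plist;   /* pass to H5Dwrite / H5Dread, H5Pclose afterwards */
    }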

Faced with the choice of implementing IO using H5Part or using the H5 compound type with a factor-of-2 speed difference, and based on my estimate of a few days' work to implement a reader for paraview to support the new type, we decided to use the new format instead of H5Part for further IO (writes). This decision can easily be changed by simply swapping the IO calls in their SPH code to use the H5Part-compatible version, should we find that timings do in fact come out in favour of H5Part. Typically they anticipate using anything from 1000 to 100,000 particles per processor, but on very large numbers of processors - the tests I performed were designed to exercise this pattern, which I suspect differs somewhat from the anticipated use cases of PSI and their colleagues. Now that I have a working reader for the new H5 compound data format (which I refer to as H5SPH), I will put together a set of code snippets that other SPH users can use for their IO.
In fact, the H5Part-style interface of OpenFile, SetTimeStep, SetNumberOfParticles and SetView is basically the same for H5Part and H5SPH, but these 4 functions themselves are not much work to implement, so whilst it's a shame to redo this work for an alternative format/library, it isn't actually much work.
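For completeness, the calls on the H5Part side of the write look roughly like this (from memory; file and array names are placeholders):

    #include <mpi.h>
    #include <H5Part.h>

    /* Minimal H5Part-style write of one time step from each MPI rank. */
    void write_step(MPI_Comm comm, int step, long long nlocal,
                    double *x, double *y, double *z)
    {
        H5PartFile *file = H5PartOpenFileParallel("particles.h5part",
                                                  H5PART_WRITE, comm);
        H5PartSetStep(file, step);
        H5PartSetNumParticles(file, nlocal);   /* implicitly sets the per-rank view */

        H5PartWriteDataFloat64(file, "x", x);
        H5PartWriteDataFloat64(file, "y", y);
        H5PartWriteDataFloat64(file, "z", z);

        H5PartCloseFile(file);
    }

In the H5SPH variant the three write calls collapse into a single compound write, but the surrounding open/step/particle-count logic stays much the same.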

As mentioned, I have also been using the Xdmf format to store volume blocks of data. Xdmf is simply an hdf5 file for heavy data plus XML for light data. This has proven to be very flexible, and I am considering adopting it as a wrapper around all my hdf5-based data (including H5Part) - implying that I could probably read H5Part-formatted particle data using the XdmfReader (though I have not actually tried this - it would require the generation of an XML wrapper for H5Part files, but then the vtkH5PartReader would not be needed and all maintenance could be shifted to one place). I'm not sure at the moment whether Xdmf is capable of wrapping compound data types as used by the new format, so I still have too many readers and too many formats; I will be looking into this during the coming weeks.
One reason for liking Xdmf is that within Xdmf it is simple to store N timesteps of data in one hdf5 file, the next N in another, and so on. One issue I've had with H5Part is that file sizes keep growing, and limiting individual files to around 50GB is convenient for us. So (say) 50 time steps in one file, 50 in the next, etc. is useful. It would of course be quite simple to add this functionality to the H5Part libraries (and perhaps it is already in there), but since I am already a heavy vtk/paraview and now Xdmf user, with commit privileges to these repositories, it makes my life easier to focus now on Xdmf. I therefore see myself moving towards Xdmf in the longer term, as it allows a greater variety of storage forms and flexibility. I will not stop using H5Part for existing data and will continue to follow the developments of the extra features that keep going in... but for now, it already does all that I need and needs no real improvement. For our partners who are now switching to much bigger simulations, I keenly await the timing results from BG and the opportunity to run more tests, which should also include timings for reading data back into visualization or other post-processing software, which will usually occur on fewer processors.
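The split across files is then just a matter of where the DataItems point inside a temporal collection, roughly (Xdmf syntax from memory, file names invented):

    <Grid Name="TimeSeries" GridType="Collection" CollectionType="Temporal">
      <Grid Name="step_0000" GridType="Uniform">
        <Time Value="0.0"/>
        <!-- Topology/Geometry/Attribute DataItems referencing part_000.h5 -->
      </Grid>
      <Grid Name="step_0050" GridType="Uniform">
        <Time Value="50.0"/>
        <!-- same structure, but the DataItems reference part_001.h5 -->
      </Grid>
    </Grid>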

-------------

To summarize: H5Part works very well for all the data I already have and any new stuff that comes my way. H5SPH may be used by future big-data generators and I will gradually shift to using it myself if I get more H5SPH data than H5Part. For other HDF data types, I will focus on Xdmf so that as much code as possible can be unified into a single package, with hopefully less maintenance.

Disclaimer: I have not explored some of the recent 'features' or developments for extracting subsets of data in H5Part; however, I will look into this as and when users have requests that could benefit from them.

JB

-- 
John Biddiscombe,                            email:biddisco @ cscs.ch
http://www.cscs.ch/about/BJohn.php
CSCS, Swiss National Supercomputing Centre  | Tel:  +41 (91) 610.82.07
Via Cantonale, 6928 Manno, Switzerland      | Fax:  +41 (91) 610.82.82









