h5part AT lists.psi.ch
Subject: H5Part development and discussion
- From: Andreas Adelmann <andreas.adelmann AT psi.ch>
- To: h5part AT lists.psi.ch
- Cc: mike folk <mfolk AT ncsa.uiuc.edu>, Elena Pourmal <epourmal AT hdfgroup.org>
- Subject: [H5part] H5Part & Performance
- Date: Thu, 10 Apr 2008 13:08:14 +0200
- List-archive: <https://lists.web.psi.ch/pipermail/h5part/>
- List-id: H5Part development and discussion <h5part.lists.psi.ch>
Dear colleagues, please check out the interesting observations made by John Biddiscombe. We should discuss this issue! John agreed to talk at our BD-Palaver sometime early in summer.

Best, Andreas

=================== START MAIL OF John Biddiscombe ===================

Further to my earlier message, I forgot to write about one additional consideration that came up whilst working at EDF. The particles are divided into a number of categories:

- Boundary particles - static
- Boundary particles which can move (and generally impart a force onto fluid particles)
- Fluid particles
- Other particles (such as air in multi-material simulations)
- And Fluid2, Fluid3, Fluid4, depending on other categorizations...

The division of types is arbitrary, but EDF, Manchester, Nantes, Plymouth and others that I can't recall right now all store a flag (int) which indicates particle type. For 1E6 particles, at N time steps, this extra flag contributes IO and disk space and serves little purpose - especially as when reading the data, we have to scan the particles, extract all the type1, type2, type3, etc. and create separate datasets from them when we do certain renderings. Much better to store the particles in groups, much like a multi-block dataset. Using the compound type we can easily create a group or a file representing a time step and write multiple datasets - one for each block. Within H5Part one could do the same by using extra sub-groups within the timestep group (though it might complicate the reading a bit).

Anyway, since I've written so much already, I thought I'd add this extra observation.

JB

-----------------------------

Jean, Andreas, Achim, please forward to others if necessary. (I have not posted this to the H5Part list, but please do so if you think it is of interest to other readers.)

In response to a query from Jean, and in time to provoke discussion at the CSCS User Assembly on Friday where I believe you will meet, here's a brief synopsis of my H5Part-related experiences, both recent and further back.

I have been using H5Part for some time now and have developed a reader and a writer class for vtk/paraview which I use on a daily basis. I have also developed converters which allow me to convert virtually any ASCII file (which is common for the SPH community since they are mostly in their relative infancy and still do tests on small numbers of particles with ASCII IO) into H5Part. Full details can be found on the pv-meshless wiki at https://twiki.cscs.ch/twiki/bin/view/ParaViewMeshless in the section on data formats.

On the whole I have had no problems with H5Part and find it a convenient library to use. I really only use the open-file and open-time-step (group) functions within the library, as I have implemented most of the hyperslab selection myself within the vtk reader/writer classes. I have not yet looked at H5Block - and I do not see myself doing so, as I have managed to store my volume data using hdf5/Xdmf calls and then use the Xdmf readers within vtk/paraview to read the data (subject to some fixes I made and am in the process of extending).

My most recent tests were at EDF in France, where I had access to the BlueGene machine. They converted their IO to dump data in HDF5 files, but did not use H5Part; instead they used an H5T_COMPOUND structure to write all data out in a single call. I modified their IO to use an H5Part form (for my convenience/compatibility), but we quickly found that performance was very poor in comparison.
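To make the layout concrete, a minimal sketch of such a per-block, compound-type write - one group per time step, one dataset per particle category - might look like the following in the plain HDF5 1.8 C API. The Particle struct, its field names and the "Step#n" / "Fluid" / "Boundary" names here are purely illustrative, not EDF's actual code:

    #include <stdio.h>
    #include <hdf5.h>

    typedef struct {
        double x, y, z;     /* position */
        double vx, vy, vz;  /* velocity */
        double rho;         /* density  */
    } Particle;             /* illustrative record; the real field list varies */

    /* Write one particle category as a single dataset of compound records. */
    static void write_block(hid_t step_grp, const char *name,
                            const Particle *p, hsize_t n)
    {
        hid_t ptype = H5Tcreate(H5T_COMPOUND, sizeof(Particle));
        H5Tinsert(ptype, "x",   HOFFSET(Particle, x),   H5T_NATIVE_DOUBLE);
        H5Tinsert(ptype, "y",   HOFFSET(Particle, y),   H5T_NATIVE_DOUBLE);
        H5Tinsert(ptype, "z",   HOFFSET(Particle, z),   H5T_NATIVE_DOUBLE);
        H5Tinsert(ptype, "vx",  HOFFSET(Particle, vx),  H5T_NATIVE_DOUBLE);
        H5Tinsert(ptype, "vy",  HOFFSET(Particle, vy),  H5T_NATIVE_DOUBLE);
        H5Tinsert(ptype, "vz",  HOFFSET(Particle, vz),  H5T_NATIVE_DOUBLE);
        H5Tinsert(ptype, "rho", HOFFSET(Particle, rho), H5T_NATIVE_DOUBLE);

        hid_t space = H5Screate_simple(1, &n, NULL);
        hid_t dset  = H5Dcreate(step_grp, name, ptype, space,
                                H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
        H5Dwrite(dset, ptype, H5S_ALL, H5S_ALL, H5P_DEFAULT, p);

        H5Dclose(dset);
        H5Sclose(space);
        H5Tclose(ptype);
    }

    /* One group per time step, one dataset per block - no per-particle type flag. */
    void write_step(hid_t file, int step,
                    const Particle *fluid,    hsize_t nfluid,
                    const Particle *boundary, hsize_t nboundary)
    {
        char gname[32];
        snprintf(gname, sizeof(gname), "/Step#%d", step);
        hid_t grp = H5Gcreate(file, gname, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
        write_block(grp, "Fluid",    fluid,    nfluid);
        write_block(grp, "Boundary", boundary, nboundary);
        H5Gclose(grp);
    }

Writing one dataset per category avoids the per-particle type flag entirely, and a reader can open only the blocks it needs.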
After some further testing last week, I discovered that my stripped-down test was setting the collective IO flag always false, so the expected speed-up never happened. Having fixed this, the speed difference between compound and H5Part dropped to a factor of about 2.

I set a number of tests running on BG using 1,2,3...20 scalar arrays, on 1000, 5000, 10000, 50000, 100000, 500000, 1E6, 5E6, 1E7, 5E7 particles, using collective/independent IO, and running on 32, 64, 128, 256, 512, 1024, 2048 processors - giving a total of 10*7*20*2 combinations of timings (I might have got these numbers wrong as I'm writing from memory, and some of the larger particle write tests failed and were skipped). The timing results are sitting on the BG at Montpellier and I'm awaiting an IBM engineer to send them to me as I cannot access the machine from here. I expect the results to maintain a 2:1 difference (or thereabouts), but I'll compile a full document when I have the data. If the results prove interesting enough and worth discussing, then I will try writing a short paper to submit somewhere with observations about efficient data IO, strategies, etc. I will rerun tests with different cache options and other configuration tweaks as and when I can.

Faced with the choice between implementing IO using H5Part and using the H5 compound type with a factor-of-2 speed difference, and based on my estimate of a few days' work to implement a paraview reader to support the new type, we decided to use the new format instead of H5Part for further IO (writes). This decision can be easily changed by simply swapping the IO calls in their SPH code to use the H5Part-compatible version should we find that the timings do in fact come out in favour of H5Part. Typically they anticipate using anything from 1000 to 100,000 particles per processor, but using very large numbers of processors - the tests I performed were designed to exercise this pattern - which I suspect differs somewhat from the anticipated use cases of PSI and their colleagues.

Now that I have a working reader for the new H5 compound data format (which I refer to as H5SPH), I will put together a set of code snippets that other SPH users can use for their IO. In fact, the H5Part-style interface of OpenFile, SetTimeStep, SetNumberOfParticles, SetView is basically the same for H5Part and H5SPH, and these 4 functions themselves are not much work to implement (see the sketch below), so whilst it's a shame to redo this work for an alternative format/library, it isn't actually much work.

As mentioned, I have also been using the Xdmf format to store volume blocks of data. Xdmf is simply an hdf5 file for heavy data, and xml for light data. This has proven to be very flexible, and I am considering adopting it as a wrapper around all my hdf5-based data (including H5Part) - implying that I could probably read H5Part-formatted particle data using the XdmfReader (though I have not actually tried this - it would require the generation of an xml wrapper for H5Part files, but then the vtkH5PartReader would not be needed and all maintenance could be shifted to one place). I'm not sure at the moment if Xdmf is capable of wrapping compound data types as used by the new format, so I still have too many readers and too many formats; I will be looking into this during the coming weeks. One reason for liking Xdmf is that within Xdmf it is simple to store N timesteps of data in one hdf5 file, the next N in another, and so on.
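For reference, the OpenFile / SetStep / SetNumParticles / SetView call pattern mentioned above looks roughly like this when written against the H5Part C API (a sketch only, with error checking omitted; the H5SPH equivalents are assumed to mirror it and are not shown):

    #include <mpi.h>
    #include <H5Part.h>

    /* A minimal per-step write: one file shared by all ranks via MPI-IO. */
    void write_all_steps(MPI_Comm comm, int nsteps, h5part_int64_t nlocal,
                         double *x, double *y, double *z)
    {
        H5PartFile *file = H5PartOpenFileParallel("particles.h5part",
                                                  H5PART_WRITE, comm);

        for (int step = 0; step < nsteps; step++) {
            H5PartSetStep(file, step);            /* select/create the Step#n group */
            H5PartSetNumParticles(file, nlocal);  /* this rank's particle count     */
            /* H5PartSetView(file, start, end) could instead place this rank's
               slab explicitly within the global arrays */
            H5PartWriteDataFloat64(file, "x", x); /* arrays updated between steps   */
            H5PartWriteDataFloat64(file, "y", y);
            H5PartWriteDataFloat64(file, "z", z);
        }
        H5PartCloseFile(file);
    }

In raw HDF5, collective versus independent MPI-IO transfers are chosen through the dataset transfer property list (H5Pset_dxpl_mpio) - the kind of switch the "collective IO flag" above refers to, and the reason that one setting had such a large effect on the timings.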
One issue I've had with H5Part is that file sizes keep growing, and limiting individual files to around 50GB is convenient for us. So (say) 50 time steps in one file, 50 in the next, etc. is useful. It would of course be quite simple to add this functionality to the H5Part libraries (and perhaps it is already in there), but since I am already a heavy vtk/paraview and now Xdmf user with commit privileges to these repositories, it makes my life easier to focus now on Xdmf. I therefore see myself moving towards Xdmf in the longer term as it allows a greater variety of storage forms and more flexibility. I will not stop using H5Part for existing data and will continue to follow the development of the extra features that keep going in... but for now, it already does all that I need and needs no real improvement. For our partners who are now switching to much bigger simulations, I keenly await the timing results from BG and the opportunity to run more tests, which should also include timings for reading data back into visualization or other post-processing software, which will usually occur on fewer processors.

-------------

To summarize:

- H5Part works very well for all the data I already have and any new stuff that comes my way.
- H5SPH may be used by future big-data generators, and I will gradually shift to using this myself if I get more H5SPH data than H5Part.
- For other hdf data types, I will focus on Xdmf so that as much code as possible can be unified into a single package with, hopefully, less maintenance.

Disclaimer: I have not explored some of the recent 'features' or developments for extracting subsets of data in H5Part; however, I will look into this as and when users have requests that could benefit from them.
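As an illustration of the "50 time steps per file" idea above, splitting output across files can be layered on top of the H5Part calls in a few lines; in the sketch below the STEPS_PER_FILE constant, the file-naming scheme and the file_for_step() helper are purely illustrative and not part of H5Part:

    #include <stdio.h>
    #include <mpi.h>
    #include <H5Part.h>

    #define STEPS_PER_FILE 50   /* roll over to a new file every 50 steps */

    static H5PartFile *current_file  = NULL;
    static int         current_chunk = -1;

    /* Return an open H5Part file for this time step, creating a new file
       whenever the step crosses a chunk boundary. */
    H5PartFile *file_for_step(MPI_Comm comm, int step)
    {
        int chunk = step / STEPS_PER_FILE;
        if (chunk != current_chunk) {
            if (current_file)
                H5PartCloseFile(current_file);
            char name[64];
            snprintf(name, sizeof(name), "particles_%04d.h5part", chunk);
            current_file  = H5PartOpenFileParallel(name, H5PART_WRITE, comm);
            current_chunk = chunk;
        }
        /* steps are numbered locally within each file */
        H5PartSetStep(current_file, step % STEPS_PER_FILE);
        return current_file;
    }

A reader would map a global step index back to a (file, local step) pair in the same way.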
JB

--
John Biddiscombe, email: biddisco @ cscs.ch
http://www.cscs.ch/about/BJohn.php
CSCS, Swiss National Supercomputing Centre  | Tel: +41 (91) 610.82.07
Via Cantonale, 6928 Manno, Switzerland      | Fax: +41 (91) 610.82.82

=================== END MAIL OF John Biddiscombe ===================

--
Dr. sc. math. Andreas (Andy) Adelmann
Staff Scientist
Paul Scherrer Institut WLGB/132 CH-5232 Villigen PSI
Phone Office: xx41 56 310 42 33  Fax: xx41 56 310 31 91
Phone Home: xx41 62 891 91 44
-------------------------------------------------------
Wednesday: ETH CAB F 10.1 xx41 44 632 82 76
=======================================================
"The more exotic, the more abstract the knowledge, the more profound will be its consequences." Leon Lederman
=======================================================