h5part AT lists.psi.ch
Subject: H5Part development and discussion
- From: John Biddiscombe <biddisco AT cscs.ch>
- To: Andreas Adelmann <andreas.adelmann AT psi.ch>
- Cc: h5part AT lists.psi.ch
- Subject: Re: [H5part] [Fwd: Re: [Fwd: Re: H5Part]]
- Date: Fri, 11 Apr 2008 20:52:33 +0200
- List-archive: <https://lists.web.psi.ch/pipermail/h5part/>
- List-id: H5Part development and discussion <h5part.lists.psi.ch>
Some replies for further discussion.

No. At least not in hdf5 1.6 onwards. You can create a compound structure with 20 fields in it, but read back only N of them (a minimal sketch of what I mean is below, after these replies). It might be the case that internally hdf is actually reading the lot, but the user doesn't see it. If that is the case, then I agree, it's terrible. So far, reading the data back has not been a problem because a) I haven't got a lot of the compound data yet; b) our post-processing is not so critical when loading into paraview or another tool; and c) I usually read it all back anyway, so I wouldn't notice if I was wasting memory. (Note to self: further tests required.) However, I would like to add that I may decide not to support this new compound type if further tests show it to be memory-wasteful and slow when we use it a lot. I'm not happy if the simulation guys save 3 seconds per IO write but it costs us 5 per read, or worse!

In my reader, I simply open the compound type, query the fields, and stuff them into the gui. When loading H5Part, I query the datasets and stuff them into the gui. The actual difference is pretty small really. When reading the arrays, the code is different, but not so much so (notwithstanding your earlier observation of memory wastage etc.).

It isn't that hard. Every user can create a compound type and the code will just list the fields in the same way as datasets. No problem.

Because I've now got 3 different HDF5-based readers, for different applications. Jean has 3 or 4 more. We have to maintain 7 readers for a file type that is supposed to be standard. (Not all our users are creating particle data.) But they do all create big datasets - they just group them in different combinations and subgroups. What xdmf allows me to do is simply list the heavy data in an XML structure, and have a single reader read all the different flavours of hdf5 (ok, not really, but you get the idea). If a user comes along and says "I've got arrays stored like this, with vectors like that, and others like this", we should be able to create a template xml description and use the same reader to read it. The logic of what is where is in the xml decoding, and is done once. Instead of 7 hdf5 readers with logic about which data structures are stored relative to others, we can do it in one reader. In truth this is not happening - we still have many readers - but I'm hoping that the xdmf approach will save me time in the long run. I'm still not completely sold on it, but I hope I've described why I'm looking into it.

Only the locations and structure of the file. The real data is still the heavy stuff. Instead of coding the geometry extraction in the reader, you code it in the xml and then let the reader decode it - but with the flexibility to handle many different types.

True. So far, I'm getting one xml per dataset and one hdf file per timestep, so it isn't yet a problem. The overhead of reading the geometry description hasn't yet caused trouble. Today I made an animation of 1139 time steps of volumetric data, from 1 xml file and 1139 hdf files. The parsing of the xml is not noticeable and actually saves some time, because all the 'information' about spacing, origin and dimensions is in the text file, and no h5 calls were made until the heavy data was actually needed. I think using the xml description is actually faster than opening the hdf and finding array dimensions etc.

OK. I can't say much about this. True that mismatch will be a problem.
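For anyone following along, here is a minimal sketch of the partial compound read and the field query I mean, using the hdf5 1.8-style C API. The file name, dataset path and field names ("particles.h5", "/Step#0/particles", "x", "px") are invented for illustration - substitute whatever your file actually contains:

/* Sketch: list the fields of a compound dataset, then read back
   only two of its members. All names are hypothetical. */
#include <stdio.h>
#include <stdlib.h>
#include <hdf5.h>

typedef struct { double x; double px; } Subset;

int main(void)
{
    hid_t file = H5Fopen("particles.h5", H5F_ACC_RDONLY, H5P_DEFAULT);
    hid_t dset = H5Dopen(file, "/Step#0/particles", H5P_DEFAULT); /* 1.8 API */

    /* Query the member names of the on-disk compound type,
       e.g. to populate a GUI panel. */
    hid_t ftype = H5Dget_type(dset);
    int i, nmembers = H5Tget_nmembers(ftype);
    for (i = 0; i < nmembers; i++) {
        char *name = H5Tget_member_name(ftype, i);
        printf("field %d: %s\n", i, name);
        free(name); /* newer releases prefer H5free_memory() */
    }

    /* Memory type holding only the members we want; hdf5 matches
       them by name, so the other fields never land in our buffer.
       Whether the library touches them internally on disk is
       exactly the open question above. */
    hid_t mtype = H5Tcreate(H5T_COMPOUND, sizeof(Subset));
    H5Tinsert(mtype, "x",  HOFFSET(Subset, x),  H5T_NATIVE_DOUBLE);
    H5Tinsert(mtype, "px", HOFFSET(Subset, px), H5T_NATIVE_DOUBLE);

    hid_t space = H5Dget_space(dset);
    hsize_t n = 0;
    H5Sget_simple_extent_dims(space, &n, NULL); /* assume 1-D dataset */

    Subset *buf = (Subset *)malloc((size_t)n * sizeof(Subset));
    H5Dread(dset, mtype, H5S_ALL, H5S_ALL, H5P_DEFAULT, buf);
    printf("read %llu particles, first x = %g\n",
           (unsigned long long)n, n ? buf[0].x : 0.0);

    free(buf);
    H5Tclose(mtype); H5Tclose(ftype);
    H5Sclose(space); H5Dclose(dset); H5Fclose(file);
    return 0;
}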
So far, my experimentation with xdmf has been a) very painful - many troubles fixing bugs in vtkXdmfReader etc. and adding time support; b) still in its infancy - I may yet decide it's more effort than it's worth; c) but ... showing some promise. Having just one reader instead of N is looking attractive to me, even if the one reader is N times more complicated! JB.
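To make the xdmf idea concrete: the light data lives in a small XML file, and each heavy array is just a pointer into an hdf5 file. Something along these lines - the grid name, dimensions and h5 paths below are invented for the example; real files will differ:

<?xml version="1.0" ?>
<Xdmf Version="2.0">
  <Domain>
    <Grid Name="Particles" GridType="Uniform">
      <Topology TopologyType="Polyvertex" NumberOfElements="100000"/>
      <Geometry GeometryType="X_Y_Z">
        <DataItem Dimensions="100000" Format="HDF">step0000.h5:/Step#0/x</DataItem>
        <DataItem Dimensions="100000" Format="HDF">step0000.h5:/Step#0/y</DataItem>
        <DataItem Dimensions="100000" Format="HDF">step0000.h5:/Step#0/z</DataItem>
      </Geometry>
      <Attribute Name="px" Center="Node">
        <DataItem Dimensions="100000" Format="HDF">step0000.h5:/Step#0/px</DataItem>
      </Attribute>
    </Grid>
  </Domain>
</Xdmf>

The reader parses this once, and nothing touches the h5 files until the arrays themselves are needed - which matches the timing observation above about the 1139-step animation.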
--
John Biddiscombe, email: biddisco @ cscs.ch
http://www.cscs.ch/about/BJohn.php
CSCS, Swiss National Supercomputing Centre | Tel: +41 (91) 610.82.07
Via Cantonale, 6928 Manno, Switzerland | Fax: +41 (91) 610.82.82
- [H5part] [Fwd: Re: [Fwd: Re: H5Part]], Andreas Adelmann, 04/11/2008
- Re: [H5part] [Fwd: Re: [Fwd: Re: H5Part]], John Biddiscombe, 04/11/2008