
h5part - Re: [H5part] SVN access available for contributions?

h5part AT lists.psi.ch

Subject: H5Part development and discussion


Re: [H5part] SVN access available for contributions?


  • From: John Biddiscombe <biddisco AT cscs.ch>
  • To: John Shalf <jshalf AT lbl.gov>
  • Cc: h5part AT lists.psi.ch
  • Subject: Re: [H5part] SVN access available for contributions?
  • Date: Thu, 25 Jan 2007 10:42:55 +0100
  • List-archive: <https://lists.web.psi.ch/pipermail/h5part/>
  • List-id: H5Part development and discussion <h5part.lists.psi.ch>

John

So I assume you want a convenience routine that can reassemble a list of arrays (stored on disk as scalar fields) and interlace them as vectors in memory. I assume you are *not* storing the data as N-component vector fields in the file though (am I correct?). If you've already done that, then it would be an excellent addition to the API. But we definitely want the disk image of the fields to be distinct vectors for various reasons.

So, to see if I understand this correctly, you would like a convenience function that allows you to specify a vector of dataset names (rather than a single name) and it would naturally interlace them in memory upon read? Is that the request? I think that should be pretty reasonable as well given HDF's support for strided memory spaces. Do you have a proposed appearance for this API? (something that doesn't use var_args since it would complicate the F90 bindings). Overall, this is also quite reasonable provided the on-disk image has them laid out as scalars (they can be reconstituted in memory as vectors).
I have already implemented the write and read back of N-component variables as N single-component fields. I've tested it in parallel, combining the memory space for the N-tuple arrays with the dataspace used for parallel IO, and it works well.
I would like to contribute this to the main H5Part API. At the moment the writing out is fine, but I am still playing with the read back; the interface is something along these lines:
ReadNComponentArray(int NComponents, float/double/etc *dest, char **arrayOfNames)
so the array of names chosen to be read is passed in and assumed to be the same length as the number of components desired.
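For reference, a minimal sketch of how the read-back side could interlace N scalar datasets into one buffer using a strided memory-space selection; the function name, the use of double, and the assumption that all components live in the same group are illustrative only, not the final interface:

    #include <hdf5.h>

    /* Sketch: read ncomp scalar datasets and interlace them in memory as
     * (x0,y0,z0, x1,y1,z1, ...).  dest must hold n*ncomp doubles.
     * Error checking omitted for brevity. */
    static void ReadNComponentArray(hid_t group, int ncomp,
                                    double *dest, const char **names)
    {
        for (int c = 0; c < ncomp; c++) {
            hid_t   dset   = H5Dopen(group, names[c]);   /* HDF5 1.6 API */
            hid_t   fspace = H5Dget_space(dset);
            hsize_t n      = (hsize_t) H5Sget_simple_extent_npoints(fspace);

            /* The memory space spans the whole interleaved buffer ...   */
            hsize_t mdims[1] = { n * (hsize_t) ncomp };
            hid_t   mspace   = H5Screate_simple(1, mdims, NULL);

            /* ... but only every ncomp-th element is selected, offset by
             * the component index, so component c lands interleaved.    */
            hsize_t start[1]  = { (hsize_t) c };
            hsize_t stride[1] = { (hsize_t) ncomp };
            hsize_t count[1]  = { n };
            H5Sselect_hyperslab(mspace, H5S_SELECT_SET,
                                start, stride, count, NULL);

            H5Dread(dset, H5T_NATIVE_DOUBLE, mspace, fspace,
                    H5P_DEFAULT, dest);

            H5Sclose(mspace);
            H5Sclose(fspace);
            H5Dclose(dset);
        }
    }

In parallel the same strided memory selection can be combined with each rank's hyperslab selection on the file dataspace, which is essentially what I tested for the write path.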


The current sorting algorithm for the file format will be able to accommodate the different numbering formats you propose, so your proposed change would be backward compatible with existing readers (that's a good thing). So this addition could be implemented as a convention rather than a requirement.
OK. I'm not familiar with existing sorting algorithms in the context of H5Part. I find that I often browse files using NCSA's (I think) HDF5 viewer package and it lists things using a straight alphabetical sort.
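As a small illustration of why the numbering format matters for such listings, a plain alphabetical comparison misorders step groups unless the numbers are zero-padded (the group names here are hypothetical):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        /* strcmp compares character by character, so "Step#10" sorts
         * before "Step#2" even though 2 < 10 numerically. */
        if (strcmp("Step#10", "Step#2") < 0)
            printf("alphabetically, Step#10 comes before Step#2\n");
        return 0;
    }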


We do encourage liberal use of attributes to serve the individual needs of groups though, so you should definitely implement storage of the TimeValue attribute. We should probably document the attributes that various groups have proposed for their own local conventions.
OK.
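For what it's worth, a minimal sketch of what storing that attribute could look like in plain HDF5 calls; the group name "Step#0" and the attribute name "TimeValue" are assumptions about the convention, not part of the current format:

    #include <hdf5.h>

    /* Attach a scalar TimeValue attribute to a timestep group.
     * Error checking omitted; names are illustrative only. */
    void WriteTimeValue(hid_t file, double time)
    {
        hid_t group  = H5Gopen(file, "Step#0");        /* HDF5 1.6 API */
        hid_t aspace = H5Screate(H5S_SCALAR);
        hid_t attr   = H5Acreate(group, "TimeValue",
                                 H5T_NATIVE_DOUBLE, aspace, H5P_DEFAULT);
        H5Awrite(attr, H5T_NATIVE_DOUBLE, &time);
        H5Aclose(attr);
        H5Sclose(aspace);
        H5Gclose(group);
    }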


When I say "convention" I mean additional features that can be used to extend the content of the file format using attributes. When something is a convention, readers can be coded to look for it for added value, but they should also be prepared for its absence. We are trying to minimize the "requirements" so as to keep the readers as simple as possible.
I do think that a primary dataset group "Name"/prefix should be a requirement. You already have backward incompatibility between "particles1" and "steps1" - had this been a requirement previously, the files would be compatible. ($0.02)

The decision to go with limited type support for the API was for two reasons
Understood; I'd still like to add prototypes for the main types commonly supported on all platforms, not complex user-defined structures.


For various reasons, it is better for us to keep each vector component as a separate scalar array on disk
I'm happy with this and am already doing it.

I did find some other bugs which I hope have been fixed. I had problems when I compiled the code with parallel support but was not using parallel IO, so I put a couple of extra checks in. I also found a bug where a dynamic number of particles requires new memory/data spaces to be created, which wasn't handled correctly.
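The pattern needed for that second issue is roughly the following, a minimal sketch assuming a cached memory-space handle (the names are mine, not the actual H5Part internals):

    #include <hdf5.h>

    /* Cached handle, recreated only when the particle count changes.
     * Names are illustrative, not the real H5Part internals. */
    static hid_t   cached_memspace   = -1;
    static hsize_t cached_nparticles = 0;

    static hid_t GetMemspace(hsize_t nparticles)
    {
        if (cached_memspace < 0 || nparticles != cached_nparticles) {
            if (cached_memspace >= 0)
                H5Sclose(cached_memspace);       /* drop the stale space */
            cached_memspace   = H5Screate_simple(1, &nparticles, NULL);
            cached_nparticles = nparticles;
        }
        return cached_memspace;
    }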

I can see the H5Part repository, but cannot access it. May I please have access so that I can bring my current code base up to date with your SVN HEAD version?

Thanks

JB




