h5part - Re: [H5part] H5Part performance problem

h5part AT lists.psi.ch

Subject: H5Part development and discussion

  • From: Kurt Stockinger <KStockinger AT lbl.gov>
  • Cc: h5part AT lists.psi.ch
  • Subject: Re: [H5part] H5Part performance problem
  • Date: Tue, 05 Dec 2006 09:58:21 -0800
  • List-archive: <https://lists.web.psi.ch/pipermail/h5part/>
  • List-id: H5Part development and discussion <h5part.lists.psi.ch>

Hi Thomas,

Thanks for the detailed code description. It's an excellent way for me
to see how H5Part is currently being used. I have a few comments below.

Thomas Schietinger wrote:
> Dear all,
>
> let me add some more info on the performance problem reported
> at yesterday's phone meeting.
>
> The routines I use for reading in an HDF5 file in H5root are:
> (see file TH5Dataset.cc)
>
> get the file handle:
>
> fH5File = H5PartOpenFile(fFullFilename.Data(),H5PART_READ);
>
> (fFullFilename.Data() evaluates to a char string)
>
> get the number of datasets:
>
> H5PartSetStep(fH5File,0);
> Int_t nDataset = H5PartGetNumDatasets(fH5File);
>
> get the dataset names:
>
> for(Int_t i=0;i<nDataset;i++){
> H5PartGetDatasetName(fH5File,i,name,maxLength);
> ...
> }
>
> get the number of steps, step attributes and file attributes:
>
> fNStep = H5PartGetNumSteps(fH5File);
> fNStepAttr = H5PartGetNumStepAttribs(fH5File);
> fNFileAttr = H5PartGetNumFileAttribs(fH5File);
>
> get file attribute info:
>
> for (Int_t n = 0; n < fNFileAttr; n++) {
> H5PartGetFileAttribInfo(fH5File,n,an,32,0,nElem);
Is it intended that you don't return the type of the attribute (the 5th
parameter) or do you assume that you know the type, i.e. all are of type
char?
> ...
> }
>
> get step attribute info for step 0 - ASSUME THEY WILL BE THE SAME
> FOR ALL STEPS!
>
> H5PartSetStep(fH5File,0);
>
> for (Int_t n=0; n<fNStepAttr; ++n) {
> H5PartGetStepAttribInfo(fH5File,n,an,32,0,nElem);
Same comment about type as above.
> ...
> // distinguish scalar and vector attributes, ignore others:
> TString type("?");
> if (*nElem==1 ) { // scalar variable
> fScalarName.AddLast(new TObjString(attrName));
> fNScalarAttr++;
> type = TString(" (scalar)");
> } else if (*nElem==3 ) { // vector variable
> fVectorName.AddLast(new TObjString(attrName));
> fNVectorAttr++;
> type = TString(" (vector)");
> } else {
> ...
> }
>
> Read in the unit strings for the attributes found:
>
> for (Int_t i = 0; i < 9; i++)
>   fPartVarUnit[i] = GetUnit(fPartVarName[i]);
> for (Int_t i = 0; i < fNScalarAttr; i++)
>   fScalarUnit.AddLast(new TObjString(GetUnit(
>     static_cast<TObjString*>(fScalarName.At(i))->GetString())));
> for (Int_t i = 0; i < fNVectorAttr; i++)
>   fVectorUnit.AddLast(new TObjString(GetUnit(
>     static_cast<TObjString*>(fVectorName.At(i))->GetString())));
>
> where GetUnit(...) is a method that contains a loop over the file
> attributes to find the associated unit name:
In the H5Part documentation I didn't find anything specific about unit
names. Can you explain this a bit? Do you assume that attribute names are
uniformly limited to char[32]?
>
> for (Int_t i = 0; i < fNFileAttr; i++) {
>
> TObjString* s = static_cast<TObjString*>(fFileAttr.At(i));
> if (s->GetString() == varNameU) {
> char u[32];
>
> H5PartReadFileAttrib(fH5File,const_cast<char*>(s->GetString().Data()),
> &u);
> unit = TString(u);
> }
> ...
> }
>
> Now loop over steps and retrieve attribute and particle data for each
> step:
>
> for (Int_t step=0; step<fNStep; step++) {
> H5PartSetStep(fH5File,step);
> unsigned long n = H5PartGetNumParticles(fH5File);
>
> // read in scalar attributes
> for (int i=0; i<fNScalarAttr; ++i) {
> TObjString* s = static_cast<TObjString*>(fScalarName.At(i));
>
> H5PartReadStepAttrib(fH5File,const_cast<char*>(s->GetString().Data()),
> &val);
> ...
> }
>
> // read in vector attributes
> for (int i=0; i<fNVectorAttr; ++i) {
> TObjString* s = static_cast<TObjString*>(fVectorName.At(i));
>
> H5PartReadStepAttrib(fH5File,const_cast<char*>(s->GetString().Data()),
> arr);
> ...
> }
>
> Sorry if this is too much information, I just wanted to let you know
> what routines I am using, so you can see if I am doing something very
> inefficient. I have not given much thought to the choice of routines,
> I just looked for what I needed, typically found it rather quickly and
> then only made sure it does what I want it to do without measuring
> performance or anything. The files I look at load very quickly anyway,
> it is only when Andreas tries to simulate the whole world that he ends
> up waiting a couple of minutes ;-)
>
> Now some numbers: a 3 Giga file with some 500 time steps (21 file
> attr., 12 step attr.) takes 19.1 s to read in (the second time only
> 0.5 s since the file is buffered).
This is the time for reading the attributes only, right? If so, then the
3 GB file is not read in its entirety, but only the 21*12 (char)
attributes, right?

Can you send me a pointer to the file so that I can download it and we
can look at this together? This would also help me understand the
performance you are seeing a bit better.

Thanks,
Kurt
> An 85 Giga file with 18263 time steps (same number of attributes)
> takes 29:38.52 to read in (half an hour). That's where it starts to hurt!
> It should be noted that a simple h5dump also groans under that file
> and won't produce anything before several minutes (I am still waiting
> in fact).
> (These figures are for our merlin00 machine at PSI.)
>
> Achim suggested to use H5PartGetStepAttribInfo instead of
> H5PartReadStepAttrib, but I don't see how I can replace the
> functionality of ReadStepAttrib with GetStepAttribInfo, i.e. read a
> value...
>
> Regards,
>
> Thomas
>
> _______________________________________________
> H5Part mailing list
> H5Part AT lists.psi.ch
> https://lists.web.psi.ch/mailman/listinfo/h5part


--
Kurt Stockinger
Computational Research Division
Lawrence Berkeley National Laboratory
Mail Stop 50B-3238, 1 Cyclotron Road
Berkeley, California 94720, USA

Tel: +1 (510) 486 5208, Fax: +1 (510) 486 4004
email: KStockinger AT lbl.gov
http://sdm.lbl.gov/kurts/




