h5part - Re: [H5part] memory leak with h5pt_openw_par

h5part AT lists.psi.ch

Subject: H5Part development and discussion

  • From: Mohamad Chaarawi <chaarawi AT hdfgroup.org>
  • To: Quincey Koziol <koziol AT hdfgroup.org>
  • Cc: Mark Howison <mhowison AT brown.edu>, Stefan Adami <stefan.adami AT tum.de>, h5part AT lists.psi.ch
  • Subject: Re: [H5part] memory leak with h5pt_openw_par
  • Date: Fri, 18 May 2012 15:37:47 -0500
  • List-archive: <https://lists.web.psi.ch/pipermail/h5part/>
  • List-id: H5Part development and discussion <h5part.lists.psi.ch>

Hi all,

The fix for this memory leak is now in the HDF5 trunk and the 1.8 branch.
I have also attached a patch in case you would like to apply it to another branch, though I'm not sure whether it would cause conflicts there.

Thanks,
Mohamad

On 5/9/2012 11:21 AM, Mohamad Chaarawi wrote:
Hi Quincey,

On 05/09/2012 09:34 AM, Quincey Koziol wrote:
Hi Mohamad,

On May 9, 2012, at 9:26 AM, Mohamad Chaarawi wrote:

Hi All,

I'm guessing that H5MM_xstrdup is returning a malloc'd string to H5P_remove that isn't being free'd.
The string is free'd again in H5P_close, so I don't see why a memory leak would be possible there.
Maybe the property list is internal to the h5part code and isn't being freed there?

Ah, never mind. I see the problem now: it's an HDF5-internal issue with how properties are inserted and removed when doing collective I/O. I'll talk to Quincey off-list and get a fix.

Thanks,
Mohamad


Mark, is that possible?

Is there a short version of the code that I could use to replicate the problem? Just looking at the internal HDF5 library code doesn't get me anywhere :-)

Quincey,
Could this be related to the memory leak in the derived datatype construction that you fixed and checked in to the trunk and the 1_8_9 release branch?
Hmm, probably not...

Quincey

Thanks,
Mohamad

Thanks,
Mark

On May 2, 2012, at 7:47 AM, Stefan Adami wrote:
Dear H5Part-Developers,

I am using H5Part for my SPH simulation output and recently found a
problem when writing in parallel: I get a memory leak. I checked the
test code you provide in the test folder, and when I add the preprocessor
flag -DPARALLEL_IO to the Makefile (by default, --enable-parallel does not
set this correctly, so testf.f90 is actually compiled in its serial
version), I can reproduce the leak:

23 bytes in 1 blocks are definitely lost in loss record 13 of 20
==9990== at 0x4024F20: malloc (vg_replace_malloc.c:236)
==9990== by 0x811770C: H5MM_xstrdup (H5MM.c:170)
==9990== by 0x8153A47: H5P_remove (H5Pint.c:3932)
==9990== by 0x80BB1E3: H5FD_mpi_teardown_collective (H5FDmpi.c:525)
==9990== by 0x808F664: H5D_inter_collective_io (H5Dmpio.c:1511)
==9990== by 0x808DBA6: H5D_contig_collective_write (H5Dmpio.c:532)
==9990== by 0x808C06E: H5Dwrite (H5Dio.c:265)
==9990== by 0x804EF03: H5PartWriteDataFloat64 (H5Part.c:1134)
==9990== by 0x804C9F8: h5pt_writedata_r8_ (H5PartF.c:501)
==9990== by 0x804C196: MAIN__ (testf.F90:63)
==9990== by 0x804B563: main
(in /scratch/adami/source/ppm/work/H5Part-1.6.6/test/test

==9990== LEAK SUMMARY:
==9990== definitely lost: 270 bytes in 12 blocks
==9990== indirectly lost: 0 bytes in 0 blocks
==9990== possibly lost: 0 bytes in 0 blocks
==9990== still reachable: 5,564 bytes in 8 blocks
==9990== suppressed: 0 bytes in 0 blocks
==9990==
==9990== For counts of detected and suppressed errors, rerun with: -v
==9990== Use --track-origins=yes to see where uninitialised values come from
==9990== ERROR SUMMARY: 1060 errors from 191 contexts (suppressed: 29 from 10)

Is there a special compiler-adjustment required or anything else the
user should take care of? I tested this test-code on two different
machines with a 32 and 64bit system with the same result...
Do you have any ideas or are you aware of this leak?

Thank you very much for any help,

Stefan

--
________________________________________
Dipl.-Ing. Stefan Adami
Lehrstuhl für Aerodynamik und Strömungsmechanik
Technische Universität München
Boltzmannstr. 15, 85748 Garching
Tel.: +49-89-289-16122
Fax.: +49-89-289-16139
http://www.aer.mw.tum.de

_______________________________________________
H5Part mailing list
H5Part AT lists.psi.ch
https://lists.web.psi.ch/mailman/listinfo/h5part
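
The leak discussed in this thread can be modeled in a few lines of plain C. The sketch below is a toy stand-in, not HDF5 code: every model_* name is hypothetical, and a fixed array replaces the skip list of deleted property names. It assumes, as the valgrind trace and the H5Pint.c hunk of the attached patch suggest, that removing a property stores a heap copy of its name, and that re-inserting the same name detaches that copy from the list. Before the fix the detached copy was never freed. The sketch only counts those lost copies and still frees them, so it does not leak itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_MAX_DELETED 8

/* Hypothetical stand-in for a property list's "deleted names" list. */
typedef struct {
    char  *deleted[MODEL_MAX_DELETED]; /* heap copies of removed names */
    size_t ndeleted;                   /* entries currently in use */
    size_t lost;                       /* copies the unpatched path would leak */
} model_plist;

/* Removing a property records a heap copy of its name. */
static int model_remove(model_plist *pl, const char *name)
{
    size_t len = strlen(name) + 1;
    char  *copy;

    if (pl->ndeleted == MODEL_MAX_DELETED)
        return -1;
    if ((copy = malloc(len)) == NULL)
        return -1;
    memcpy(copy, name, len);
    pl->deleted[pl->ndeleted++] = copy;
    return 0;
}

/* Detach a name from the deleted list; the caller now owns the stored copy. */
static char *model_take_deleted(model_plist *pl, const char *name)
{
    size_t i;

    for (i = 0; i < pl->ndeleted; i++) {
        if (strcmp(pl->deleted[i], name) == 0) {
            char *stored = pl->deleted[i];
            pl->deleted[i] = pl->deleted[--pl->ndeleted];
            return stored;
        }
    }
    return NULL;
}

/* Re-inserting a previously removed name takes it off the deleted list. */
static void model_insert(model_plist *pl, const char *name, int patched)
{
    char *stored = model_take_deleted(pl, name);

    if (stored != NULL) {
        if (!patched)
            pl->lost++; /* unpatched behavior: the pointer would be dropped here */
        free(stored);   /* the sketch always frees, so it does not leak itself */
    }
}

/* Closing the list frees whatever is still recorded as deleted. */
static void model_close(model_plist *pl)
{
    while (pl->ndeleted > 0)
        free(pl->deleted[--pl->ndeleted]);
}

/* Run insert/remove rounds and report how many copies were counted as lost. */
static size_t model_lost_after(int rounds, int patched)
{
    model_plist pl = {{NULL}, 0, 0};
    int i;

    for (i = 0; i < rounds; i++) {
        model_insert(&pl, "H5FD_mpi_mem_mpi_type", patched);
        if (model_remove(&pl, "H5FD_mpi_mem_mpi_type") < 0)
            break;
    }
    model_close(&pl);
    return pl.lost;
}
```

In this model a single insert/remove round loses nothing, because the close step frees the one remaining name. Each further round loses one copy on the unpatched path, which is consistent with a leak that only shows up when the same property is inserted and removed repeatedly.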


Index: test/tgenprop.c
===================================================================
--- test/tgenprop.c (revision 22374)
+++ test/tgenprop.c (revision 22379)
@@ -1575,6 +1575,46 @@

/****************************************************************
**
+** test_genprop_list_add_remove_prop(): Test adding then removing the
+** same properties to a standard HDF5 property list. This is testing
+** also for a memory leak that could be caused by not freeing the
+** removed property resources from the property list.
+**
+****************************************************************/
+static void
+test_genprop_list_add_remove_prop(void)
+{
+ hid_t pid; /* Property List ID */
+ herr_t ret; /* Generic return value */
+
+ /* Create a dataset creation property list */
+ pid = H5Pcreate(H5P_DATASET_CREATE);
+ CHECK(pid, FAIL, "H5Pcreate");
+
+ /* Insert temporary property into class (with no callbacks) */
+ ret = H5Pinsert2(pid, PROP1_NAME, PROP1_SIZE, PROP1_DEF_VALUE, NULL, NULL, NULL, NULL, NULL, NULL);
+ CHECK_I(ret, "H5Pinsert2");
+
+ /* Delete added property */
+ ret = H5Premove(pid, PROP1_NAME);
+ CHECK_I(ret, "H5Premove");
+
+ /* Insert temporary property into class (with no callbacks) */
+ ret = H5Pinsert2(pid, PROP1_NAME, PROP1_SIZE, PROP1_DEF_VALUE, NULL, NULL, NULL, NULL, NULL, NULL);
+ CHECK_I(ret, "H5Pinsert2");
+
+ /* Delete added property */
+ ret = H5Premove(pid, PROP1_NAME);
+ CHECK_I(ret, "H5Premove");
+
+ /* Close property list */
+ ret = H5Pclose(pid);
+ CHECK(ret, FAIL, "H5Pclose");
+
+} /* end test_genprop_list_add_remove_prop() */
+
+/****************************************************************
+**
** test_genprop_equal(): Test basic generic property list code.
** More tests for H5Pequal()
**
@@ -1990,6 +2030,8 @@
test_genprop_list_addprop(); /* Test adding properties to HDF5 property list */
test_genprop_class_addprop(); /* Test adding properties to HDF5 property class */

+ test_genprop_list_add_remove_prop(); /* Test adding and removing the same property several times to HDF5 property list */
+
test_genprop_equal(); /* Tests for more H5Pequal verification */
test_genprop_path(); /* Tests for class path verification */
test_genprop_refcount(); /* Tests for class reference counting */
Index: src/H5Pdxpl.c
===================================================================
--- src/H5Pdxpl.c (revision 22374)
+++ src/H5Pdxpl.c (revision 22379)
@@ -120,8 +120,13 @@
#define H5D_XFER_XFORM_DEL H5P_dxfr_xform_del
#define H5D_XFER_XFORM_COPY H5P_dxfr_xform_copy
#define H5D_XFER_XFORM_CLOSE H5P_dxfr_xform_close
+/* Definitions for memory MPI type property */
+#define H5FD_MPI_XFER_MEM_MPI_TYPE_SIZE sizeof(MPI_Datatype)
+#define H5FD_MPI_XFER_MEM_MPI_TYPE_DEF MPI_DATATYPE_NULL
+/* Definitions for file MPI type property */
+#define H5FD_MPI_XFER_FILE_MPI_TYPE_SIZE sizeof(MPI_Datatype)
+#define H5FD_MPI_XFER_FILE_MPI_TYPE_DEF MPI_DATATYPE_NULL

-
/******************/
/* Local Typedefs */
/******************/
@@ -208,6 +213,8 @@
unsigned def_mpio_chunk_opt_ratio = H5D_XFER_MPIO_CHUNK_OPT_RATIO_DEF;
H5D_mpio_actual_chunk_opt_mode_t def_mpio_actual_chunk_opt_mode = H5D_MPIO_ACTUAL_CHUNK_OPT_MODE_DEF;
H5D_mpio_actual_io_mode_t def_mpio_actual_io_mode = H5D_MPIO_ACTUAL_IO_MODE_DEF;
+ MPI_Datatype btype = H5FD_MPI_XFER_MEM_MPI_TYPE_DEF; /* Default value for MPI buffer type */
+ MPI_Datatype ftype = H5FD_MPI_XFER_FILE_MPI_TYPE_DEF; /* Default value for MPI file type */
#endif /* H5_HAVE_PARALLEL */
H5Z_EDC_t enable_edc = H5D_XFER_EDC_DEF; /* Default value for EDC property */
H5Z_cb_t filter_cb = H5D_XFER_FILTER_CB_DEF; /* Default value for filter callback */
@@ -285,6 +292,17 @@
/* Register the actual io mode property. */
if(H5P_register_real(pclass, H5D_MPIO_ACTUAL_IO_MODE_NAME, H5D_MPIO_ACTUAL_IO_MODE_SIZE, &def_mpio_actual_io_mode, NULL, NULL, NULL, NULL, NULL, NULL, NULL) < 0)
HGOTO_ERROR(H5E_PLIST, H5E_CANTINSERT, FAIL, "can't insert property into class")
+
+ /* Register the MPI memory type property */
+ if(H5P_register_real(pclass, H5FD_MPI_XFER_MEM_MPI_TYPE_NAME, H5FD_MPI_XFER_MEM_MPI_TYPE_SIZE,
+ &btype, NULL, NULL, NULL, NULL, NULL, NULL, NULL) < 0)
+ HGOTO_ERROR(H5E_PLIST, H5E_CANTINSERT, FAIL, "can't insert property into class")
+
+ /* Register the MPI file type property */
+ if(H5P_register_real(pclass, H5FD_MPI_XFER_FILE_MPI_TYPE_NAME, H5FD_MPI_XFER_FILE_MPI_TYPE_SIZE,
+ &ftype, NULL, NULL, NULL, NULL, NULL, NULL, NULL) < 0)
+ HGOTO_ERROR(H5E_PLIST, H5E_CANTINSERT, FAIL, "can't insert property into class")
+
#endif /* H5_HAVE_PARALLEL */

/* Register the EDC property */
Index: src/H5FDmpi.c
===================================================================
--- src/H5FDmpi.c (revision 22374)
+++ src/H5FDmpi.c (revision 22379)
@@ -466,7 +466,7 @@
*-------------------------------------------------------------------------
*/
herr_t
-H5FD_mpi_setup_collective(hid_t dxpl_id, MPI_Datatype btype, MPI_Datatype ftype)
+H5FD_mpi_setup_collective(hid_t dxpl_id, MPI_Datatype *btype, MPI_Datatype *ftype)
{
H5P_genplist_t *plist; /* Property list pointer */
herr_t ret_value=SUCCEED; /* Return value */
@@ -478,56 +478,15 @@
HGOTO_ERROR(H5E_PLIST, H5E_BADTYPE, FAIL, "not a dataset transfer list")

/* Set buffer MPI type */
- if(H5P_insert(plist,H5FD_MPI_XFER_MEM_MPI_TYPE_NAME,H5FD_MPI_XFER_MEM_MPI_TYPE_SIZE,&btype,NULL,NULL,NULL,NULL,NULL,NULL)<0)
+ if(H5P_set(plist, H5FD_MPI_XFER_MEM_MPI_TYPE_NAME, btype) < 0)
HGOTO_ERROR(H5E_PLIST, H5E_CANTSET, FAIL, "can't insert MPI-I/O property")

- /* Set file MPI type */
- if(H5P_insert(plist,H5FD_MPI_XFER_FILE_MPI_TYPE_NAME,H5FD_MPI_XFER_FILE_MPI_TYPE_SIZE,&ftype,NULL,NULL,NULL,NULL,NULL,NULL)<0)
+ /* Set File MPI type */
+ if(H5P_set(plist, H5FD_MPI_XFER_FILE_MPI_TYPE_NAME, ftype) < 0)
HGOTO_ERROR(H5E_PLIST, H5E_CANTSET, FAIL, "can't insert MPI-I/O property")

done:
FUNC_LEAVE_NOAPI(ret_value)
} /* end H5FD_mpi_setup_collective() */

-
-/*-------------------------------------------------------------------------
- * Function: H5FD_mpi_teardown_collective
- *
- * Purpose: Remove the temporary MPI-I/O properties from dxpl.
- *
- * Return: Success: Non-negative
- * Failure: Negative
- *
- * Programmer: Quincey Koziol
- * Monday, June 17, 2002
- *
- * Modifications:
- *
- *-------------------------------------------------------------------------
- */
-herr_t
-H5FD_mpi_teardown_collective(hid_t dxpl_id)
-{
- H5P_genplist_t *plist; /* Property list pointer */
- herr_t ret_value=SUCCEED; /* Return value */
-
- FUNC_ENTER_NOAPI(FAIL)
-
- /* Check arguments */
- if(NULL == (plist = H5P_object_verify(dxpl_id,H5P_DATASET_XFER)))
- HGOTO_ERROR(H5E_PLIST, H5E_BADTYPE, FAIL, "not a dataset transfer list")
-
- /* Remove buffer MPI type */
- if(H5P_remove(dxpl_id,plist,H5FD_MPI_XFER_MEM_MPI_TYPE_NAME)<0)
- HGOTO_ERROR(H5E_PLIST, H5E_CANTDELETE, FAIL, "can't remove MPI-I/O property")
-
- /* Remove file MPI type */
- if(H5P_remove(dxpl_id,plist,H5FD_MPI_XFER_FILE_MPI_TYPE_NAME)<0)
- HGOTO_ERROR(H5E_PLIST, H5E_CANTDELETE, FAIL, "can't remove MPI-I/O property")
-
-done:
- FUNC_LEAVE_NOAPI(ret_value)
-} /* end H5FD_mpi_teardown_collective() */
-
#endif /* H5_HAVE_PARALLEL */
-
Index: src/H5FDmpi.h
===================================================================
--- src/H5FDmpi.h (revision 22374)
+++ src/H5FDmpi.h (revision 22379)
@@ -81,10 +81,8 @@
/* ======== Temporary data transfer properties ======== */
/* Definitions for memory MPI type property */
#define H5FD_MPI_XFER_MEM_MPI_TYPE_NAME "H5FD_mpi_mem_mpi_type"
-#define H5FD_MPI_XFER_MEM_MPI_TYPE_SIZE sizeof(MPI_Datatype)
/* Definitions for file MPI type property */
#define H5FD_MPI_XFER_FILE_MPI_TYPE_NAME "H5FD_mpi_file_mpi_type"
-#define H5FD_MPI_XFER_FILE_MPI_TYPE_SIZE sizeof(MPI_Datatype)

/*
* The view is set to this value
@@ -105,9 +103,8 @@
H5_DLL herr_t H5FD_mpio_wait_for_left_neighbor(H5FD_t *file);
H5_DLL herr_t H5FD_mpio_signal_right_neighbor(H5FD_t *file);
#endif /* NOT_YET */
-H5_DLL herr_t H5FD_mpi_setup_collective(hid_t dxpl_id, MPI_Datatype btype,
- MPI_Datatype ftype);
-H5_DLL herr_t H5FD_mpi_teardown_collective(hid_t dxpl_id);
+H5_DLL herr_t H5FD_mpi_setup_collective(hid_t dxpl_id, MPI_Datatype *btype,
+ MPI_Datatype *ftype);

/* Driver specific methods */
H5_DLL int H5FD_mpi_get_rank(const H5FD_t *file);
Index: src/H5Dmpio.c
===================================================================
--- src/H5Dmpio.c (revision 22374)
+++ src/H5Dmpio.c (revision 22379)
@@ -1542,15 +1542,13 @@
H5D__final_collective_io(H5D_io_info_t *io_info, const H5D_type_info_t *type_info,
hsize_t mpi_buf_count, MPI_Datatype *mpi_file_type, MPI_Datatype *mpi_buf_type)
{
- hbool_t plist_is_setup = FALSE; /* Whether the dxpl has been customized */
herr_t ret_value = SUCCEED;

FUNC_ENTER_STATIC

/* Pass buf type, file type to the file driver. */
- if(H5FD_mpi_setup_collective(io_info->dxpl_id, *mpi_buf_type, *mpi_file_type) < 0)
+ if(H5FD_mpi_setup_collective(io_info->dxpl_id, mpi_buf_type, mpi_file_type) < 0)
HGOTO_ERROR(H5E_PLIST, H5E_CANTSET, FAIL, "can't set MPI-I/O properties")
- plist_is_setup = TRUE;

if(io_info->op_type == H5D_IO_OP_WRITE) {
if((io_info->io_ops.single_write)(io_info, type_info, mpi_buf_count, NULL, NULL) < 0)
@@ -1562,11 +1560,6 @@
} /* end else */

done:
- /* Reset the dxpl settings */
- if(plist_is_setup)
- if(H5FD_mpi_teardown_collective(io_info->dxpl_id) < 0)
- HDONE_ERROR(H5E_DATASPACE, H5E_CANTFREE, FAIL, "unable to reset dxpl values")
-
#ifdef H5D_DEBUG
if(H5DEBUG(D))
HDfprintf(H5DEBUG(D),"ret_value before leaving final_collective_io=%d\n",ret_value);
Index: src/H5Pint.c
===================================================================
--- src/H5Pint.c (revision 22374)
+++ src/H5Pint.c (revision 22379)
@@ -2314,9 +2314,13 @@

/* Check if the property has been deleted */
if(NULL != H5SL_search(plist->del, name)) {
+ char *temp_name = NULL;
/* Remove the property name from the deleted property skip list */
- if(NULL == H5SL_remove(plist->del, name))
+ if(NULL == (temp_name = H5SL_remove(plist->del, name)))
HGOTO_ERROR(H5E_PLIST,H5E_CANTDELETE,FAIL,"can't remove property from deleted skip list")
+
+ /* free the name of the removed property */
+ H5MM_xfree(temp_name);
} /* end if */
else {
H5P_genclass_t *tclass; /* Temporary class pointer */
Index: release_docs/RELEASE.txt
===================================================================
--- release_docs/RELEASE.txt (revision 22374)
+++ release_docs/RELEASE.txt (revision 22379)
@@ -100,7 +100,8 @@

Library
-------
- - None
+ - Fix a memory leak exposed when inserting/removing a property
+ from a property list several times. HDFFV-8022. (MSC 2012/05/18)

Parallel Library
----------------
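
Besides freeing the removed name, the H5Pdxpl.c and H5FDmpi.c hunks above stop inserting and removing the two temporary MPI type properties on every collective transfer. The properties are instead registered once with a default value and then overwritten with H5P_set. A minimal sketch of that pattern in plain C follows. It assumes a hypothetical fixed-size registry and uses int as a stand-in for MPI_Datatype. None of the reg_* names are HDF5 API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define REG_MAX_PROPS 4
#define REG_MAX_SIZE  16

/* Hypothetical property: a name plus a fixed-size value buffer. */
typedef struct {
    const char   *name;
    size_t        size;
    unsigned char value[REG_MAX_SIZE];
} reg_prop;

/* Hypothetical property list holding a small, fixed number of properties. */
typedef struct {
    reg_prop props[REG_MAX_PROPS];
    size_t   nprops;
} reg_plist;

/* Look a property up by name; returns NULL if it was never registered. */
static reg_prop *reg_find(reg_plist *pl, const char *name)
{
    size_t i;

    for (i = 0; i < pl->nprops; i++)
        if (strcmp(pl->props[i].name, name) == 0)
            return &pl->props[i];
    return NULL;
}

/* Register a property once, with a default value copied from *def. */
static int reg_register(reg_plist *pl, const char *name, size_t size, const void *def)
{
    reg_prop *p;

    if (pl->nprops == REG_MAX_PROPS || size > REG_MAX_SIZE || reg_find(pl, name) != NULL)
        return -1;
    p = &pl->props[pl->nprops++];
    p->name = name;
    p->size = size;
    memcpy(p->value, def, size);
    return 0;
}

/* Overwrite an existing property; the value is copied from the caller's address. */
static int reg_set(reg_plist *pl, const char *name, const void *value)
{
    reg_prop *p = reg_find(pl, name);

    if (p == NULL)
        return -1;
    memcpy(p->value, value, p->size);
    return 0;
}

/* Copy the current value of a property into *out. */
static int reg_get(reg_plist *pl, const char *name, void *out)
{
    reg_prop *p = reg_find(pl, name);

    if (p == NULL)
        return -1;
    memcpy(out, p->value, p->size);
    return 0;
}

/* Stand-in for the setup step: it takes pointers because reg_set copies from an address. */
static int reg_setup_collective(reg_plist *pl, const int *btype, const int *ftype)
{
    if (reg_set(pl, "mem_type", btype) < 0)
        return -1;
    return reg_set(pl, "file_type", ftype);
}
```

Because the properties always exist in this scheme, repeated setup calls only overwrite values in place. No teardown step is needed, and nothing is allocated or freed per transfer.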


