Description of problem:
The new mem-pools have a "pool_sweeper" thread that cleans up the cold and hot lists of unallocated objects. This thread is not started for gfapi applications or when libgfchangelog is used. Not having this thread running prevents memory from being freed from the mem-pools.

Version-Release number of selected component (if applicable):
rhgs-3.3.0

How reproducible:
100%

Steps to Reproduce:
1. Call mem_get() many times, and mem_put() as many times (a minimal sketch follows at the end of this comment).
2. Notice that the memory consumption does not reduce.

Actual results:
Memory consumption peaks and does not reduce (for mem-pool allocations).

Expected results:
Memory should be freed once the "pool_sweeper" thread goes through the objects that are on the cold list.

Additional info:
This is similar to bug 1470170, where the memory from mem-pools is not released when mem_pools_fini() is called.
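For reference, a minimal reproducer sketch in C, assuming the libglusterfs mem-pool API as found in the 3.x headers (mem_pool_new / mem_get / mem_put); exact macro signatures and include paths vary between releases, so treat this as illustrative rather than exact:

/* Illustrative reproducer: allocate many objects from a mem-pool and
 * return them all. Without the pool_sweeper thread running (i.e.
 * without mem_pools_init() having been called, as in gfapi and
 * libgfchangelog before these fixes), the returned objects stay on
 * the per-thread hot/cold lists and process memory never shrinks. */
#include "mem-pool.h"   /* libglusterfs; path depends on the build tree */

#define COUNT 100000

struct demo {
        char data[128];
};

static void
reproduce(void)
{
        struct mem_pool *pool = mem_pool_new(struct demo, 1024);
        static void *objs[COUNT];
        int i;

        for (i = 0; i < COUNT; i++)
                objs[i] = mem_get(pool);    /* memory consumption peaks */
        for (i = 0; i < COUNT; i++)
                mem_put(objs[i]);           /* ...but is never released */
}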
Patches that need backporting (if missing):
- libglusterfs: add mem_pools_fini
  https://review.gluster.org/17662
- gfapi: add mem_pools_init and mem_pools_fini calls
  https://review.gluster.org/17666
- gfapi+libglusterfs: fix mem_pools_fini without mem_pools_init case
  https://review.gluster.org/17728
- gfapi: prevent mem-pool leak in case glfs_new_fs() fails
  https://review.gluster.org/17734
- mem-pool: initialize pthread_key_t pool_key in mem_pool_init_early()
  https://review.gluster.org/17779
- mem-pool: track and verify initialization state
  https://review.gluster.org/17915
- changelog: add mem-pool initialization
  https://review.gluster.org/17900
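For context, a hedged sketch of how a gfapi application is affected: after https://review.gluster.org/17662 and https://review.gluster.org/17666, glfs_new() is expected to run mem_pools_init() (starting the pool_sweeper thread) and the final glfs_fini() to run mem_pools_fini(). The volume name and server below are placeholders:

/* Minimal gfapi client sketch; no application change is needed, since
 * the mem-pool init/fini calls happen inside glfs_new()/glfs_fini()
 * once the patches above are applied. */
#include <glusterfs/api/glfs.h>

int
main(void)
{
        glfs_t *fs = glfs_new("testvol");   /* mem_pools_init() runs here */
        if (!fs)
                return 1;

        glfs_set_volfile_server(fs, "tcp", "server.example.com", 24007);
        if (glfs_init(fs) != 0) {
                glfs_fini(fs);
                return 1;
        }

        /* ... regular I/O through gfapi ... */

        glfs_fini(fs);                      /* mem_pools_fini() runs here */
        return 0;
}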
Although comment #4 lists all 7 patches that prevent resource leaks related to the starting and stopping of the pool_sweeper thread, these three are the ones that Kaleb backported for testing in bug 1461543 and that made the most difference to memory consumption:

https://review.gluster.org/17662
https://review.gluster.org/17666
https://review.gluster.org/17728

The remaining patches build on these and also address the leaks in libgfchangelog and the cli. A sketch of the init/fini pairing they converge on follows below.
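A minimal sketch of that init/fini pairing, assuming a lock-protected use counter like upstream's; the names (init_count, pool_sweeper) mirror upstream mem-pool.c, but the bodies are illustrative, not the actual implementation:

#include <assert.h>
#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t init_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned int init_count = 0;
static pthread_t sweeper_tid;

/* Stub for the real sweeper: upstream wakes up periodically and frees
 * objects that have aged out on the per-thread cold lists. */
static void *
pool_sweeper(void *arg)
{
        (void)arg;
        for (;;) {
                sleep(30);      /* cancellation point */
                /* ...move hot lists to cold, free the old cold lists... */
        }
        return NULL;
}

void
mem_pools_init(void)
{
        pthread_mutex_lock(&init_mutex);
        /* Only the first user (gfapi, libgfchangelog, cli, ...) starts
         * the sweeper; later calls just bump the counter. */
        if (init_count++ == 0)
                pthread_create(&sweeper_tid, NULL, pool_sweeper, NULL);
        pthread_mutex_unlock(&init_mutex);
}

void
mem_pools_fini(void)
{
        pthread_mutex_lock(&init_mutex);
        assert(init_count > 0);  /* fini without init, cf. 17728/17915 */
        if (--init_count == 0) {
                pthread_cancel(sweeper_tid);
                pthread_join(sweeper_tid, NULL);
                /* ...drain and free all remaining pooled objects... */
        }
        pthread_mutex_unlock(&init_mutex);
}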
Patches mentioned in comment 7 are already in the 3.12 branch.

Downstream patches:
https://code.engineering.redhat.com/gerrit/#/c/114349/
https://code.engineering.redhat.com/gerrit/#/c/114350/
https://code.engineering.redhat.com/gerrit/#/c/114351/
Verified this bug on:

# rpm -qa | grep ganesha
glusterfs-ganesha-3.8.4-38.el7rhgs.x86_64
nfs-ganesha-2.4.4-16.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.4-16.el7rhgs.x86_64

To validate this, ran some manual/automated cases around HA, root-squash, ACLs, and multi-volume scenarios. Moving this bug to verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774