+++ This bug was initially created as a clone of Bug #1119894 +++

Description of problem:
I am running glusterfs 3.4.2 on linux kernel version 2.6.34.12 on two x86_64 boards with 16 GB of RAM each. I have several gluster file-systems (close to 10) in twin-replicated mode containing around 4 GB of data in aggregate. Sometimes, following a reboot of the boards, I observe that the glustershd memory % in top output increases above 50% (over 8 GB), causing problems when trying to run other key processes.

Version-Release number of selected component (if applicable):
glusterfs 3.4.2
linux kernel 2.6.34.12

How reproducible:
Intermittent. Our systems reboot very frequently, and during testing we often format our disks to clean out the bricks and then add them back. So there is quite a lot of 'uncontrolled' self-heal going on in our systems.

Steps to Reproduce:
1. Remove all the bricks on one of the servers from all replicated volumes.
2. Erase the logical volumes that comprise these bricks.
3. Re-create the bricks and add them back to the replicated volumes, causing a massive heal of data.

Actual results:
Sometimes, maybe around once in 20-30 times, glustershd memory usage exceeds 50% (8 GB), causing other applications to fail to spawn or to terminate abruptly. The workaround is to kill glustershd and then restart /etc/init.d/glusterd to get the former to spawn back.

Expected results:
We would expect the memory usage to stay within a reasonable ceiling, say, 20%?

Additional info:
Please note that this bug is specifically for high memory consumption by the glusterfs self-heal daemon. I am aware that several other bugs exist in bugzilla covering generic high memory consumption by glusterfs daemons, or more specific ones such as those pertaining to gfs nfs.

--- Additional comment from Pranith Kumar K on 2014-07-16 03:11:16 EDT ---

I took the statedump and found that the process is leaking 'path' from the circular buffers it uses to remember the last 1024 entries that were healed, failed to heal, or were in split-brain.
http://review.gluster.org/4790 has the fix, which lets the data structure take a cleanup function for freeing the data it holds.

--- Additional comment from Pranith Kumar K on 2014-07-16 05:55:16 EDT ---

Found one more 'dict' leak in metadata self-heal. This leak is present even in 3.5.x. Will be cloning this bug. Thanks a lot Anirban for raising the issue.

--- Additional comment from Pranith Kumar K on 2014-07-16 06:34:57 EDT ---

The 'dict' leak I mentioned above only exists in 3.5.x, it seems. So the only leak in 3.4.2 is the one mentioned in comment-1.
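For readers unfamiliar with this class of leak, the following is a minimal sketch of a fixed-size circular buffer that accepts a cleanup callback, in the spirit of the fix described in comment-1. It is not the GlusterFS code: every name here (circ_buf, cb_new, cb_add, cb_free, heal_event, events_freed) is hypothetical and chosen only for illustration. The point it shows is that when an entry owns heap memory such as a path string, the buffer has to call a caller-supplied destroy function whenever an entry is overwritten or the buffer is torn down, otherwise the owned memory is lost.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* All names below are hypothetical; this is not the GlusterFS API. */

typedef void (*cb_destroy_fn)(void *data);

typedef struct {
    void        **slots;    /* ring of pointers owned by the buffer */
    size_t        capacity; /* maximum number of entries kept */
    size_t        next;     /* index of the slot to write next */
    size_t        used;     /* number of occupied slots */
    cb_destroy_fn destroy;  /* called on every entry that leaves the ring */
} circ_buf;

/* Entry type for the example: a record that owns a heap-allocated path. */
typedef struct {
    char *path;
} heal_event;

size_t events_freed; /* counts destroyed entries, for demonstration only */

circ_buf *cb_new(size_t capacity, cb_destroy_fn destroy)
{
    assert(capacity > 0);
    circ_buf *cb = calloc(1, sizeof(*cb));
    if (!cb)
        return NULL;
    cb->slots = calloc(capacity, sizeof(*cb->slots));
    if (!cb->slots) {
        free(cb);
        return NULL;
    }
    cb->capacity = capacity;
    cb->destroy = destroy;
    return cb;
}

/* Store data and take ownership of it. Once the ring is full, the oldest
 * entry is evicted and handed to the destroy callback before its slot is
 * reused. Without that call, anything the old entry owns would leak. */
void cb_add(circ_buf *cb, void *data)
{
    if (cb->used == cb->capacity) {
        if (cb->destroy)
            cb->destroy(cb->slots[cb->next]);
    } else {
        cb->used++;
    }
    cb->slots[cb->next] = data;
    cb->next = (cb->next + 1) % cb->capacity;
}

/* Destroy every remaining entry, then the ring itself. */
void cb_free(circ_buf *cb)
{
    if (!cb)
        return;
    for (size_t i = 0; i < cb->used; i++) {
        if (cb->destroy)
            cb->destroy(cb->slots[i]);
    }
    free(cb->slots);
    free(cb);
}

heal_event *heal_event_new(const char *path)
{
    size_t len = strlen(path) + 1;
    heal_event *ev = malloc(sizeof(*ev));
    if (!ev)
        return NULL;
    ev->path = malloc(len);
    if (!ev->path) {
        free(ev);
        return NULL;
    }
    memcpy(ev->path, path, len);
    return ev;
}

/* Cleanup callback: frees the owned path as well as the record itself. */
void heal_event_destroy(void *data)
{
    heal_event *ev = data;
    if (!ev)
        return;
    free(ev->path);
    free(ev);
    events_freed++;
}
```

A caller would create the buffer with cb_new(1024, heal_event_destroy), push one heal_event per processed entry with cb_add, and call cb_free on shutdown. Keeping the destroy function as a parameter means the buffer itself stays ignorant of what its entries own.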
REVIEW: http://review.gluster.org/8316 (cluster/afr: Fix leaks in self-heal code path) posted (#1) for review on release-3.5 by Pranith Kumar Karampuri (pkarampu)
COMMIT: http://review.gluster.org/8316 committed in release-3.5 by Niels de Vos (ndevos)
------
commit c7fbb78ec198968069821cb0769071d17df1c58b
Author: Pranith Kumar K <pkarampu>
Date:   Wed Jul 16 15:03:19 2014 +0530

    cluster/afr: Fix leaks in self-heal code path

    Change-Id: I5301ec9ebac27afe52e85cad75e6395d7f891355
    BUG: 1120151
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/8316
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Krishnan Parthasarathi <kparthas>
    Reviewed-by: Ravishankar N <ravishankar>
    Reviewed-by: Niels de Vos <ndevos>
The first (and last?) Beta for GlusterFS 3.5.2 has been released [1]. Please verify whether the release resolves this bug report for you. If the glusterfs-3.5.2beta1 release does not resolve this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update (possibly an "updates-testing" repository) infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-devel/2014-July/041636.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.5.2, please reopen this bug report.

glusterfs-3.5.2 has been announced on the Gluster Users mailing list [1]. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-July/041217.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user