Description of problem:
When iozone is running and self-heal-daemon is enabled, self-heal-daemon crashes with the following trace:

(gdb) bt
#0  0x00000034ee635a19 in raise () from /lib64/libc.so.6
#1  0x00000034ee637128 in abort () from /lib64/libc.so.6
#2  0x00000034ee62e986 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00000034ee62ea32 in __assert_fail () from /lib64/libc.so.6
#4  0x00007f9ef9d7dd0d in __inode_forget (inode=0x7f9eef3052bc, nlookup=1) at inode.c:591
#5  0x00007f9ef9d7e80e in inode_forget (inode=0x7f9eef3052bc, nlookup=1) at inode.c:941
#6  0x00007f9eef5b2c70 in afr_selfheal (this=0xc4c7d0, gfid=0x7f9eee466dd0 "\362\351Z\210R\214@g\204\001\311p(VV\270\060", <incomplete sequence \314>) at afr-self-heal-common.c:1001
#7  0x00007f9eef5ba270 in afr_shd_selfheal (healer=0xc6ee00, child=0, gfid=0x7f9eee466dd0 "\362\351Z\210R\214@g\204\001\311p(VV\270\060", <incomplete sequence \314>) at afr-self-heald.c:301
#8  0x00007f9eef5ba6de in afr_shd_index_sweep (healer=0xc6ee00) at afr-self-heald.c:427
#9  0x00007f9eef5bab36 in afr_shd_index_healer (data=0xc6ee00) at afr-self-heald.c:539
#10 0x00000034eea07c53 in start_thread () from /lib64/libpthread.so.0
#11 0x00000034ee6f513d in clone () from /lib64/libc.so.6

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
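The assertion in frame #4 comes from __inode_forget() in libglusterfs. A minimal sketch of the check involved (simplified from that era's inode.c; treat the exact form as an assumption) shows why forgetting a lookup on an inode that was never linked into the inode table must abort:

/* Sketch of libglusterfs/src/inode.c:__inode_forget() -- simplified, the
 * real source may differ.  'nlookup' is the number of lookups the caller
 * wants the table to forget; an inode that was never linked into the
 * table has inode->nlookup == 0, so forgetting even one lookup asserts. */
#include "inode.h"

static void
__inode_forget (inode_t *inode, uint64_t nlookup)
{
        GF_ASSERT (inode->nlookup >= nlookup);  /* fails here: 0 >= 1 is false */

        inode->nlookup -= nlookup;

        if (!nlookup)
                inode->nlookup = 0;             /* nlookup == 0 means "forget all" */
}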
REVIEW: http://review.gluster.org/7567 (cluster/afr: Fix inode_forget assert failure) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/7567 (cluster/afr: Fix inode_forget assert failure) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/7567 (cluster/afr: Fix inode_forget assert failure) posted (#3) for review on master by Pranith Kumar Karampuri (pkarampu)
COMMIT: http://review.gluster.org/7567 committed in master by Anand Avati (avati)
------
commit 49e2d5162013ccf5f3f99c68c2521ca1cc6c3f20
Author: Pranith Kumar K <pkarampu>
Date:   Fri Apr 25 20:36:11 2014 +0530

    cluster/afr: Fix inode_forget assert failure

    Problem:
    If two self-heals are triggered on the same inode in parallel, one
    inode will be linked and the other will not, because an inode with
    that gfid is already linked in the inode table. Calling inode_forget
    on the unlinked inode leads to the assert failure.

    Fix:
    Always use the linked inode for performing self-heal. Added
    inode_forget calls in other places as well, even though it is not
    really a memory leak.

    Change-Id: Ib84bf080c8cb6a4243f66541ece587db28f9a052
    BUG: 1091597
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/7567
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Anand Avati <avati>
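For illustration, a minimal C sketch of the pattern the fix describes (heal_link_inode is a hypothetical helper, not code from afr-self-heal-common.c, and reference counting is simplified): the inode returned by inode_link() is the one actually held in the inode table, and only that inode may later be handed to inode_forget().

/* Hypothetical helper, for illustration only.  Assumes the GlusterFS
 * libglusterfs inode API (inode_link/inode_unref/inode_forget); error
 * handling and reference counting are simplified.                        */
#include "inode.h"
#include "iatt.h"

static inode_t *
heal_link_inode (inode_t *tmp, struct iatt *buf)
{
        inode_t *linked = NULL;

        /* inode_link() returns the inode that is (now) present in the inode
         * table.  If a parallel self-heal already linked an inode with the
         * same gfid, 'linked' differs from 'tmp' and 'tmp' was never linked. */
        linked = inode_link (tmp, NULL, NULL, buf);

        /* Drop our reference on the temporary inode; every later call --
         * including the final inode_forget (linked, 1) -- must operate on
         * 'linked', never on 'tmp'.                                          */
        inode_unref (tmp);

        return linked;
}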
A beta release for GlusterFS 3.6.0 has been released [1]. Please verify whether the release solves this bug report for you. In case the glusterfs-3.6.0beta1 release does not have a resolution for this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update (possibly an "updates-testing" repository) infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/
This bug is being closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report.

glusterfs-3.6.1 has been announced [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html
[2] http://supercolony.gluster.org/mailman/listinfo/gluster-users