+++ This bug was initially created as a clone of Bug #1269470 +++

Description of problem:

When all the bricks go down at the time of data self-heal, the self-heal daemon process crashes with the following backtrace:

(gdb) bt
#0  0x00007fae978ccb0f in afr_local_replies_wipe (local=0x0, priv=0x7fae900125b0) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/afr/src/afr-common.c:1241
#1  0x00007fae978b7aaf in afr_selfheal_inodelk (frame=0x7fae8c000c0c, this=0x7fae9000a6d0, inode=0x7fae8c00609c, dom=0x7fae900099f0 "patchy-replicate-0", off=8126464, size=131072, locked_on=0x7fae96b4f110 "") at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/afr/src/afr-self-heal-common.c:879
#2  0x00007fae978bbeb5 in afr_selfheal_data_block (frame=0x7fae8c000c0c, this=0x7fae9000a6d0, fd=0x7fae8c006e6c, source=0, healed_sinks=0x7fae96b4f8a0 "", offset=8126464, size=131072, type=1, replies=0x7fae96b4f2b0) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/afr/src/afr-self-heal-data.c:243
#3  0x00007fae978bc91d in afr_selfheal_data_do (frame=0x7fae8c006c9c, this=0x7fae9000a6d0, fd=0x7fae8c006e6c, source=0, healed_sinks=0x7fae96b4f8a0 "", replies=0x7fae96b4f2b0) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/afr/src/afr-self-heal-data.c:365
#4  0x00007fae978bdc7b in __afr_selfheal_data (frame=0x7fae8c006c9c, this=0x7fae9000a6d0, fd=0x7fae8c006e6c, locked_on=0x7fae96b4fa00 "\001\001\240") at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/afr/src/afr-self-heal-data.c:719
#5  0x00007fae978be0a0 in afr_selfheal_data (frame=0x7fae8c006c9c, this=0x7fae9000a6d0, inode=0x7fae8c00609c) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/afr/src/afr-self-heal-data.c:808
#6  0x00007fae978ba4d7 in afr_selfheal_do (frame=0x7fae8c006c9c, this=0x7fae9000a6d0, gfid=0x7fae96b4fc30 "s\303\315$w\244M\026\205`\226\336\263\205\300qЦ") at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/afr/src/afr-self-heal-common.c:1335
#7  0x00007fae978ba613 in afr_selfheal (this=0x7fae9000a6d0, gfid=0x7fae96b4fc30 "s\303\315$w\244M\026\205`\226\336\263\205\300qЦ") at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/afr/src/afr-self-heal-common.c:1380
#8  0x00007fae978c3e20 in afr_shd_selfheal (healer=0x7fae90013130, child=0, gfid=0x7fae96b4fc30 "s\303\315$w\244M\026\205`\226\336\263\205\300qЦ") at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/afr/src/afr-self-heald.c:326
#9  0x00007fae978c4142 in afr_shd_index_heal (subvol=0x7fae90006e50, entry=0x7fae90002900, parent=0x7fae96b4fdd0, data=0x7fae90013130) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/afr/src/afr-self-heald.c:416
#10 0x00007faea482aa83 in syncop_dir_scan (subvol=0x7fae90006e50, loc=0x7fae96b4fdd0, pid=-6, data=0x7fae90013130, fn=0x7fae978c4034 <afr_shd_index_heal>) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/syncop-utils.c:262
#11 0x00007fae978c42bb in afr_shd_index_sweep (healer=0x7fae90013130) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/afr/src/afr-self-heald.c:450
#12 0x00007fae978c4553 in afr_shd_index_healer (data=0x7fae90013130) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/cluster/afr/src/afr-self-heald.c:518
#13 0x00007faea3a90a51 in start_thread () from ./lib64/libpthread.so.0
#14 0x00007faea33fa93d in clone () from ./lib64/libc.so.6

(gdb) p local->child_up[0]
No symbol "local" in current context.
(gdb) p priv->child_up[0]
$8 = 0 '\000'
(gdb) p priv->child_up[1]
$9 = 0 '\000'

AFR_STACK_RESET() can fail to create the new local when all the bricks are down; the self-heal path then dereferences the NULL local (frame #0, local=0x0), which leads to the crash. A minimal sketch of this pattern follows the template fields below.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
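For illustration only, here is a small standalone C sketch of the failure pattern described above. It is not GlusterFS code: frame_reset() and replies_wipe() are hypothetical stand-ins for AFR_STACK_RESET() and afr_local_replies_wipe(), and the actual fix under review may take a different approach. The point is simply that a reset which can legitimately fail to allocate a new local (e.g. no brick is up) must be checked before the local is dereferenced.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct local { char child_up[2]; };       /* stand-in for afr_local_t        */
struct frame { struct local *local; };    /* stand-in for call_frame_t       */

/* Stand-in for AFR_STACK_RESET(): wipes the old local and tries to build a
 * fresh one.  When no child (brick) is up there is nothing to rebuild it
 * from, so it fails and leaves frame->local NULL.                           */
static int
frame_reset (struct frame *frame, int any_child_up)
{
        free (frame->local);
        frame->local = NULL;
        if (!any_child_up)
                return -1;                /* mimic the all-bricks-down case  */
        frame->local = calloc (1, sizeof (*frame->local));
        return frame->local ? 0 : -1;
}

/* Stand-in for afr_local_replies_wipe(): crashes if local is NULL, which is
 * exactly what frame #0 of the backtrace shows (local=0x0).                 */
static void
replies_wipe (struct local *local)
{
        memset (local->child_up, 0, sizeof (local->child_up));
}

int
main (void)
{
        struct frame frame = { .local = calloc (1, sizeof (struct local)) };

        /* Buggy flow: ignore the reset failure, then dereference NULL:
         *     frame_reset (&frame, 0); replies_wipe (frame.local);  -> SIGSEGV
         *
         * Guarded flow: check the result before touching frame->local.      */
        if (frame_reset (&frame, 0) < 0) {
                fprintf (stderr, "reset failed, aborting this self-heal step\n");
                return 1;
        }
        replies_wipe (frame.local);
        free (frame.local);
        return 0;
}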
http://review.gluster.org/12310
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.7.6, please open a new bug report.

glusterfs-3.7.6 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/gluster-users/2015-November/024359.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user