Red Hat Bugzilla – Bug 1253309
AFR: gluster v restart force or brick process restart doesn't heal the files
Last modified: 2016-06-16 09:31:51 EDT
Description of problem:
When one brick of a replica is down and file operations are performed, neither a gluster volume restart nor a brick process restart heals the files that need to be healed.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Create a 2x2 distributed-replicate volume
2. Mount the volume over FUSE
3. Create some files on the mount point
4. Kill one of the replica bricks
5. Rename one of the files from the mount point
6. Check gluster v heal <volname> info
7. Restart the volume or restart the brick process
Actual results:
Files are not healed.
Expected results:
A volume restart or brick process restart should heal the files that need to be healed.
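The reproduction steps can be written out as a command sequence. The sketch below only builds and prints the commands; it does not run them. The host names (server1, server2), brick paths, mount point and file names are hypothetical placeholders, and `<brick-pid>` stands for the PID that `gluster volume status` reports for the brick to be killed. It also assumes that "restart" in step 7 means `gluster volume start <volname> force`, which brings a killed brick process back up.

```python
def repro_commands(volname="vol0", mount="/mnt/vol0",
                   hosts=("server1", "server2"), brick_root="/bricks"):
    """Return the shell commands for the reproduction steps, in order."""
    # Hypothetical host names and brick paths; adjust to the test cluster.
    # With "replica 2", consecutive bricks in the list form a replica pair,
    # so each pair spans both hosts.
    bricks = [f"{host}:{brick_root}/{volname}-b{i}"
              for i in range(2) for host in hosts]
    return [
        # Steps 1-3: create, start and mount the 2x2 volume, then add files.
        f"gluster volume create {volname} replica 2 {' '.join(bricks)}",
        f"gluster volume start {volname}",
        f"mount -t glusterfs {hosts[0]}:/{volname} {mount}",
        f"touch {mount}/file1 {mount}/file2",
        # Step 4: look up the brick PID, then kill that brick process.
        f"gluster volume status {volname}",
        "kill <brick-pid>",
        # Step 5: modify the namespace while the brick is down.
        f"mv {mount}/file1 {mount}/file1.renamed",
        # Step 6: the renamed entry should be listed as pending heal.
        f"gluster volume heal {volname} info",
        # Step 7: bring the killed brick back (assumed meaning of "restart").
        f"gluster volume start {volname} force",
        # With the bug present, the entry is still listed here.
        f"gluster volume heal {volname} info",
    ]


if __name__ == "__main__":
    for cmd in repro_commands():
        print(cmd)
```

Comparing the two `heal info` outputs shows whether the restart triggered the heal.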
Volume Name: vol0
Volume ID: 53c64343-c537-428c-b7b7-a45f198c42a0
Number of Bricks: 2 x 2 = 4
--- Additional comment from Ravishankar N on 2015-07-03 05:45:57 EDT ---
Currently in AFR-v2, when a CHILD_UP notification is received, the index heal is triggered only on that particular child. The fix is to trigger the index heal on all local children.
While this is a bug, it is not a blocker because the files will be healed within 10 minutes (the default heal timeout) or when a heal is explicitly launched via the gluster CLI.
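The difference between the current behaviour and the proposed fix can be shown with a toy model. This is not the AFR code: the `Child` class and both functions are hypothetical, and the only thing modelled is which children's indices get crawled after a child-up event. For simplicity it assumes both bricks of the replica pair are local to the node running the self-heal daemon, plus one remote brick to show that non-local children are skipped either way.

```python
from dataclasses import dataclass, field


@dataclass
class Child:
    name: str
    local: bool                    # brick is on the node running the healer
    up: bool = True
    # Entries this child has recorded as needing heal on its peers.
    index: set = field(default_factory=set)


def healers_to_launch(children, came_up, heal_all_local):
    """Return the children whose index is crawled after a child-up event."""
    if heal_all_local:
        # Proposed fix: every local child that is up.
        return [c for c in children if c.local and c.up]
    # Current behaviour: only the child that just came up (if it is local).
    return [c for c in children if c is came_up and c.local]


def entries_found(children, came_up, heal_all_local):
    """Collect the pending entries visible to the launched healers."""
    found = set()
    for child in healers_to_launch(children, came_up, heal_all_local):
        found |= child.index
    return found


def demo():
    b0 = Child("brick0", local=True)
    b1 = Child("brick1", local=True)
    remote = Child("brick2", local=False, index={"remote-entry"})
    children = [b0, b1, remote]

    b0.up = False                  # brick0 is killed
    b1.index.add("file1.renamed")  # rename is recorded on the surviving brick
    b0.up = True                   # brick0 comes back: child-up event

    old = entries_found(children, b0, heal_all_local=False)
    new = entries_found(children, b0, heal_all_local=True)
    return old, new


if __name__ == "__main__":
    old, new = demo()
    print("current behaviour finds:", old)
    print("with the fix finds:", new)
```

In this model the child that came up has an empty index, so crawling only that child finds nothing to heal; the pending entry is found only when the surviving local child's index is crawled as well.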
REVIEW: http://review.gluster.org/11912 (afr: launch index heal on local subvols up on a child-up event) posted (#2) for review on master by Pranith Kumar Karampuri (email@example.com)
REVIEW: http://review.gluster.org/11912 (afr: launch index heal on local subvols up on a child-up event) posted (#3) for review on master by Pranith Kumar Karampuri (firstname.lastname@example.org)
REVIEW: http://review.gluster.org/11912 (afr: launch index heal on local subvols up on a child-up event) posted (#4) for review on master by Ravishankar N (email@example.com)
COMMIT: http://review.gluster.org/11912 committed in master by Pranith Kumar Karampuri (firstname.lastname@example.org)
Author: Ravishankar N <email@example.com>
Date: Thu Aug 13 18:33:08 2015 +0530
afr: launch index heal on local subvols up on a child-up event
When a replica's child goes down and comes up, the index heal is
triggered only on the child that just came up. This does not serve the
intended purpose as the list of files that need to be healed
to this child is actually captured on the other child of the replica.
Launch index-heal on all local children of the replica xlator which just
received a child up. Note that afr_selfheal_childup() eventually calls
afr_shd_index_healer() which will not run the heal on non-local children.
Signed-off-by: Ravishankar N <firstname.lastname@example.org>
Tested-by: NetBSD Build System <email@example.com>
Tested-by: Gluster Build System <firstname.lastname@example.org>
Reviewed-by: Krutika Dhananjay <email@example.com>
Reviewed-by: Pranith Kumar Karampuri <firstname.lastname@example.org>
The fix for this BZ is already present in a GlusterFS release. A clone of this BZ has been fixed in a GlusterFS release and closed, so this mainline BZ is being closed as well.
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.
glusterfs-3.8.0 has been announced on the Gluster mailing lists; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list and the update infrastructure for your distribution.