+++ This bug was initially created as a clone of Bug #1170913 +++

Description of problem:

The reason self-heal daemons first acquire full locks in the self-heal domain in AFR-V2 is to ensure that only one of them gets to enter the critical section and heal a given file/directory. Before AFR-V2, this was achieved by holding full locks in the xlator domain (the domain where normal modification FOPs take locks in the clients' I/O path) until the file/directory was healed. The disadvantage of that approach was that clients in the normal I/O path had to wait until the file/dir was healed, and when the to-be-healed files were really large (think VM images), the clients would perceive a hung file system with nothing working. To eliminate this, self-heal domain locks were introduced exclusively for use by the self-heal daemon.

However, metadata self-heal is a relatively fast operation, irrespective of how big the file/directory is. Hence, it can do without locking in the sh-domain and proceed directly to a blocking lock acquisition in the xlator domain, perform metadata healing (a bunch of getxattrs, setxattrs, removexattrs and setattr), and quickly unlock the held locks. This way:

a. No two healers can concurrently perform metadata healing of a file/dir;
b. The clients themselves cannot enter the critical section to modify metadata while self-heal is in progress;
c. The clients in the normal I/O path do not have to be blocked for a really long time to do metadata FOPs.

Version-Release number of selected component (if applicable):

How reproducible: N/A

Steps to Reproduce:
1.
2.
3.
Actual results:

Expected results:

Additional info:

--- Additional comment from Anand Avati on 2014-12-05 01:15:18 EST ---

REVIEW: http://review.gluster.org/9240 (cluster/afr: Eliminate locking in sh domain in metadata self-heal) posted (#1) for review on master by Krutika Dhananjay (kdhananj)

--- Additional comment from Anand Avati on 2014-12-05 04:51:19 EST ---

COMMIT: http://review.gluster.org/9240 committed in master by Pranith Kumar Karampuri (pkarampu)
------
commit 5fdffa7164ffc5a798246411d065259b36658bc3
Author: Krutika Dhananjay <kdhananj>
Date: Fri Dec 5 11:16:07 2014 +0530

    cluster/afr: Eliminate locking in sh domain in metadata self-heal

    Change-Id: I9ef25a17c9a43ba06fac2ad3f7c18cb47de91537
    BUG: 1170913
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: http://review.gluster.org/9240
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Tested-by: Pranith Kumar Karampuri <pkarampu>
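For readers unfamiliar with the two lock domains, the scheme proposed in the description can be modelled roughly as below. This is a toy Python sketch, not GlusterFS code: `InodeLocks`, `heal_metadata`, `client_setattr` and the domain labels are all hypothetical names, replicas are plain dicts, and in-process `threading.Lock` objects stand in for the real full locks. It only illustrates that a metadata healer taking a blocking lock in the xlator domain, with no sh-domain lock at all, is serialized against both other healers and client metadata FOPs.

```python
import threading
from contextlib import contextmanager

# Hypothetical labels for the two lock domains named in the description.
XLATOR_DOMAIN = "xlator"
SELF_HEAL_DOMAIN = "self-heal"  # shown for contrast; unused by metadata heal


class InodeLocks:
    """Toy lock table: one independent full lock per (gfid, domain) pair."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()

    @contextmanager
    def full_lock(self, gfid, domain):
        # Look up (or create) the lock for this inode in this domain.
        with self._guard:
            lock = self._locks.setdefault((gfid, domain), threading.Lock())
        lock.acquire()  # blocking acquisition
        try:
            yield
        finally:
            lock.release()  # quick unlock once the short operation is done


def heal_metadata(locks, gfid, source, sinks):
    """Copy metadata from the source replica to every sink.

    Only the xlator-domain lock is taken, so healers and clients
    contend on the same lock and never interleave.
    """
    with locks.full_lock(gfid, XLATOR_DOMAIN):
        for sink in sinks:
            sink.clear()
            sink.update(source)


def client_setattr(locks, gfid, replicas, key, value):
    """A client metadata FOP: takes the same xlator-domain lock."""
    with locks.full_lock(gfid, XLATOR_DOMAIN):
        for replica in replicas:
            replica[key] = value
```

Because the critical section in `heal_metadata` is short regardless of file size, a client calling `client_setattr` waits only briefly, which is the point made in item (c) of the description.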
REVIEW: http://review.gluster.org/9246 (cluster/afr: Eliminate locking in sh domain in metadata self-heal) posted (#1) for review on release-3.6 by Krutika Dhananjay (kdhananj)
COMMIT: http://review.gluster.org/9246 committed in release-3.6 by Raghavendra Bhat (raghavendra)
------
commit 1f67254c4d6e6b44a8deee47b66e57d92adeffde
Author: Krutika Dhananjay <kdhananj>
Date: Fri Dec 5 11:16:07 2014 +0530

    cluster/afr: Eliminate locking in sh domain in metadata self-heal

    Backport of: http://review.gluster.org/9240

    Change-Id: I2ab2ad9a02d88c299cfb32e0cf6baa44d1c2ee12
    BUG: 1171077
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: http://review.gluster.org/9246
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Reviewed-by: Raghavendra Bhat <raghavendra>
    Tested-by: Raghavendra Bhat <raghavendra>
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-v3.6.2, please open a new bug report.

glusterfs-v3.6.2 has been announced on the Gluster mailing lists [1]. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2015/01/glusterfs-3-6-2-ga-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user