+++ This bug was initially created as a clone of Bug #1297172 +++

Description of problem:
If a lookup or a read-transaction FOP triggers an inode refresh, the FOP does not return until the heal completes. For VM use cases, this can make the VM appear unresponsive until the heal finishes.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce (a scripted sketch of these steps follows below):
1. Create a 1x2 replica volume, fuse-mount it, and create a file.
2. Disable the self-heal daemon.
3. Kill a brick, then `dd` a few gigabytes into the file.
4. Bring the brick back up and run a hexdump of the file from the mount.
5. The hexdump stalls (stops printing output) until the data heal completes (as seen in the mount log).

Actual results:
The FOP blocks until the heal is done.

Expected results:
The FOP should not wait for heals; they could be made to happen in the background.
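For convenience, the reproduction steps can be scripted roughly as follows. This is a minimal sketch, not taken from the bug report: the volume name 'testvol', hosts server1/server2, brick paths under /bricks, the mount point /mnt/testvol, and the 2 GiB write size are all hypothetical; the brick PID placeholder must be filled in by hand.

    # Hypothetical names/paths: volume 'testvol', servers server1/server2,
    # bricks under /bricks, fuse mount at /mnt/testvol. Adjust as needed.
    gluster volume create testvol replica 2 server1:/bricks/b1 server2:/bricks/b2
    gluster volume start testvol
    mount -t glusterfs server1:/testvol /mnt/testvol

    # Disable the self-heal daemon so only client-side heals can run.
    gluster volume set testvol cluster.self-heal-daemon off

    # Kill one brick process (find its PID with 'gluster volume status testvol'),
    # then write a few gigabytes into a file while the brick is down.
    kill -9 <brick-pid>
    dd if=/dev/urandom of=/mnt/testvol/file bs=1M count=2048

    # Restart the killed brick and read the file back from the mount.
    # Before the fix, the read stalls until the data heal completes.
    gluster volume start testvol force
    hexdump -C /mnt/testvol/file >/dev/null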
--- Additional comment from Vijay Bellur on 2016-01-10 02:21:14 EST ---

REVIEW: http://review.gluster.org/13207 (afr: Add throttled background client-side heals) posted (#1) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Vijay Bellur on 2016-01-12 05:42:45 EST ---

REVIEW: http://review.gluster.org/13207 (afr: Add throttled background client-side heals) posted (#2) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Brad Hubbard on 2016-01-21 19:29:38 EST ---

Raising severity based on the bugs depending on this one.

--- Additional comment from Vijay Bellur on 2016-02-01 06:52:28 EST ---

REVIEW: http://review.gluster.org/13207 (afr: Add throttled background client-side heals) posted (#3) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Vijay Bellur on 2016-02-03 01:02:15 EST ---

REVIEW: http://review.gluster.org/13207 (afr: Add throttled background client-side heals) posted (#4) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Vijay Bellur on 2016-02-05 07:27:17 EST ---

REVIEW: http://review.gluster.org/13207 (afr: Add throttled background client-side heals) posted (#5) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Vijay Bellur on 2016-02-25 03:22:55 EST ---

REVIEW: http://review.gluster.org/13207 (afr: Add throttled background client-side heals) posted (#6) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Vijay Bellur on 2016-02-25 21:13:01 EST ---

REVIEW: http://review.gluster.org/13207 (afr: Add throttled background client-side heals) posted (#7) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Vijay Bellur on 2016-02-27 23:06:59 EST ---

REVIEW: http://review.gluster.org/13207 (afr: Add throttled background client-side heals) posted (#8) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Vijay Bellur on 2016-02-28 07:14:43 EST ---

REVIEW: http://review.gluster.org/13207 (afr: Add throttled background client-side heals) posted (#9) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Vijay Bellur on 2016-03-01 06:23:32 EST ---

COMMIT: http://review.gluster.org/13207 committed in master by Pranith Kumar Karampuri (pkarampu)

------

commit 8210ca1a5c0e78e91c6fab7df7e002e39660b706
Author: Ravishankar N <ravishankar>
Date:   Sun Jan 10 09:19:34 2016 +0530

    afr: Add throttled background client-side heals

    If a heal is needed after inode refresh (lookup, read_txn), launch it
    in the background instead of blocking the fop (that triggered the
    refresh) until the heal happens. afr_replies_interpret() is modified
    so that the heal is launched only if at least one sink brick is up.
    The maximum number of heals that can happen in parallel is
    configurable via the 'background-self-heal-count' volume option. Any
    heal beyond that count is put in a wait queue whose length is
    configurable via the 'heal-wait-queue-length' volume option. If the
    wait queue is also full, further heals are ignored.

    Default values: background-self-heal-count=8, heal-wait-queue-length=128

    Change-Id: I1d4a52814cdfd43d90591b6d2ad7b6219937ce70
    BUG: 1297172
    Signed-off-by: Ravishankar N <ravishankar>
    Reviewed-on: http://review.gluster.org/13207
    Smoke: Gluster Build System <jenkins.com>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Tested-by: Pranith Kumar Karampuri <pkarampu>
    NetBSD-regression: NetBSD Build System <jenkins.org>
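With the fix in place, the heal throttle described in the commit message can be tuned per volume through the usual volume-set interface. A minimal sketch, assuming the options are exposed under the cluster namespace as cluster.background-self-heal-count and cluster.heal-wait-queue-length, and reusing the hypothetical 'testvol' volume from the reproduction sketch above:

    # Allow up to 16 background client-side heals in parallel (default 8).
    gluster volume set testvol cluster.background-self-heal-count 16

    # Queue up to 256 further heals (default 128); heals beyond the queue
    # length are not queued at all for that inode refresh.
    gluster volume set testvol cluster.heal-wait-queue-length 256

    # Inspect the effective values.
    gluster volume get testvol cluster.background-self-heal-count
    gluster volume get testvol cluster.heal-wait-queue-length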
REVIEW: http://review.gluster.org/13564 (afr: Add throttled background client-side heals) posted (#1) for review on release-3.7 by Ravishankar N (ravishankar)
REVIEW: http://review.gluster.org/13564 (afr: Add throttled background client-side heals) posted (#2) for review on release-3.7 by Ravishankar N (ravishankar)
COMMIT: http://review.gluster.org/13564 committed in release-3.7 by Pranith Kumar Karampuri (pkarampu)

------

commit 4a5d8f65b9b04385dcae8b16a650f4e8ed357f8b
Author: Ravishankar N <ravishankar>
Date:   Tue Mar 22 14:26:32 2016 +0530

    afr: Add throttled background client-side heals

    Backport of: http://review.gluster.org/13207

    If a heal is needed after inode refresh (lookup, read_txn), launch it
    in the background instead of blocking the fop (that triggered the
    refresh) until the heal happens. afr_replies_interpret() is modified
    so that the heal is launched only if at least one sink brick is up.
    The maximum number of heals that can happen in parallel is
    configurable via the 'background-self-heal-count' volume option. Any
    heal beyond that count is put in a wait queue whose length is
    configurable via the 'heal-wait-queue-length' volume option. If the
    wait queue is also full, further heals are ignored.

    Default values: background-self-heal-count=8, heal-wait-queue-length=128

    Change-Id: I9a134b2c29d66b70b7b1278811bd504963aabacc
    BUG: 1313312
    Signed-off-by: Ravishankar N <ravishankar>
    Reviewed-on: http://review.gluster.org/13564
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
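Since heals now run in the background, their progress can be observed directly rather than inferred from a blocked read. A brief sketch using standard heal-monitoring commands, again with the hypothetical 'testvol' volume and mount point:

    # List files still needing heal on each brick.
    gluster volume heal testvol info

    # Show the number of entries per brick that still need heal.
    gluster volume heal testvol statistics heal-count

    # The client (mount) log also records heal launches and completions,
    # e.g. /var/log/glusterfs/mnt-testvol.log (name derived from the mountpoint).
    tail -f /var/log/glusterfs/mnt-testvol.log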
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.7.10, please open a new bug report.

glusterfs-3.7.10 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and on the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/gluster-users/2016-April/026164.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user