+++ This bug was initially created as a clone of Bug #1301401 +++

Description of problem:

Mis-behaving brick clients (gNFSd, FUSE, gfAPI) can cause cluster instability and eventual complete unavailability due to failures in releasing entry/inode locks in a timely manner. Classic symptoms of this are increased brick (and/or gNFSd) memory usage due to the high number of (lock request) frames piling up in the processes. The failure mode results in bricks eventually slowing down to a crawl due to swapping, or OOMing due to complete memory exhaustion; during this period the entire cluster can begin to fail. End-users will experience this as hangs on the filesystem, first in a specific region of the filesystem and ultimately the entire filesystem, as the offending brick begins to turn into a zombie (i.e. not quite dead, but not quite alive either).

Currently, these situations must be handled by an administrator detecting & intervening via the "clear-locks" CLI command. Unfortunately this doesn't scale for large numbers of clusters, and it depends on the correct (external) detection of the locks piling up (for which there is little signal other than state dumps).

This patch introduces two features to remedy this situation:

1. Monkey-unlocking - This is a feature targeted at developers (only!) to help track down crashes due to stale locks, and to prove the utility of the lock revocation feature. It does this by silently dropping 1% of unlock requests, simulating bugs or mis-behaving clients. The feature is activated via:

   features.locks-monkey-unlocking <on/off>

   You'll see the message "[<timestamp>] W [inodelk.c:653:pl_inode_setlk] 0-groot-locks: MONKEY LOCKING (forcing stuck lock)!" in the logs, indicating a request has been dropped.

2. Lock revocation - Once enabled, this feature will revoke a contended lock based on the amount of time the lock has been held, how many other lock requests are waiting on the lock to be freed, or some combination of both. Clients which are losing their locks will be notified by receiving EAGAIN (sent back to their callback function). The feature is activated via these options:

   features.locks-revocation-secs <integer; 0 to disable>
   features.locks-revocation-clear-all [on/off]
   features.locks-revocation-max-blocked <integer>

   The recommended time-based timeout is 1800 seconds (give clients the benefit of the doubt). Choosing a max-blocked value requires some experimentation depending on your workload, but generally values of hundreds to low thousands work (it's normal for many tens of locks to be taken out when files are being written at high throughput). An example command sequence is shown further below.

Version-Release number of selected component (if applicable):
Clean patch-set provided for GlusterFS v3.7.6; v3.6 patches are available upon request.

How reproducible:
- Without using monkey-unlocking these situations are extremely difficult to reproduce.
- 100% by turning on monkey-unlocking; a crash bug was immediately detected using this feature (and a fix is included with this patch: changes to xlators/features/locks/src/clear.c).

Steps to Reproduce:
First you will need TWO fuse mounts for this repro. Call them /mnt/patchy1 & /mnt/patchy2.

1. Enable monkey unlocking on the volume:
   gluster vol set patchy features.locks-monkey-unlocking on

2. From "patchy1", use dd or some other utility to begin writing to a file; eventually the dd will hang due to the dropped unlock requests. This now simulates the broken client.
   Run: for i in {1..1000}; do dd if=/dev/zero of=/mnt/patchy1/testfile bs=1k count=10; done
   ...this will eventually hang as the unlock request has been lost.

3. Go to another window, set up the mount "patchy2" @ /mnt/patchy2, and observe that 'echo "hello" >> /mnt/patchy2/testfile' will hang due to the inability of the client to take out the required lock.

4. Next, re-start the test, this time enabling lock revocation; use a timeout of 2-5 seconds for testing:
   gluster vol set patchy features.locks-revocation-secs <2-5>

5. Wait 2-5 seconds before executing step 3 above this time. Observe that this time the access to the file will succeed, and the writes on patchy1 will unblock until they hit another failed unlock request due to "monkey-unlocking".

Actual results:
n/a

Expected results:
n/a

Additional info:

--- Additional comment from on 2016-01-24 19:37 EST ---

Prove test for lock revocation feature.

--- Additional comment from Vijay Bellur on 2016-06-27 13:47:46 EDT ---

REVIEW: http://review.gluster.org/14816 (features/locks: Add lock revocation functionality to posix locks translator) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Vijay Bellur on 2016-06-27 13:47:49 EDT ---

REVIEW: http://review.gluster.org/14817 (Revert "tests: remove tests for clear-locks") posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)
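For reference, applying the recommended settings from the description above could look roughly like the following. The option names and the "gluster vol set" syntax are the ones used in this report; the volume name "patchy" and the max-blocked value are only illustrative and should be tuned per workload:

   # Revoke a contended lock after 30 minutes, per the recommendation above.
   gluster vol set patchy features.locks-revocation-secs 1800
   # Also revoke once this many blocked requests pile up behind a lock
   # (illustrative value; typically hundreds to low thousands).
   gluster vol set patchy features.locks-revocation-max-blocked 512
   # Optionally clear ALL locks on the inode when the revocation threshold is hit.
   gluster vol set patchy features.locks-revocation-clear-all on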
REVIEW: http://review.gluster.org/14816 (features/locks: Add lock revocation functionality to posix locks translator) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/14817 (Revert "tests: remove tests for clear-locks") posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/14816 (features/locks: Add lock revocation functionality to posix locks translator) posted (#3) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/14817 (Revert "tests: remove tests for clear-locks") posted (#3) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/14816 (features/locks: Add lock revocation functionality to posix locks translator) posted (#4) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/14817 (Revert "tests: remove tests for clear-locks") posted (#4) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/14816 (features/locks: Add lock revocation functionality to posix locks translator) posted (#5) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/14817 (Revert "tests: remove tests for clear-locks") posted (#5) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/14816 (features/locks: Add lock revocation functionality to posix locks translator) posted (#6) for review on master by Pranith Kumar Karampuri (pkarampu)
COMMIT: http://review.gluster.org/14816 committed in master by Pranith Kumar Karampuri (pkarampu)
------
commit 8cbee639520bf4631ce658e2da9b4bc3010d2eaa
Author: Richard Wareing <rwareing>
Date:   Fri Nov 20 10:59:00 2015 -0800

features/locks: Add lock revocation functionality to posix locks translator

Summary:
- Motivation: Prevents cluster instability by mis-behaving clients causing bricks to OOM due to inode/entry lock pile-ups.
- Adds option to strip clients of entry/inode locks after N seconds
- Adds option to clear ALL locks should the revocation threshold get hit
- Adds option to clear all or granted locks should the max-blocked threshold get hit (can be used in combination w/ revocation-clear-all)
- Options are:
  features.locks-revocation-secs <integer; 0 to disable>
  features.locks-revocation-clear-all [on/off]
  features.locks-revocation-max-blocked <integer>
- Adds monkey-locking option to ignore 1% of unlock requests (dev only):
  features.locks-monkey-unlocking [on/off]
- Adds logging to indicate revocation event & reason

Test Plan:
First you will need TWO fuse mounts for this repro. Call them /mnt/patchy1 & /mnt/patchy2.

1. Enable monkey unlocking on the volume:
   gluster vol set patchy features.locks-monkey-unlocking on
2. From "patchy1", use dd or some other utility to begin writing to a file; eventually the dd will hang due to the dropped unlock requests. This now simulates the broken client.
   Run: for i in {1..1000}; do dd if=/dev/zero of=/mnt/patchy1/testfile bs=1k count=10; done
   ...this will eventually hang as the unlock request has been lost.
3. Go to another window, set up the mount "patchy2" @ /mnt/patchy2, and observe that 'echo "hello" >> /mnt/patchy2/testfile' will hang due to the inability of the client to take out the required lock.
4. Next, re-start the test, this time enabling lock revocation; use a timeout of 2-5 seconds for testing:
   gluster vol set patchy features.locks-revocation-secs <2-5>
5. Wait 2-5 seconds before executing step 3 above this time. Observe that this time the access to the file will succeed, and the writes on patchy1 will unblock until they hit another failed unlock request due to "monkey-unlocking".

BUG: 1350867
Change-Id: I814b9f635fec53834a26db634d1300d9a61057d8
Signed-off-by: Pranith Kumar K <pkarampu>
Reviewed-on: http://review.gluster.org/14816
NetBSD-regression: NetBSD Build System <jenkins.org>
Reviewed-by: Krutika Dhananjay <kdhananj>
CentOS-regression: Gluster Build System <jenkins.org>
Smoke: Gluster Build System <jenkins.org>
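The test plan above can be driven from a shell in one pass. A rough sketch follows, assuming the volume "patchy" is already mounted at /mnt/patchy1 and /mnt/patchy2 as described; the 5-second timeout, the sleep, and the background/foreground sequencing are illustrative choices, not part of the patch:

   #!/bin/bash
   # Enable the developer-only fault injection and a short revocation timeout.
   gluster vol set patchy features.locks-monkey-unlocking on
   gluster vol set patchy features.locks-revocation-secs 5

   # Step 2: writer on the first mount; it stalls once an unlock request is
   # dropped by monkey-unlocking, so run the loop in the background.
   ( for i in {1..1000}; do
       dd if=/dev/zero of=/mnt/patchy1/testfile bs=1k count=10 2>/dev/null
     done ) &
   writer_pid=$!

   # Step 5: give the writer time to get stuck and the revocation timeout to expire.
   sleep 10

   # Step 3: with revocation enabled this write should succeed instead of hanging.
   echo "hello" >> /mnt/patchy2/testfile && echo "second mount write succeeded"

   kill "$writer_pid" 2>/dev/null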
COMMIT: http://review.gluster.org/14817 committed in master by Jeff Darcy (jdarcy)
------
commit beeaa074611433d6c650823624569f90025160d3
Author: Pranith Kumar K <pkarampu>
Date:   Mon Jun 27 14:56:04 2016 +0530

Revert "tests: remove tests for clear-locks"

This reverts commit 0086a55bb7de1ef5dc7a24583f5fc2b560e835fd.

As part of Richard's patch for the lock-revocation feature this bug is completely fixed (I think, at least ;-) ). So bringing these tests back so that we will find out if there are any more things we need to address in this code path.

BUG: 1350867
Change-Id: If1440fc83b376576ae1a77b1156188a6bf53fe3a
Signed-off-by: Pranith Kumar K <pkarampu>
Reviewed-on: http://review.gluster.org/14817
NetBSD-regression: NetBSD Build System <jenkins.org>
CentOS-regression: Gluster Build System <jenkins.org>
Smoke: Gluster Build System <jenkins.org>
Reviewed-by: Jeff Darcy <jdarcy>
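For context, the manual intervention mentioned in the problem description (and exercised by the tests restored here) goes through the clear-locks CLI. Against the repro volume it would be invoked roughly as follows; the file path and the chosen lock kind are illustrative:

   # Drop all (granted and blocked) inode locks on the stuck file;
   # the path is relative to the volume root.
   gluster volume clear-locks patchy /testfile kind all inode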
REVIEW: http://review.gluster.org/16086 (features/locks: Add lock revocation functionality to posix locks translator) posted (#1) for review on release-3.8-fb by Shreyas Siravara (sshreyas)
REVIEW: http://review.gluster.org/16086 (features/locks: Add lock revocation functionality to posix locks translator) posted (#2) for review on release-3.8-fb by Shreyas Siravara (sshreyas)
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.9.0, please open a new bug report.

glusterfs-3.9.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2016-November/029281.html
[2] https://www.gluster.org/pipermail/gluster-users/