Bug 1301401 - RFE: FEATURE: Lock revocation for features/locks xlator
Summary: RFE: FEATURE: Lock revocation for features/locks xlator
Alias: None
Product: GlusterFS
Classification: Community
Component: locks
Version: 3.7.6
Hardware: All
OS: All
Target Milestone: ---
Assignee: Pranith Kumar K
QA Contact:
Depends On:
Blocks: 1350867
Reported: 2016-01-24 20:35 UTC by rwareing
Modified: 2017-03-08 11:00 UTC

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
: 1350867 (view as bug list)
Last Closed: 2017-03-08 11:00:33 UTC
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:

Attachments
Clean patch for v3.7.6 tag in github repo. (20.35 KB, application/mbox)
2016-01-24 20:35 UTC, rwareing
Prove test for lock revocation->tests/features/lock_revocation.t (1.20 KB, text/plain)
2016-01-25 00:37 UTC, rwareing

Description rwareing 2016-01-24 20:35:22 UTC
Created attachment 1117722
Clean patch for v3.7.6 tag in github repo.

Description of problem:
Misbehaving brick clients (gNFSd, FUSE, gfAPI) can cause cluster instability and eventually complete unavailability by failing to release entry/inode locks in a timely manner.

Classic symptoms of this are increased brick (and/or gNFSd) memory usage due to the high number of (lock request) frames piling up in the processes.  In this failure mode, bricks eventually slow to a crawl due to swapping, or are OOM-killed due to complete memory exhaustion; during this period the entire cluster can begin to fail.  End users will experience this as hangs on the filesystem, first in a specific region of the filesystem and ultimately across the entire filesystem as the offending brick begins to turn into a zombie (i.e. not quite dead, but not quite alive either).

Currently, these situations must be handled by an administrator detecting & intervening via the "clear-locks" CLI command.  Unfortunately this doesn't scale for large numbers of clusters, and it depends on the correct (external) detection of the locks piling up (for which there is little signal other than state dumps).

This patch introduces two features to remedy this situation:

1. Monkey-unlocking - This is a feature targeted at developers (only!) to help track down crashes due to stale locks, and to prove the utility of the lock revocation feature.  It does this by silently dropping 1% of unlock requests, simulating bugs or misbehaving clients.

The feature is activated via:
features.locks-monkey-unlocking <on/off>

You'll see the message
"[<timestamp>] W [inodelk.c:653:pl_inode_setlk] 0-groot-locks: MONKEY LOCKING (forcing stuck lock)!" in the logs indicating a request has been dropped.
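To confirm that monkey-unlocking is actually dropping requests, the warning above can be counted in a brick log. This is a minimal sketch: count_monkey_drops is a hypothetical helper name, and the log file location is an assumption about your install (brick logs commonly live under /var/log/glusterfs/bricks/).

```shell
# count_monkey_drops LOGFILE
# Prints how many lines in LOGFILE contain the monkey-unlocking warning.
# (Hypothetical helper; the message text is taken from the report above.)
count_monkey_drops() {
    # -F matches the text literally; -c prints the number of matching lines.
    # grep exits 1 when the count is 0, so "|| true" keeps the status at 0.
    grep -c -F 'MONKEY LOCKING (forcing stuck lock)!' "$1" || true
}

# Example (path is a placeholder, adjust to your brick log):
#   count_monkey_drops /var/log/glusterfs/bricks/<brick-log-file>
```

A non-zero count means at least one unlock request was dropped on that brick.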

2. Lock revocation - Once enabled, this feature will revoke a contended lock based on how long the lock has been held, how many other lock requests are waiting for it to be freed, or a combination of both.  Clients that lose their locks are notified by receiving EAGAIN (sent back to their callback function).

The feature is activated via these options:
features.locks-revocation-secs <integer; 0 to disable>
features.locks-revocation-clear-all [on/off]
features.locks-revocation-max-blocked <integer>
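
How the two thresholds might combine can be sketched as a small decision function. This is an illustration only, not the code in the patch: should_revoke is a hypothetical name, and both the "either threshold triggers revocation" rule and treating a max-blocked value of 0 as "disabled" are assumptions (the report states the 0-disables behaviour only for revocation-secs). The clear-all option is not modelled here.

```shell
# should_revoke AGE_SECS BLOCKED_COUNT REVOCATION_SECS MAX_BLOCKED
# Returns 0 (true) if a contended lock would be revoked under this sketch,
# 1 otherwise. A threshold of 0 disables that check (assumption for
# MAX_BLOCKED; stated in the report for REVOCATION_SECS).
should_revoke() {
    age=$1 blocked=$2 secs=$3 max_blocked=$4

    # Too many requests are queued behind the lock.
    if [ "$max_blocked" -gt 0 ] && [ "$blocked" -ge "$max_blocked" ]; then
        return 0
    fi

    # The lock has been held for too long.
    if [ "$secs" -gt 0 ] && [ "$age" -ge "$secs" ]; then
        return 0
    fi

    return 1
}

# Example: a lock held for 2000s with 5 waiters, thresholds 1800s / 1000:
#   should_revoke 2000 5 1800 1000 && echo "revoke"
```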

Recommended settings are: 1800 seconds for a time-based timeout (give clients the benefit of the doubt).  Choosing a max-blocked value requires some experimentation depending on your workload, but values of hundreds to low thousands are generally suitable (it's normal for many tens of locks to be taken out when files are being written at high throughput).
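
As a concrete illustration of these recommendations, the options could be applied as below. The volume name "patchy" is borrowed from the reproduction steps, and 1000 is only an example value inside the suggested "hundreds to low thousands" range, not a tested figure. The option names are the ones proposed in this report; a given release may name them differently, so check "gluster volume set help" first.

```shell
# Revoke contended locks held longer than 30 minutes (1800 s).
gluster volume set patchy features.locks-revocation-secs 1800

# Revoke when this many requests are blocked behind a lock
# (example value; tune for your workload).
gluster volume set patchy features.locks-revocation-max-blocked 1000
```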

Version-Release number of selected component (if applicable):
Clean patch set provided for GlusterFS v3.7.6; v3.6 patches are available upon request.

How reproducible:
- Without using monkey-unlocking these situations are extremely difficult to reproduce.
- 100% by turning on monkey-unlocking; a crash bug was immediately detected using this feature (and a fix is included with this patch: changes to xlators/features/locks/src/clear.c).

Steps to Reproduce:
First you will need TWO fuse mounts for this repro.  Call them /mnt/patchy1 & /mnt/patchy2.

1. Enable monkey unlocking on the volume:
gluster vol set patchy features.locks-monkey-unlocking on

2. From the "patchy1" mount, use dd or some other utility to begin writing to a file; eventually the dd will hang due to the dropped unlock requests.  This simulates the broken client.  Run:

for i in {1..1000}; do dd if=/dev/zero of=/mnt/patchy1/testfile bs=1k count=10; done

...this will eventually hang as the unlock request has been lost.

3. Go to another window and set up the mount "patchy2" at /mnt/patchy2, and observe that 'echo "hello" >> /mnt/patchy2/testfile' will hang because the client is unable to take out the required lock.

4. Next, restart the test, this time enabling lock revocation; use a timeout of 2-5 seconds for testing: 'gluster vol set patchy features.locks-revocation-secs <2-5>'

5. This time, wait 2-5 seconds before executing step 3 above.  Observe that the access to the file now succeeds, and that the writes on patchy1 unblock until they hit another failed unlock request due to "monkey-unlocking".
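
To tell a hang (step 3) from a successful write (step 5) without tying up a terminal, the write can be wrapped in a timeout. This is a sketch: probe_write is a hypothetical helper, it relies on the coreutils timeout command, and it assumes the blocked writer can be interrupted by a signal; if it cannot, timeout itself may not return promptly.

```shell
# probe_write FILE [SECONDS]
# Appends one line to FILE, giving up after SECONDS (default 10).
# Returns 0 if the write completed, or 124 (timeout's exit status for an
# expired timer) if it was still blocked when the timer ran out.
probe_write() {
    file=$1
    secs=${2:-10}
    # Run the append in a child shell so timeout can signal it.
    timeout "$secs" sh -c 'echo "hello" >> "$1"' sh "$file"
}

# Example, from the second mount:
#   probe_write /mnt/patchy2/testfile 10 && echo "write ok" || echo "blocked or failed"
```

With lock revocation disabled, the probe is expected to time out once patchy1 has hung; with revocation enabled and the timeout elapsed, it is expected to return 0.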

Actual results:

Expected results:

Additional info:

Comment 1 rwareing 2016-01-25 00:37:53 UTC
Created attachment 1117735
Prove test for lock revocation->tests/features/lock_revocation.t

Prove test for lock revocation feature.

Comment 2 Vijay Bellur 2016-06-27 17:47:46 UTC
REVIEW: http://review.gluster.org/14816 (features/locks: Add lock revocation functionality to posix locks translator) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu@redhat.com)

Comment 3 Vijay Bellur 2016-06-27 17:47:49 UTC
REVIEW: http://review.gluster.org/14817 (Revert "tests: remove tests for clear-locks") posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu@redhat.com)

Comment 4 Kaushal 2017-03-08 11:00:33 UTC
This bug is getting closed because GlusterFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.
