Bug 1428053 - features/locks: Add lock revocation functionality to posix locks translator
Summary: features/locks: Add lock revocation functionality to posix locks translator
Alias: None
Product: GlusterFS
Classification: Community
Component: locks
Version: mainline
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
Depends On:
Reported: 2017-03-01 19:04 UTC by Vijay Bellur
Modified: 2018-10-12 07:44 UTC

Fixed In Version: glusterfs-3.12.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2018-10-12 07:44:42 UTC
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:


Description Vijay Bellur 2017-03-01 19:04:56 UTC
features/locks: Add lock revocation functionality to posix locks translator

- Motivation: Prevents mis-behaving clients from destabilizing the cluster
  by causing bricks to OOM through inode/entry lock pile-ups.
- Adds option to strip clients of entry/inode locks after N seconds
- Adds option to clear ALL locks should the revocation threshold get hit
- Adds option to clear all or granted locks should the max-blocked
  threshold get hit (can be used in combination w/ revocation-clear-all).
- Options are:
    features.locks-revocation-secs <integer; 0 to disable>
    features.locks-revocation-clear-all [on/off]
    features.locks-revocation-max-blocked <integer>
- Adds monkey-unlocking option to ignore 1% of unlock requests (dev only)
    features.locks-monkey-unlocking [on/off]
- Adds logging to indicate revocation event & reason
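
As a rough sketch of how the three revocation options listed above might be applied together: the helper below prints the `gluster volume set` commands by default (dry run) and only executes them when DRY_RUN=0. The volume name "patchy" is borrowed from the test plan, and the values 60 / on / 1000 are arbitrary placeholders chosen for illustration, not recommended settings. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: the values below are illustrative assumptions, not tuning advice.
VOLUME="${VOLUME:-patchy}"   # volume name taken from the test plan
DRY_RUN="${DRY_RUN:-1}"      # 1 = print the commands, 0 = actually run them

# set_opt KEY VALUE: set one volume option, or print the command in dry-run mode.
set_opt() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "gluster volume set $VOLUME $1 $2"
    else
        gluster volume set "$VOLUME" "$1" "$2"
    fi
}

# Apply the three revocation options named in this change.
apply_revocation_opts() {
    set_opt features.locks-revocation-secs 60          # 0 would disable revocation
    set_opt features.locks-revocation-clear-all on
    set_opt features.locks-revocation-max-blocked 1000
}

apply_revocation_opts
```

Running the script unmodified only prints the three commands, which makes it easy to review them before setting DRY_RUN=0 on a test volume.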

Test Plan:
First you will need TWO fuse mounts for this repro.  Call them /mnt/patchy1 & /mnt/patchy2.

1. Enable monkey unlocking on the volume:
gluster vol set patchy features.locks-monkey-unlocking on

2. From the "patchy1" mount, use dd or some other utility to begin writing to a
   file.  Eventually the dd will hang due to the dropped unlock requests.  This
   simulates the broken client.  Run:

for i in {1..1000}; do dd if=/dev/zero of=/mnt/patchy1/testfile bs=1k count=10; done

...this will eventually hang as the unlock request has been lost.

3. Go to another window and set up the mount "patchy2" at /mnt/patchy2, and
   observe that 'echo "hello" >> /mnt/patchy2/testfile' will hang due to the
   inability of the client to take out the required lock.

4. Next, restart the test, this time enabling lock revocation; use a timeout of
   2-5 seconds for testing:
   'gluster vol set patchy features.locks-revocation-secs <2-5>'

5. Wait 2-5 seconds before executing step 3 above.  Observe that this time the
   access to the file will succeed, and the writes on patchy1 will unblock
   until they hit another failed unlock request due to monkey unlocking.
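
The checks in steps 3 and 5 can be made less manual with a small probe that attempts one append and gives up after a deadline, so a hung write shows up as a failure instead of a stuck terminal. This is a sketch under assumptions: `probe_write` is a hypothetical helper, the 10-second default is arbitrary, and it relies on coreutils `timeout`, which sends SIGTERM; a process blocked inside a FUSE mount may not always react to that promptly.

```shell
#!/bin/sh
# probe_write FILE [SECONDS]
# Try to append one line to FILE; report failure if the write errors out
# or does not finish within SECONDS (default 10, an arbitrary choice).
probe_write() {
    file="$1"
    limit="${2:-10}"
    # FILE is passed as a positional argument to avoid quoting problems.
    if timeout "$limit" sh -c 'echo "hello" >> "$1"' sh "$file"; then
        echo "write ok"
    else
        echo "write failed or timed out"
        return 1
    fi
}

# Example, using the path from step 3 of the test plan:
# probe_write /mnt/patchy2/testfile 10
```

Without lock revocation the probe would be expected to report a timeout once patchy1 has hung; with features.locks-revocation-secs set, it would be expected to report success after the revocation interval has passed.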

Change-Id: I814b9f635fec53834a26db634d1300d9a61057d8
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14816
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-on: http://review.gluster.org/16086
Tested-by: Shreyas Siravara <sshreyas@fb.com>
Reviewed-by: Kevin Vigor <kvigor@fb.com>

Comment 1 Amar Tumballi 2018-10-12 07:44:42 UTC
the patch is in master!
