Steps to Reproduce:
1. Create a 6-node gluster cluster.
2. Create an EC volume 2 x (4 + 2) and enable GNFS on the volume.
3. Mount the volume on 2 clients.
4. Create a file, say file1, of 512 bytes from client 1.
5. Take a lock on the file from client 1.
6. Try taking a lock on the same file from client 2. (The lock is not granted to client 2 because it is already held by client 1.)
7. Release the lock from client 1.

Client 1:
-----
[root@dhcp37-192 home]# ./a.out /mnt/disperse/file1
opening /mnt/disperse/file1
opened; hit Enter to lock...
locking
locked; hit Enter to write...
Write succeeded
locked; hit Enter to unlock...
unlocking
-----

Client 2:
-----
[root@dhcp37-142 home]# ./a.out /mnt/disperse1/file1
opening /mnt/disperse1/file1
opened; hit Enter to lock...
locking
-----

Actual results:
Client 1 hangs and is unable to release the lock on the file.

Expected results:
Client 1 should be able to release the lock.
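For reference, here is a minimal sketch of a locking test program along the lines of the a.out used above. This is a reconstruction from the output shown, not the original source: it opens the file, takes a blocking exclusive fcntl lock, writes one byte, and unlocks, pausing for Enter between steps.

-----
/* lock_test.c -- hypothetical reconstruction of the a.out reproducer.
 * Build with: gcc lock_test.c  (produces ./a.out) */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

static void wait_enter(void)
{
    int c;
    do {
        c = getchar();
    } while (c != '\n' && c != EOF);
}

int main(int argc, char *argv[])
{
    struct flock fl;
    int fd;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    printf("opening %s\n", argv[1]);
    fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    printf("opened; hit Enter to lock...\n");
    wait_enter();

    printf("locking\n");
    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;           /* exclusive lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                  /* whole file */
    if (fcntl(fd, F_SETLKW, &fl) < 0) {   /* blocks until granted */
        perror("fcntl(F_SETLKW)");
        return 1;
    }

    printf("locked; hit Enter to write...\n");
    wait_enter();
    if (write(fd, "x", 1) == 1)
        printf("Write succeeded\n");
    else
        perror("write");

    printf("locked; hit Enter to unlock...\n");
    wait_enter();

    printf("unlocking\n");
    fl.l_type = F_UNLCK;
    fcntl(fd, F_SETLK, &fl);

    close(fd);
    return 0;
}
-----

Run one instance per client against the same file; the second instance's fcntl(F_SETLKW) blocks until the first unlocks, which is what the transcripts above show.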
REVIEW: https://review.gluster.org/17556 (cluster/ec: lk shouldn't be a transaction) posted (#1) for review on release-3.11 by Pranith Kumar Karampuri (pkarampu)
COMMIT: https://review.gluster.org/17556 committed in release-3.11 by Shyamsundar Ranganathan (srangana)
------
commit 6e377faf4490f20a63634c8baecb76886c0dac8a
Author: Pranith Kumar K <pkarampu>
Date:   Tue Jun 13 23:35:40 2017 +0530

    cluster/ec: lk shouldn't be a transaction

    Problem:
    When the application sends a blocking lock, the lk fop actually waits
    under an inodelk. This can lead to a deadlock:
    1) Say app-1 takes an exclusive fcntl lock on the file.
    2) app-2 attempts an exclusive fcntl lock on the file, which goes to
       the blocking stage. Note: app-2 is blocked inside a transaction
       which holds an inode lock.
    3) app-1 tries to perform a write, which needs the inode lock, so it
       gets blocked on app-2 to release the inodelk, while app-2 is
       blocked on app-1 to release the fcntl lock.

    Fix:
    The correct way to fix this issue and make fcntl locks perform well
    would be to introduce 2-phase locking for fcntl locks:
    1) Implement a try-lock phase in which the locks xlator will not
       merge the lk call with existing calls until a commit-lock phase.
    2) If the try-lock phase gets a quorum number of successes without
       any EAGAIN error, send a commit-lock, which will merge the locks.
    3) In case there are any errors, unlock should just delete the lock
       object that was tried earlier and shouldn't touch the committed
       locks.
    Unfortunately this is a sizeable feature and needs to be thought
    through for corner cases. Until then, remove the transaction from
    the lk call.

    >BUG: 1455049
    >Change-Id: I18a782903ba0eb43f1e6526fb0cf8c626c460159
    >Signed-off-by: Pranith Kumar K <pkarampu>
    >Reviewed-on: https://review.gluster.org/17542
    >Smoke: Gluster Build System <jenkins.org>
    >NetBSD-regression: NetBSD Build System <jenkins.org>
    >CentOS-regression: Gluster Build System <jenkins.org>
    >Reviewed-by: Ashish Pandey <aspandey>
    >Reviewed-by: Xavier Hernandez <xhernandez>

    BUG: 1462121
    Change-Id: I18a782903ba0eb43f1e6526fb0cf8c626c460159
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: https://review.gluster.org/17556
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
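To make the proposed (but unimplemented) two-phase scheme concrete, here is a rough, compilable sketch of the try/commit flow described in the commit message. Every name in it (brick_try_lock, brick_commit_lock, brick_unlock_tried) and the quorum value are hypothetical, for illustration only; this is not gluster xlator code.

-----
/* Illustrative sketch only: the try-lock / commit-lock scheme outlined
 * in the commit message above. All functions are stubs standing in for
 * per-brick calls to the locks xlator; nothing here is real gluster API. */
#include <errno.h>

enum { BRICKS = 6, QUORUM = 4 };   /* assumed values for a 4+2 volume */

/* Phase 1: ask a brick to hold the lock as a separate "tried" object
 * that is not yet merged with existing locks. */
static int brick_try_lock(int brick)     { (void)brick; return 0; }
/* Phase 2: merge the tried lock into the committed locks. */
static int brick_commit_lock(int brick)  { (void)brick; return 0; }
/* Roll back: delete only the tried object; committed locks untouched. */
static void brick_unlock_tried(int brick){ (void)brick; }

static int two_phase_lk(void)
{
    int i, successes = 0, eagain = 0;

    /* Phase 1: try-lock on every brick. */
    for (i = 0; i < BRICKS; i++) {
        int ret = brick_try_lock(i);
        if (ret == 0)
            successes++;
        else if (ret == -EAGAIN)
            eagain = 1;
    }

    /* Phase 2: commit only on quorum success with no EAGAIN. */
    if (successes >= QUORUM && !eagain) {
        for (i = 0; i < BRICKS; i++)
            brick_commit_lock(i);
        return 0;
    }

    /* On failure, delete only the tried lock objects. */
    for (i = 0; i < BRICKS; i++)
        brick_unlock_tried(i);
    return -EAGAIN;
}

int main(void)
{
    return two_phase_lk() == 0 ? 0 : 1;
}
-----

Note that the committed patch does not implement this; it simply removes the transaction (and hence the inodelk) around the lk call, which is enough to break the deadlock cycle described above.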
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.11.1, please open a new bug report.

glusterfs-3.11.1 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-June/000074.html
[2] https://www.gluster.org/pipermail/gluster-users/