Bug 1301805 - Contending exclusive NFS file locks from two hosts breaks locking when blocked host gives up early.
Status: NEW
Product: GlusterFS
Classification: Community
Component: nfs
Version: mainline
Hardware: Unspecified
OS: Linux
Priority: medium
Severity: medium
Assigned To: Soumya Koduri
Keywords: Triaged
Depends On:
Blocks: 1371552 1371554
Reported: 2016-01-25 20:44 EST by Jeff Byers
Modified: 2016-08-30 09:01 EDT
CC: 3 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Jeff Byers 2016-01-25 20:44:37 EST
Contending exclusive NFS file locks from two hosts breaks
locking when blocked host gives up early.

Using the Linux 'flock' utility to test GlusterFS NFS file
locking by contending for an exclusive lock on the same file
from two different hosts results in broken file locking for
that file when the client waiting for the lock gives up on a
timeout or its process is killed. No subsequent file lock
attempt on that file will succeed from any host without a
"gluster volume stop/start", a glusterd restart, or deletion
of the lock file.
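The semantics being exercised can be illustrated locally with a short
sketch (my own illustration, not part of the original report): the
'flock' utility takes a BSD flock(2) lock, which the NFSv3 client maps
to NLM lock requests against the server. The local equivalent:

```python
# Sketch of the flock(2) contention the test exercises, run locally.
# Two open file descriptions on the same file contend for one
# exclusive lock; LOCK_NB fails immediately instead of blocking,
# similar to `flock -n` (while `flock -w 3` blocks for up to 3 s).
import fcntl
import tempfile

tmp = tempfile.NamedTemporaryFile()

holder = open(tmp.name, "w")            # first opener takes the lock
fcntl.flock(holder, fcntl.LOCK_EX)

contender = open(tmp.name, "w")         # second opener contends
try:
    fcntl.flock(contender, fcntl.LOCK_EX | fcntl.LOCK_NB)
    got_lock = True
except BlockingIOError:                 # EWOULDBLOCK: lock is held
    got_lock = False

fcntl.flock(holder, fcntl.LOCK_UN)      # release; contender now succeeds
fcntl.flock(contender, fcntl.LOCK_EX | fcntl.LOCK_NB)
```

With a local filesystem (or the kernel NFS server), releasing or
abandoning the blocked request leaves the lock usable; the bug is that
the Gluster NFS/NLM path does not.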

This problem is known to occur on glusterfs 3.6.5, and is also
said to occur on glusterfs 3.4.2.

The same test plan passes with the native Linux kernel NFS
server: the contended file lock does not become broken.

No workaround for this problem has been found.

1) Locking from each host one at a time works fine:

[Linux-105 ~]# date; flock -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10;  date "+unlocked @ %T"'; date "+sts=$? @ %T"
Mon Jan 25 16:59:17 PST 2016
locked @ 16:59:18
unlocked @ 16:59:28
sts=0 @ 16:59:28

[Linux-121]# date; flock -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10;  date "+unlocked @ %T"'; date "+sts=$? @ %T"
Mon Jan 25 16:59:55 PST 2016
locked @ 16:59:56
unlocked @ 17:00:06
sts=0 @ 17:00:06

2) Locking from both hosts at the same time works fine as long
as the hosts are willing to wait forever, and the second is
not interrupted:

[Linux-105]# date; flock -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10;  date "+unlocked @ %T"'; date "+sts=$? @ %T"
Mon Jan 25 17:03:52 PST 2016
locked @ 17:03:52
unlocked @ 17:04:02
sts=0 @ 17:04:02

[Linux-121]# date; flock -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10;  date "+unlocked @ %T"'; date "+sts=$? @ %T"
Mon Jan 25 17:03:54 PST 2016
locked @ 17:04:02
unlocked @ 17:04:12
sts=0 @ 17:04:12

3) Locking from both hosts at the same time, where the second
one is not willing to wait and times out, works for the first
lock, but leaves the lock file in a state where no host can
ever lock it again without a gluster volume stop/start, a
glusterd restart, or deletion of the lock file:

[Linux-105]# date; flock -w 3 -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10;  date "+unlocked @ %T"'; date "+sts=$? @ %T"
Mon Jan 25 17:07:59 PST 2016
locked @ 17:07:59
unlocked @ 17:08:09
sts=0 @ 17:08:09

[Linux-121]# date; flock -w 3 -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10;  date "+unlocked @ %T"'; date "+sts=$? @ %T"
Mon Jan 25 17:08:01 PST 2016
sts=1 @ 17:08:04

[Linux-105]# date; flock -w 3 -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10;  date "+unlocked @ %T"'; date "+sts=$? @ %T"
Mon Jan 25 17:08:17 PST 2016
sts=1 @ 17:08:20

[Linux-121]# date; flock -w 3 -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10;  date "+unlocked @ %T"'; date "+sts=$? @ %T"
Mon Jan 25 17:08:27 PST 2016
sts=1 @ 17:08:30

On the GlusterFS server, the brick process is then left with
two file handles stuck open on the lock file and one lock;
these normally disappear when the lock is released:

[Gluster-186]# date; lsof /exports/nas-segment-0001/locktest/locktest-1
Mon Jan 25 17:10:58 PST 2016
COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
glusterfs 31421 root   14w   REG   8,48        0  120 /exports/nas-segment-0001/locktest/locktest-1
glusterfs 31421 root   15w   REG   8,48        0  120 /exports/nas-segment-0001/locktest/locktest-1

[Gluster-186]# fgrep 31421 /proc/locks
2: POSIX  ADVISORY  WRITE 31421 08:04:2329801 0 EOF
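For readers unfamiliar with /proc/locks, the fields of the stuck entry
above are: index, lock class, mode, access, holder PID, device:inode,
and byte range. Note it shows as POSIX (an fcntl lock taken by the
brick process on the backend file) even though the client used flock.
A small parsing sketch (`parse_proc_locks_line` is my own hypothetical
helper, not a Gluster tool):

```python
# Parse one /proc/locks line into its named fields.
def parse_proc_locks_line(line):
    fields = line.split()
    dev_inode = fields[5].split(":")        # "major:minor:inode"
    return {
        "class": fields[1],                 # POSIX (fcntl) or FLOCK (flock)
        "mode": fields[2],                  # ADVISORY or MANDATORY
        "access": fields[3],                # READ or WRITE
        "pid": int(fields[4]),              # holder process ID
        "inode": int(dev_inode[2]),
        "range": (fields[6], fields[7]),    # "0 EOF" = whole file
    }

# The stuck lock reported above:
lock = parse_proc_locks_line(
    "2: POSIX  ADVISORY  WRITE 31421 08:04:2329801 0 EOF")
```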

[Gluster-186]# ps -elf |grep 31421
/usr/sbin/glusterfsd -s 10.10.60.186 --volfile-id locktest.10.10.60.186.exports-nas-segment-0001-locktest 

[Linux-105]# rm /mnt/locktest/locktest-1

4) Repeating the same test from two hosts, but without the
"-w 3" timeout option and instead killing the second host's
command with ^C, fails the same way.
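Both give-up paths (timeout and ^C) come down to the blocked lock call
being interrupted by a signal. A local sketch of that mechanism
(my own illustration; over NFS the interrupted client additionally
sends an NLM cancel/unlock, which is where GlusterFS appears to lose
the server-side state):

```python
# Interrupt a blocking flock() with SIGALRM, analogous to the
# `flock -w 3` timeout or killing the waiter with ^C.
import fcntl
import signal
import tempfile

class LockTimeout(Exception):
    pass

def _on_alarm(signum, frame):
    # Raising here makes the blocked flock() call raise this
    # exception instead of being retried (PEP 475 retries on
    # EINTR only when the handler returns normally).
    raise LockTimeout

tmp = tempfile.NamedTemporaryFile()
holder = open(tmp.name, "w")
fcntl.flock(holder, fcntl.LOCK_EX)       # the "other host" holds the lock

signal.signal(signal.SIGALRM, _on_alarm)
signal.alarm(1)                          # give up after ~1 s, like `flock -w 1`
contender = open(tmp.name, "w")
try:
    fcntl.flock(contender, fcntl.LOCK_EX)   # blocks until the alarm fires
    timed_out = False
except LockTimeout:
    timed_out = True
finally:
    signal.alarm(0)                      # cancel any pending alarm
```

Locally the kernel simply discards the abandoned waiter and the lock
stays healthy; per this report, the Gluster NFS server instead leaves
the file permanently locked.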

5) Repeating the same test with two shells on the same host
does *not* exhibit the problem:

[Linux-105]# date; flock -w 3 -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10;  date "+unlocked @ %T"'; date "+sts=$? @ %T"
Mon Jan 25 17:14:21 PST 2016
locked @ 17:14:21
unlocked @ 17:14:31
sts=0 @ 17:14:31

[Linux-105]# date; flock -w 3 -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10;  date "+unlocked @ %T"'; date "+sts=$? @ %T"
Mon Jan 25 17:14:23 PST 2016
sts=1 @ 17:14:26

[Linux-105]# date; flock -w 3 -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10;  date "+unlocked @ %T"'; date "+sts=$? @ %T"
Mon Jan 25 17:14:39 PST 2016
locked @ 17:14:39
unlocked @ 17:14:49
sts=0 @ 17:14:49

[Linux-105]# date; flock -w 3 -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10;  date "+unlocked @ %T"'; date "+sts=$? @ %T"
Mon Jan 25 17:14:53 PST 2016
locked @ 17:14:53
unlocked @ 17:15:03
sts=0 @ 17:15:03

6) Additional information:

[Gluster-186]# glusterd -V
glusterfs 3.6.5 built on Sep  2 2015 12:35:56

[Gluster-186]# gluster volume info locktest
Volume Name: locktest
Type: Distribute
Volume ID: f56cb000-47b8-49db-b885-a8ab50333dd2
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 10.10.60.186:/exports/nas-segment-0001/locktest
Options Reconfigured:
nfs.rpc-auth-allow: *
server.allow-insecure: on
performance.quick-read: off
performance.stat-prefetch: off
nfs.disable: off
nfs.addr-namelookup: off

[Gluster-186]# lsmod|egrep 'nfs|lock'

[Gluster-186]# rpcinfo -p
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100005    3   tcp  38465  mountd
    100005    1   tcp  38466  mountd
    100003    3   tcp   2049  nfs
    100021    4   tcp  38468  nlockmgr
    100021    1   udp    703  nlockmgr
    100227    3   tcp   2049  nfs_acl
    100021    1   tcp    705  nlockmgr

[Gluster-186]# netstat -nape |egrep ':38465|:38466|:2049|:38468|:703|:705'
tcp        0      0 0.0.0.0:705                 0.0.0.0:*                   LISTEN      0          2240964    31474/glusterfs
tcp        0      0 0.0.0.0:2049                0.0.0.0:*                   LISTEN      0          2240881    31474/glusterfs
tcp        0      0 0.0.0.0:38465               0.0.0.0:*                   LISTEN      0          2240870    31474/glusterfs
tcp        0      0 0.0.0.0:38466               0.0.0.0:*                   LISTEN      0          2240873    31474/glusterfs
tcp        0      0 0.0.0.0:38468               0.0.0.0:*                   LISTEN      0          2240886    31474/glusterfs
udp        0      0 0.0.0.0:703                 0.0.0.0:*                               0          2240959    31474/glusterfs

[Linux-105]# date; mount|grep nfs
Mon Jan 25 16:53:14 PST 2016
10.10.60.186:/locktest on /mnt/locktest type nfs (rw,vers=3,tcp,addr=10.10.60.186)

[Linux-121]# date; mount|grep nfs
Mon Jan 25 16:54:01 PST 2016
10.10.60.186:/locktest on /mnt/locktest type nfs (rw,vers=3,tcp,addr=10.10.60.186)
Comment 1 Jeff Byers 2016-02-03 09:46:39 EST
Confirmed that, as expected, this problem also exists on the latest GlusterFS version, 3.6.8.
