+++ This bug was initially created as a clone of Bug #1301805 +++

Contending exclusive NFS file locks from two hosts breaks locking when the blocked host gives up early.

Using the Linux 'flock' utility to test GlusterFS NFS file locking by contending for an exclusive lock on the same file from two different hosts leaves file locking broken for that file when the client waiting for the lock gives up on a timeout or has its process killed. No further lock attempt on that file will succeed from any host without a "gluster volume stop/start", a glusterd restart, or deletion of the lock file.

This problem is known to occur on glusterfs 3.6.5, and is also said to occur on glusterfs 3.4.2. The same test plan works correctly against the native Linux kernel NFS server: the contended file lock does not become broken. No work-around has been found to evade this problem.

1) Locking from each host one at a time works fine:

[Linux-105 ~]# date; flock -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10; date "+unlocked @ %T"'; date "+sts=$? @ %T"
Mon Jan 25 16:59:17 PST 2016
locked @ 16:59:18
unlocked @ 16:59:28
sts=0 @ 16:59:28

[Linux-121]# date; flock -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10; date "+unlocked @ %T"'; date "+sts=$? @ %T"
Mon Jan 25 16:59:55 PST 2016
locked @ 16:59:56
unlocked @ 17:00:06
sts=0 @ 17:00:06

2) Locking from both hosts at the same time works fine, as long as the hosts are willing to wait forever and the second is not interrupted:

[Linux-105]# date; flock -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10; date "+unlocked @ %T"'; date "+sts=$? @ %T"
Mon Jan 25 17:03:52 PST 2016
locked @ 17:03:52
unlocked @ 17:04:02
sts=0 @ 17:04:02

[Linux-121]# date; flock -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10; date "+unlocked @ %T"'; date "+sts=$? @ %T"
Mon Jan 25 17:03:54 PST 2016
locked @ 17:04:02
unlocked @ 17:04:12
sts=0 @ 17:04:12

3) Locking from both hosts at the same time, where the second one is not willing to wait and times out, works, but leaves the lock file in a state where it cannot be locked by anyone ever again without a gluster volume stop/start, a glusterd restart, or deleting the lock file:

[Linux-105]# date; flock -w 3 -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10; date "+unlocked @ %T"'; date "+sts=$? @ %T"
Mon Jan 25 17:07:59 PST 2016
locked @ 17:07:59
unlocked @ 17:08:09
sts=0 @ 17:08:09

[Linux-121]# date; flock -w 3 -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10; date "+unlocked @ %T"'; date "+sts=$? @ %T"
Mon Jan 25 17:08:01 PST 2016
sts=1 @ 17:08:04

[Linux-105]# date; flock -w 3 -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10; date "+unlocked @ %T"'; date "+sts=$? @ %T"
Mon Jan 25 17:08:17 PST 2016
sts=1 @ 17:08:20

[Linux-121]# date; flock -w 3 -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10; date "+unlocked @ %T"'; date "+sts=$? @ %T"
Mon Jan 25 17:08:27 PST 2016
sts=1 @ 17:08:30
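For repeatability, the two-host contention in steps 1-3 can be driven by a small wrapper around the same flock invocations. This is a minimal illustrative sketch, not part of the original test plan: the script name, the HOLD/WAIT values, and the assumption that the volume is NFS-mounted at /mnt/locktest on both hosts are all mine.

    #!/bin/sh
    # locktest.sh -- hold or contend an exclusive flock on an NFS-mounted file.
    # Usage: locktest.sh hold     # take the lock, hold it for HOLD seconds
    #        locktest.sh contend  # try to take the lock, give up after WAIT seconds
    LOCKFILE=/mnt/locktest/locktest-1   # assumed mount point from this report
    HOLD=10                             # seconds to hold the lock once acquired
    WAIT=3                              # seconds the contender waits before giving up

    case "$1" in
      hold)
        # Same as the first host's command: block until the lock is granted.
        flock -x "$LOCKFILE" -c "date '+locked @ %T'; sleep $HOLD; date '+unlocked @ %T'"
        ;;
      contend)
        # Same as the second host's command: time out after WAIT seconds.
        # Exit status 1 means the lock was not acquired in time.
        flock -w "$WAIT" -x "$LOCKFILE" -c "date '+locked @ %T'; sleep $HOLD; date '+unlocked @ %T'"
        ;;
      *)
        echo "usage: $0 hold|contend" >&2; exit 2 ;;
    esac
    echo "sts=$? @ $(date '+%T')"

Run 'locktest.sh hold' on the first host and, within the hold window, 'locktest.sh contend' on the second; on the affected versions the contender's timeout (sts=1) leaves the file permanently unlockable.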
@ %T" Mon Jan 25 17:08:27 PST 2016 sts=1 @ 17:08:30 On the GlusterFS server, there are then two file-handles stuck open on the lockfile by the brick process, and one lock, these normally disappear when the lock is released: [Gluster-186]# date; lsof /exports/nas-segment-0001/locktest/locktest-1 Mon Jan 25 17:10:58 PST 2016 COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME glusterfs 31421 root 14w REG 8,48 0 120 /exports/nas-segment-0001/locktest/locktest-1 glusterfs 31421 root 15w REG 8,48 0 120 /exports/nas-segment-0001/locktest/locktest-1 [Gluster-186]# fgrep 31421 /proc/locks 2: POSIX ADVISORY WRITE 31421 08:04:2329801 0 EOF [Gluster-186]# ps -elf |grep 31421 /usr/sbin/glusterfsd -s 10.10.60.186 --volfile-id locktest.10.10.60.186.exports-nas-segment-0001-locktest [Linux-105]# rm /mnt/locktest/locktest-1 4) Repeating the same test from two hosts but without using the "-w 3" timeout option, and instead killing the second host command with ^C fails the same way. 5) Repeating the same test with two shells on the same host does *not* exhibit the problem: [Linux-105]# date; flock -w 3 -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10; date "+unlocked @ %T"'; date "+sts=$? @ %T" Mon Jan 25 17:14:21 PST 2016 locked @ 17:14:21 unlocked @ 17:14:31 sts=0 @ 17:14:31 [Linux-105]# date; flock -w 3 -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10; date "+unlocked @ %T"'; date "+sts=$? @ %T" Mon Jan 25 17:14:23 PST 2016 sts=1 @ 17:14:26 [Linux-105]# date; flock -w 3 -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10; date "+unlocked @ %T"'; date "+sts=$? @ %T" Mon Jan 25 17:14:39 PST 2016 locked @ 17:14:39 unlocked @ 17:14:49 sts=0 @ 17:14:49 [Linux-105]# date; flock -w 3 -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10; date "+unlocked @ %T"'; date "+sts=$? @ %T" Mon Jan 25 17:14:53 PST 2016 locked @ 17:14:53 unlocked @ 17:15:03 sts=0 @ 17:15:03 [Linux-105]# date; flock -w 3 -x /mnt/locktest/locktest-1 -c 'date "+locked @ %T"; sleep 10; date "+unlocked @ %T"'; date "+sts=$? 
@ %T" Mon Jan 25 17:14:39 PST 2016 locked @ 17:14:39 unlocked @ 17:14:49 sts=0 @ 17:14:49 5) Additional information: [Gluster-186]# glusterd -V glusterfs 3.6.5 built on Sep 2 2015 12:35:56 [Gluster-186]# gluster volume info locktest Volume Name: locktest Type: Distribute Volume ID: f56cb000-47b8-49db-b885-a8ab50333dd2 Status: Started Number of Bricks: 1 Transport-type: tcp Bricks: Brick1: 10.10.60.186:/exports/nas-segment-0001/locktest Options Reconfigured: nfs.rpc-auth-allow: * server.allow-insecure: on performance.quick-read: off performance.stat-prefetch: off nfs.disable: off nfs.addr-namelookup: off [Gluster-186]# lsmod|egrep 'nfs|lock' [Gluster-186]# rpcinfo -p program vers proto port service 100000 4 tcp 111 portmapper 100000 3 tcp 111 portmapper 100000 2 tcp 111 portmapper 100000 4 udp 111 portmapper 100000 3 udp 111 portmapper 100000 2 udp 111 portmapper 100005 3 tcp 38465 mountd 100005 1 tcp 38466 mountd 100003 3 tcp 2049 nfs 100021 4 tcp 38468 nlockmgr 100021 1 udp 703 nlockmgr 100227 3 tcp 2049 nfs_acl 100021 1 tcp 705 nlockmgr [Gluster-186]# netstat -nape |egrep ':38465|:38466|:2049|:38468|:703|:705' tcp 0 0 0.0.0.0:705 0.0.0.0:* LISTEN 0 2240964 31474/glusterfs tcp 0 0 0.0.0.0:2049 0.0.0.0:* LISTEN 0 2240881 31474/glusterfs tcp 0 0 0.0.0.0:38465 0.0.0.0:* LISTEN 0 2240870 31474/glusterfs tcp 0 0 0.0.0.0:38466 0.0.0.0:* LISTEN 0 2240873 31474/glusterfs tcp 0 0 0.0.0.0:38468 0.0.0.0:* LISTEN 0 2240886 31474/glusterfs udp 0 0 0.0.0.0:703 0.0.0.0:* 0 2240959 31474/glusterfs [Linux-105]# date; mount|grep nfs Mon Jan 25 16:53:14 PST 2016 10.10.60.186:/locktest on /mnt/locktest type nfs (rw,vers=3,tcp,addr=10.10.60.186) [Linux-121]# date; mount|grep nfs Mon Jan 25 16:54:01 PST 2016 10.10.60.186:/locktest on /mnt/locktest type nfs (rw,vers=3,tcp,addr=10.10.60.186) --- Additional comment from Jeff Byers on 2016-02-03 09:46:39 EST --- Confirmed, that as expected, this problem also exists on the latest GlusterFS version 3.6.8.
All 3.8.x bugs are now reported against version 3.8 (without .x). For more information, see http://www.gluster.org/pipermail/gluster-devel/2016-September/050859.html
This bug is being closed because version 3.8 is marked End-Of-Life. There will be no further updates to this version. If you are still facing this issue in a more current release, please open a new bug against a version that still receives bugfixes.