1779089 – glusterfsd does not release POSIX lock when multiple glusterfs clients run flock -xo on the same file in parallel

Keywords:
Status:	CLOSED NEXTRELEASE
Alias:	None
Product:	GlusterFS
Classification:	Community
Component:	locks
Sub Component:
Version:	mainline
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Assignee:	bugs@gluster.org
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:	1776152
Blocks:	1851315

Reported:	2019-12-03 09:35 UTC by Susant Kumar Palai
Modified:	2020-06-26 06:39 UTC
CC List:	4 users
Fixed In Version:
Clone Of:	1776152
Clones:	1851315
Environment:
Last Closed:	2020-03-12 13:24:37 UTC
Regression:	---
Mount Type:	---
Documentation:	---
CRM:
Verified Versions:
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Gluster.org Gerrit	23794	0	None	Merged	add clean local after grant lock	2020-04-01 06:34:59 UTC

Description Susant Kumar Palai 2019-12-03 09:35:02 UTC

+++ This bug was initially created as a clone of Bug #1776152 +++

Description of problem:
glusterfsd does not release a POSIX lock when multiple glusterfs clients run flock -xo on the same file in parallel.

Version-Release number of selected component (if applicable):
glusterfs 7.0

How reproducible:


Steps to Reproduce:
1. Create a volume with one brick:
   gluster volume create test3  192.168.0.14:/mnt/vol3-test force
2. Mount the volume on two different nodes:
  node name: node2
       mkdir /mnt/test-vol3
       mount -t glusterfs 192.168.0.14:/test3 /mnt/test-vol3
  node name: test
       mkdir /mnt/test-vol3
       mount -t glusterfs 192.168.0.14:/test3 /mnt/test-vol3

3. Prepare the same script to do flock on both nodes:
  [root@node2 ~]# vi flock.sh 

#!/bin/bash
file=/mnt/test-vol3/test.log
touch $file
(

         flock -xo 200
         echo "client1 do something" > $file
         sleep 1

 ) 200>$file
[root@node2 ~]# vi repeat_flock.sh 

#!/bin/bash
i=1
while [ "1" = "1" ]
do
    ./flock.sh
    ((i=i+1))
    echo $i
done
A similar script on the "test" node:
[root@test ~]# vi flock.sh 

#!/bin/bash
file=/mnt/test-vol3/test.log
touch $file
(
         flock -xo 200
         echo "client2 do something" > $file
         sleep 1

 ) 200>$file

[root@test ~]# vi repeat_flock.sh 

#!/bin/bash
i=1
while [ "1" = "1" ]
do
    ./flock.sh
    ((i=i+1))
    echo $i
done

4. Start repeat_flock.sh on both nodes.
  After a short time, both scripts get stuck:

   [root@test ~]# ./repeat_flock.sh
2
3
4
5
6
7
   [root@node2 ~]# ./repeat_flock.sh
2
The issue is reproduced.

5. Take a statedump of the volume test3:
  gluster v statedump test3
[xlator.features.locks.test3-locks.inode]
path=/test.log
mandatory=0
posixlk-count=3
posixlk.posixlk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 22752, owner=8c9cd93f8ee486a0, client=0x7f76e8082100, connection-id=CTX_ID:7da20ab3-cc70-41bd-ab83-955481288ba2-GRAPH_ID:0-PID:22649-HOST:node2-PC_NAME:test3-client-0-RECON_NO:-0, blocked at 2019-11-25 08:30:12, granted at 2019-11-25 08:30:12
posixlk.posixlk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 10928, owner=b42ee151db035df9, client=0x7f76e0006390, connection-id=CTX_ID:c4cf488c-2d8e-4f7c-87e9-a0cb1f2648cd-GRAPH_ID:0-PID:10850-HOST:test-PC_NAME:test3-client-0-RECON_NO:-0, blocked at 2019-11-25 08:30:12
posixlk.posixlk[2](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 22757, owner=f62dd9ff96cefaf5, client=0x7f76e8082100, connection-id=CTX_ID:7da20ab3-cc70-41bd-ab83-955481288ba2-GRAPH_ID:0-PID:22649-HOST:node2-PC_NAME:test3-client-0-RECON_NO:-0, blocked at 2019-11-25 08:30:13
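
The statedump excerpt above shows one ACTIVE and two BLOCKED posixlk entries on /test.log. A minimal sketch for tallying such entries from a saved statedump file follows. The function name count_posixlk is hypothetical, and the path of the dump file has to be supplied by the reader, since this report does not say where the brick wrote it.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original report).
# Counts posixlk entries in the given state (ACTIVE or BLOCKED)
# in a statedump file, matching lines shaped like:
#   posixlk.posixlk[1](BLOCKED)=type=WRITE, ...
# Usage: count_posixlk <statedump-file> <STATE>
count_posixlk() {
    local dump_file=$1 state=$2
    # grep -c prints the number of matching lines (0 if none).
    grep -c -e "^posixlk\.posixlk\[[0-9]*](${state})=" "$dump_file"
}
```

Run against a file containing the three posixlk lines above, count_posixlk <file> ACTIVE prints 1 and count_posixlk <file> BLOCKED prints 2. A BLOCKED count that never drops while the ACTIVE owner's process has already exited would match the stuck state described in this bug.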


Actual results:
Both repeat_flock.sh scripts on the two nodes get stuck, and the lock is held forever.

Expected results:
Neither repeat_flock.sh script on the two nodes should get stuck.
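
To tell a genuinely stuck lock apart from ordinary contention, the flock call in the reproducer can be given a timeout. The following is only a sketch under assumptions: the function name try_lock_once and the "locked" marker text are illustrative and not from the original report, and it relies on the -w (timeout) option of flock from util-linux instead of the -o flag used in the original script.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original report).
# Takes an exclusive flock on FILE via fd 200, giving up after
# TIMEOUT seconds instead of blocking forever.
# Returns 0 if the lock was granted, non-zero on timeout or error.
# Usage: try_lock_once <file> <timeout-seconds>
try_lock_once() {
    local file=$1 timeout=$2
    (
        # -x: exclusive lock; -w: fail if not granted within the timeout
        flock -x -w "$timeout" 200 || exit 1
        echo "locked" > "$file"
    ) 200>"$file"
}
```

For example, calling try_lock_once /mnt/test-vol3/test.log 30 in the loop of repeat_flock.sh and stopping on a non-zero return would end the run at the first lock that is not granted within 30 seconds, which is the point at which a statedump is useful.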

Additional info:

--- Additional comment from zhou lin on 2019-11-28 11:41:39 MVT ---



--- Additional comment from zhou lin on 2019-11-29 08:08:45 MVT ---

I tried adding an un_ref in grant_blocked_locks just before the stack unwind, and it seems to work.

--- Additional comment from zhou lin on 2019-11-29 14:17:23 MVT ---

Please review the patch for this issue.

--- Additional comment from Worker Ant on 2019-12-03 10:52:33 MVT ---

REVIEW: https://review.gluster.org/23794 (add clean local after grant lock) posted (#1) for review on master by None

Comment 1 Worker Ant 2019-12-04 01:32:14 UTC

REVIEW: https://review.gluster.org/23794 (add clean local after grant lock) posted (#5) for review on master by None

Comment 2 Worker Ant 2020-03-12 13:24:37 UTC

This bug has been moved to https://github.com/gluster/glusterfs/issues/1011 and will be tracked there from now on. Visit the GitHub issue URL for further details.

Comment 3 Worker Ant 2020-04-01 06:35:00 UTC

REVIEW: https://review.gluster.org/23794 (add clean local after grant lock) merged (#8) on master by Xavi Hernandez
