Bug 1854165

Summary: gluster does not release posix lock when multiple glusterfs clients do flock -xo on the same file in parallel
Product:           [Red Hat Storage] Red Hat Gluster Storage
Reporter:          Cal Calhoun <ccalhoun>
Component:         locks
Assignee:          Xavi Hernandez <jahernan>
Status:            CLOSED ERRATA
QA Contact:        milind <mwaykole>
Severity:          high
Docs Contact:
Priority:          medium
Version:           rhgs-3.5
CC:                bkunal, jahernan, mwaykole, nchilaka, nravinas, pprakash, puebele, rhs-bugs, rkothiya, sheggodu, smulay
Target Milestone:  ---
Keywords:          ZStream
Target Release:    RHGS 3.5.z Batch Update 3
Hardware:          x86_64
OS:                Linux
Whiteboard:
Fixed In Version:  glusterfs-6.0-40
Doc Type:          No Doc Update
Doc Text:
Story Points:      ---
Clone Of:
Environment:
Last Closed:       2020-12-17 04:51:53 UTC
Type:              Bug
Regression:        ---
Mount Type:        ---
Documentation:     ---
CRM:
Verified Versions:
Category:          ---
oVirt Team:        ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:   ---
Target Upstream Version:
Embargoed:

Description Cal Calhoun 2020-07-06 15:47:17 UTC
Description of problem:

  Customer seems to be experiencing the bug described in these bugzilla reports in their RHGS 3.5 cluster:

	https://github.com/gluster/glusterfs/issues/982  (BZ 1718562)
	https://github.com/gluster/glusterfs/issues/1046 (BZ 1776152)

   ...and has used the procedure in "[bug:1718562] flock failure (regression) #982" to duplicate the issues reported there.

Version-Release number of selected component (if applicable):

  Clients:

    glusterfs-6.0-22.el6.x86_64
    glusterfs-api-6.0-22.el6.x86_64
    glusterfs-client-xlators-6.0-22.el6.x86_64
    glusterfs-fuse-6.0-22.el6.x86_64
    glusterfs-libs-6.0-22.el6.x86_64

  Servers:

    glusterfs-6.0-30.1.el7rhgs.x86_64
    glusterfs-api-6.0-30.1.el7rhgs.x86_64
    glusterfs-client-xlators-6.0-30.1.el7rhgs.x86_64
    glusterfs-cli-6.0-30.1.el7rhgs.x86_64
    glusterfs-events-6.0-30.1.el7rhgs.x86_64
    glusterfs-fuse-6.0-30.1.el7rhgs.x86_64  
    glusterfs-geo-replication-6.0-30.1.el7rhgs.x86_64
    glusterfs-libs-6.0-30.1.el7rhgs.x86_64
    glusterfs-rdma-6.0-30.1.el7rhgs.x86_64
    glusterfs-server-6.0-30.1.el7rhgs.x86_64
    gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64
    gluster-nagios-common-0.2.4-1.el7rhgs.noarch

How reproducible:

  On demand

Steps to Reproduce:

  Per BZ 1718562
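
  For convenience, a minimal sketch of that kind of workload (two clients repeatedly taking an exclusive flock on the same file) is below; the mount point and file name are assumptions, not the exact procedure from BZ 1718562. Run the loop on both clients at the same time; when the bug triggers, one client's loop stops making progress because the lock taken by the other client is never released on the server.

    #!/bin/bash
    # Assumes the volume is FUSE-mounted at /mnt/glustervol on both clients;
    # run this same loop on both clients simultaneously.
    LOCKFILE=/mnt/glustervol/lock.test    # hypothetical file on the shared mount

    for i in $(seq 1 300); do
        # -x: exclusive lock, -o: close the locked fd before running the command
        flock -xo "$LOCKFILE" -c true
        echo "iteration $i done"
    done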

Additional info:

  Client and server sosreports on support-shell (/cases/02682288)

  Customer has also supplied: 

    wb_straces.zip - straces of php-cgi processes from an affected application
    statedump.home.zip - statedumps taken while the wb_straces were running
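
  For whoever picks this up: a typical way to capture equivalent lock-state data on a reproducer setup is sketched below; the volume name, client PID and default dump path are placeholders, not taken from the case data.

    # Brick-side statedump; dump files land under /var/run/gluster/ by default
    gluster volume statedump <VOLNAME>

    # Client-side statedump for a FUSE mount: signal the glusterfs client process
    kill -USR1 <pid-of-glusterfs-client>

    # Granted/blocked posix locks appear under the features/locks xlator sections
    grep -A2 posixlk /var/run/gluster/*.dump.*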

  Customer has asked for a hotfix when one is available. I have set expectations that this is unlikely; however, it would be good to get this fixed in an upcoming RHGS release, as it is causing ongoing problems for them.

  Customer is very motivated and would cooperate in any way.

  Let me know how else I can assist.

Comment 1 Csaba Henk 2020-07-06 18:24:36 UTC
Passing the bug on to Susant Palai, who is handling the upstream bug.

Comment 25 milind 2020-09-22 14:47:57 UTC
Steps:
 1. Create all volume types.
 2. Mount the volume on two different client nodes.
 3. Prepare the same script to run flock from both clients:

#!/bin/bash
# Run this same script on both clients against the same file on the mount.

flock_func(){
    file=/bricks/brick0/test.log
    touch "$file"
    (
        # Take an exclusive lock on fd 200 (-xo, as in the original report),
        # write to the file, hold the lock briefly, then release it.
        flock -xo 200
        echo "client1 do something" > "$file"
        sleep 1
    ) 200>"$file"
}

i=1
while true
do
    flock_func
    ((i=i+1))
    echo $i
    if [[ $i == 200 ]]; then
        break
    fi
done

 4. Waited for 300 iterations to complete (one way to drive both clients in parallel is sketched below).
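
One way to launch the script on both clients at the same time and make a hang visible is sketched below; the client hostnames, the script path and the 900-second timeout are placeholders, not the exact values used for this verification.

    #!/bin/bash
    # Start the flock script on both clients concurrently; if a lock is not
    # released, the stuck side hits the timeout instead of exiting cleanly.
    pids=()
    for host in client1.example.com client2.example.com; do
        ssh "$host" 'timeout 900 bash /root/flock_test.sh' &
        pids+=($!)
    done

    rc=0
    for pid in "${pids[@]}"; do
        wait "$pid" || rc=1
    done

    if [ "$rc" -eq 0 ]; then
        echo "both clients completed all iterations"
    else
        echo "at least one client hung or failed"
    fi
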
------------------
Additional info
[node.example.com]# rpm -qa | grep -i glusterfs
glusterfs-6.0-45.el8rhgs.x86_64
glusterfs-fuse-6.0-45.el8rhgs.x86_64
glusterfs-api-6.0-45.el8rhgs.x86_64
glusterfs-selinux-1.0-1.el8rhgs.noarch
glusterfs-client-xlators-6.0-45.el8rhgs.x86_64
glusterfs-server-6.0-45.el8rhgs.x86_64
glusterfs-cli-6.0-45.el8rhgs.x86_64
glusterfs-libs-6.0-45.el8rhgs.x86_64


As I don't see any issue while running the script through 300 iterations, I am marking this bug as verified.

Comment 26 nravinas 2020-09-23 11:41:49 UTC
*** Bug 1880271 has been marked as a duplicate of this bug. ***

Comment 35 Xavi Hernandez 2020-10-22 13:40:03 UTC
*** Bug 1852740 has been marked as a duplicate of this bug. ***

Comment 36 Xavi Hernandez 2020-10-29 11:10:26 UTC
*** Bug 1851315 has been marked as a duplicate of this bug. ***

Comment 38 errata-xmlrpc 2020-12-17 04:51:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (glusterfs bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:5603