Bug 1345911

Summary:	locks on file in dist-disperse not released leading to IO hangs
Product:	[Red Hat Storage] Red Hat Gluster Storage	Reporter:	Nag Pavan Chilakam <nchilaka>
Component:	disperse	Assignee:	Sunil Kumar Acharya <sheggodu>
Status:	CLOSED WORKSFORME	QA Contact:	Matt Zywusko <mzywusko>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	rhgs-3.1	CC:	aspandey, hswong3i, pkarampu, rhs-bugs, ubansal
Target Milestone:	---	Keywords:	ZStream
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2017-09-20 09:01:39 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1330997

Description Nag Pavan Chilakam 2016-06-13 12:29:43 UTC

When we try to access data on an EC volume through an NFS-Ganesha mount using ls -lRt,
the command can hang because file locks are taken on 4 of the 6 bricks and are never released.

In this case I am only running ls -lRt on the client, with no other IO in progress (single mount point).

Details of the IO that was previously run on the volume can be found in these bugs:
1) 1330997 - [Disperse volume]: IO hang seen on mount with file ops
2) 1344675 - Stale file handle seen on the mount of dist-disperse volume when doing IOs with nfs-ganesha protocol
3) 1342426 - self heal deamon killed due to oom kills on a dist-disperse volume using nfs ganesha




[root@dhcp35-98 log]# rpm -qa|grep gluster
glusterfs-client-xlators-3.7.9-9.el7rhgs.x86_64
nfs-ganesha-gluster-2.3.1-8.el7rhgs.x86_64
glusterfs-server-3.7.9-9.el7rhgs.x86_64
glusterfs-api-3.7.9-9.el7rhgs.x86_64
glusterfs-debuginfo-3.7.9-9.el7rhgs.x86_64
glusterfs-libs-3.7.9-9.el7rhgs.x86_64
glusterfs-fuse-3.7.9-9.el7rhgs.x86_64
glusterfs-cli-3.7.9-9.el7rhgs.x86_64
glusterfs-3.7.9-9.el7rhgs.x86_64
glusterfs-ganesha-3.7.9-9.el7rhgs.x86_64
python-gluster-3.7.9-9.el7rhgs.noarch
[root@dhcp35-98 log]# rpm -qa|grep ganesha
nfs-ganesha-gluster-2.3.1-8.el7rhgs.x86_64
nfs-ganesha-2.3.1-8.el7rhgs.x86_64
nfs-ganesha-debuginfo-2.3.1-8.el7rhgs.x86_64
glusterfs-ganesha-3.7.9-9.el7rhgs.x86_64
[root@dhcp35-98 log]# rpm -qa|grep pcs
pcs-0.9.143-15.el7.x86_64

Comment 3 Sunil Kumar Acharya 2017-09-20 09:01:39 UTC

We tried to recreate the issue with the following packages on the setup.

glusterfs-server-3.8.4-44.el7rhgs.x86_64
nfs-ganesha-2.4.4-17.el7rhgs.x86_64

We did not see any hang, with or without IO on the mount point (mounted using
different VIPs).

Steps followed:

1. Created an EC volume (4+2).
2. Mounted the volume using NFS-Ganesha on 3 different clients.
3. Initiated a kernel archive untar (first client) and created 200 directories with 50 files
   each on the mount point (second client). ls -lRt was run from the third client.
4. Tried ls -lRt from all three mount points.
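
The directory workload and the listing check from steps 3 and 4 can be approximated with a script along the following lines. This is only a sketch, not the script used during verification. MOUNT_POINT, DIRS, FILES and LIMIT are hypothetical knobs, the dir_N/file_N naming is made up, and the 300 second limit is an arbitrary choice. MOUNT_POINT falls back to a temporary local directory so the script can be tried without a cluster; to exercise the volume it would have to point at the NFS-Ganesha mount.

```shell
#!/usr/bin/env bash
# Sketch only: populate a directory tree and check that a recursive listing finishes.
# Assumption: MOUNT_POINT should be set to the NFS-Ganesha mount of the EC volume;
# it defaults to a temporary local directory for a dry run.
MOUNT_POINT="${MOUNT_POINT:-$(mktemp -d)}"
DIRS="${DIRS:-200}"     # number of directories, as in step 3
FILES="${FILES:-50}"    # files per directory, as in step 3
LIMIT="${LIMIT:-300}"   # seconds to wait before calling the listing hung (arbitrary)

# Create $2 directories under $1, each holding $3 empty files.
populate() {
    local root=$1 ndirs=$2 nfiles=$3 i
    for i in $(seq 1 "$ndirs"); do
        mkdir -p "$root/dir_$i" || return 1
        (cd "$root/dir_$i" && seq 1 "$nfiles" | sed 's/^/file_/' | xargs touch) || return 1
    done
}

# Run ls -lRt on $1 and report whether it completed within $2 seconds.
# timeout(1) exits with status 124 when the time limit is reached.
check_listing() {
    local root=$1 limit=$2
    timeout "$limit" ls -lRt "$root" > /dev/null
    local rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "HANG: ls -lRt did not finish within ${limit}s"
    elif [ "$rc" -ne 0 ]; then
        echo "ERROR: ls -lRt exited with status $rc"
    else
        echo "OK: ls -lRt completed"
    fi
    return "$rc"
}

populate "$MOUNT_POINT" "$DIRS" "$FILES"
check_listing "$MOUNT_POINT" "$LIMIT"
```

To mirror the original report (listing only, no other IO), check_listing can be called on its own against an already populated mount.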

Closing this issue as discussed during issue verification.