1345911 – locks on file in dist-disperse not released leading to IO hangs

Keywords:
Status:	CLOSED WORKSFORME
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	disperse
Sub Component:
Version:	rhgs-3.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Sunil Kumar Acharya
QA Contact:	Matt Zywusko
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1330997

Reported:	2016-06-13 12:29 UTC by Nag Pavan Chilakam
Modified:	2019-04-03 09:28 UTC
CC List:	5 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-09-20 09:01:39 UTC
Embargoed:
Dependent Products:

Description Nag Pavan Chilakam 2016-06-13 12:29:43 UTC

When we try to access data on an EC volume through a Ganesha mount using ls -lRt,
the response can hang because file locks are taken on 4 of the 6 bricks and are never released.

In this case I am only running ls -lRt on the client, with no other IO in progress (single mount point).

Details of the IO that was previously run on the volume can be found in bugs such as:
1) 1330997 - [Disperse volume]: IO hang seen on mount with file ops
2) 1344675 - Stale file handle seen on the mount of dist-disperse volume when doing IOs with nfs-ganesha protocol
3) 1342426 - self heal deamon killed due to oom kills on a dist-disperse volume using nfs ganesha

[root@dhcp35-98 log]# rpm -qa|grep gluster
glusterfs-client-xlators-3.7.9-9.el7rhgs.x86_64
nfs-ganesha-gluster-2.3.1-8.el7rhgs.x86_64
glusterfs-server-3.7.9-9.el7rhgs.x86_64
glusterfs-api-3.7.9-9.el7rhgs.x86_64
glusterfs-debuginfo-3.7.9-9.el7rhgs.x86_64
glusterfs-libs-3.7.9-9.el7rhgs.x86_64
glusterfs-fuse-3.7.9-9.el7rhgs.x86_64
glusterfs-cli-3.7.9-9.el7rhgs.x86_64
glusterfs-3.7.9-9.el7rhgs.x86_64
glusterfs-ganesha-3.7.9-9.el7rhgs.x86_64
python-gluster-3.7.9-9.el7rhgs.noarch
[root@dhcp35-98 log]# rpm -qa|grep ganesha
nfs-ganesha-gluster-2.3.1-8.el7rhgs.x86_64
nfs-ganesha-2.3.1-8.el7rhgs.x86_64
nfs-ganesha-debuginfo-2.3.1-8.el7rhgs.x86_64
glusterfs-ganesha-3.7.9-9.el7rhgs.x86_64
[root@dhcp35-98 log]# rpm -qa|grep pcs
pcs-0.9.143-15.el7.x86_64

Comment 3 Sunil Kumar Acharya 2017-09-20 09:01:39 UTC

We tried to recreate the issue with the following packages on the setup.

glusterfs-server-3.8.4-44.el7rhgs.x86_64
nfs-ganesha-2.4.4-17.el7rhgs.x86_64

We didn't see any hang, with or without IO on the mount point (mounted using
different VIPs).

Steps followed:

1. Created an EC volume (4+2).
2. Mounted the volume using NFS-Ganesha on 3 different clients.
3. Started a kernel archive untar (first client) and created 200 directories with 50 files
   each (second client) on the mount point, while ls -lRt was run from the third client.
4. Ran ls -lRt from all three mount points.
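
The client-side part of the steps above could be scripted roughly as follows. This is only a sketch, not the script used during verification: the function names create_tree and list_tree, the dirN/fileN naming, the use of empty files, and the /mnt/ecvol mount point are all assumptions, since the comment gives none of those details.

```shell
#!/bin/sh
# Sketch of the second client's workload in step 3: create NDIRS directories
# with NFILES empty files each under ROOT. Names and file contents are
# assumptions; the report does not say how the files were created.
create_tree() {
    root=$1
    ndirs=$2
    nfiles=$3
    d=1
    while [ "$d" -le "$ndirs" ]; do
        mkdir -p "$root/dir$d" || return 1
        f=1
        while [ "$f" -le "$nfiles" ]; do
            : > "$root/dir$d/file$f" || return 1
            f=$((f + 1))
        done
        d=$((d + 1))
    done
}

# Recursive long listing sorted by modification time, as run in steps 3 and 4.
list_tree() {
    ls -lRt "$1"
}

# Example invocation (hypothetical mount point):
#   create_tree /mnt/ecvol 200 50
#   list_tree /mnt/ecvol > /dev/null
```

Running list_tree from a separate client while create_tree is still in progress approximates step 3; running it afterwards from every client approximates step 4.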

Closing this issue as discussed during issue verification.
