Bug 1362420 - [Disperse] dd + rm + ls lead to IO hang
Summary: [Disperse] dd + rm + ls lead to IO hang
Keywords:
Status: CLOSED DUPLICATE of bug 1361519
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: disperse
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Ravishankar N
QA Contact: storage-qa-internal@redhat.com
URL:
Whiteboard:
Depends On: 1346719 1371397 1373392
Blocks: 1361519
 
Reported: 2016-08-02 07:31 UTC by Ravishankar N
Modified: 2016-09-06 07:34 UTC (History)
7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1346719
Environment:
Last Closed: 2016-08-02 07:35:24 UTC
Embargoed:



Description Ravishankar N 2016-08-02 07:31:41 UTC
+++ This bug was initially created as a clone of Bug #1346719 +++

Description of problem:

Creation of files and ls hang while rm -rf is run in an infinite loop.

Version-Release number of selected component (if applicable):
[root@apandey gluster]# glusterfs --version
glusterfs 3.9dev built on Jun 15 2016 11:39:11
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2013 Red Hat, Inc. <http://www.redhat.com/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.


How reproducible:
1/1

Steps to Reproduce:
1. Create a disperse volume.
2. Mount this volume on 3 mount points: m1, m2, m3.
3. Create 10000 files on m1 using a for loop and dd. After some time, start rm -rf on m2 in an infinite loop and start ls -lRT on m3 (see the sketch below).
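
A minimal shell sketch of these steps, pieced together from the volume info and mount output below (the dd size, the exact loop forms, and the mapping of m1/m2/m3 to the three mount points are assumptions):

# 1. Create and start a 4+2 disperse volume; 'force' because all bricks sit on one host
gluster volume create vol disperse 6 redundancy 2 apandey:/brick/gluster/vol-{1..6} force
gluster volume start vol

# 2. Mount the volume on three mount points (m1, m2, m3)
mount -t glusterfs apandey:/vol /mnt/glu
mount -t glusterfs apandey:/vol /mnt/gfs
mount -t glusterfs apandey:/vol /mnt/vol

# 3. Writer on m1; after a while, an infinite rm -rf on m2 and a recursive ls on m3
for i in $(seq 1 10000); do dd if=/dev/zero of=/mnt/glu/file.$i bs=1M count=1; done &
while true; do rm -rf /mnt/gfs/*; done &
ls -lR /mnt/vol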

Actual results:
IO hang has been seen on m1 and m3.

Expected results:
There should not be any hang.

Additional info:

Volume Name: vol
Type: Disperse
Volume ID: c81743b4-ab0e-4d9b-931b-4d67f4d24a75
Status: Started
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: apandey:/brick/gluster/vol-1
Brick2: apandey:/brick/gluster/vol-2
Brick3: apandey:/brick/gluster/vol-3
Brick4: apandey:/brick/gluster/vol-4
Brick5: apandey:/brick/gluster/vol-5
Brick6: apandey:/brick/gluster/vol-6
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: off
Status of volume: vol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick apandey:/brick/gluster/vol-1          49152     0          Y       13179
Brick apandey:/brick/gluster/vol-2          49153     0          Y       13198
Brick apandey:/brick/gluster/vol-3          49154     0          Y       13217
Brick apandey:/brick/gluster/vol-4          49155     0          Y       13236
Brick apandey:/brick/gluster/vol-5          49156     0          Y       13255
Brick apandey:/brick/gluster/vol-6          49157     0          Y       13274
NFS Server on localhost                     N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       13302
 
Task Status of Volume vol
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@apandey gluster]#  mount

fusectl on /sys/fs/fuse/connections type fusectl (rw,relatime)
apandey:vol on /mnt/glu type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
apandey:vol on /mnt/gfs type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
apandey:vol on /mnt/vol type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
[root@apandey gluster]#

--- Additional comment from Ashish Pandey on 2016-06-15 05:01:07 EDT ---

The statedump shows some blocked inodelks:

[conn.1.bound_xl./brick/gluster/vol-1.active.1]
gfid=00000000-0000-0000-0000-000000000001
nlookup=3
fd-count=3
ref=1
ia_type=2

[xlator.features.locks.vol-locks.inode]
path=/
mandatory=0
inodelk-count=3
lock-dump.domain.domain=dht.layout.heal
lock-dump.domain.domain=vol-disperse-0:self-heal
lock-dump.domain.domain=vol-disperse-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 3327, owner=dc710738fd7e0000, client=0x7f283c1a7b00, connection-id=apandey-15766-2016/06/15-07:59:38:894408-vol-client-0-0-0, blocked at 2016-06-15 08:02:13, granted at 2016-06-15 08:02:13
inodelk.inodelk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 22451, owner=cc338ae8f07f0000, client=0x7f2834006660, connection-id=apandey-13531-2016/06/15-07:58:50:360055-vol-client-0-0-0, blocked at 2016-06-15 08:02:13
inodelk.inodelk[2](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 22530, owner=6cd51d48da7f0000, client=0x7f28342db820, connection-id=apandey-19856-2016/06/15-08:01:05:258794-vol-client-0-0-0, blocked at 2016-06-15 08:02:22
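
For reference, a brick statedump like the one quoted above can be taken on demand; a minimal sketch, assuming the default dump directory /var/run/gluster:

# Ask the brick processes of the volume to dump their state
gluster volume statedump vol

# Look for blocked inode locks in the resulting dump files
grep -B2 -A2 BLOCKED /var/run/gluster/*.dump.*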

--- Additional comment from Ashish Pandey on 2016-06-15 05:08:42 EDT ---


Just observed that the disperse.eager-lock option comes to the rescue:
setting disperse.eager-lock to off got the IOs and the ls -lR command going again.

gluster v set vol disperse.eager-lock off
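
To check the option and to revert the workaround once a proper fix lands, something along these lines should work (volume name taken from this report):

# Show the effective value of the option on the volume
gluster volume get vol disperse.eager-lock

# Restore the default once the underlying bug is fixed
gluster volume reset vol disperse.eager-lock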

Comment 2 Ravishankar N 2016-08-02 07:35:24 UTC

*** This bug has been marked as a duplicate of bug 1361519 ***

