Bug 1566579 - DHT Layout is missing on few bricks of a disperse sub-vol when rm -rf and mkdir are run in parallel
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: disperse
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Ashish Pandey
QA Contact: Nag Pavan Chilakam
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-04-12 14:51 UTC by Prasad Desala
Modified: 2019-11-18 11:12 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-18 11:12:31 UTC
Target Upstream Version:
tdesala: needinfo-


Attachments
getfattr output of foo directory from all bricks (16.00 KB, text/plain)
2018-04-12 16:53 UTC, Prasad Desala

Description Prasad Desala 2018-04-12 14:51:37 UTC
Description of problem:
=======================
The DHT layout is missing on a few bricks of a disperse sub-vol when rm -rf and mkdir are run in parallel from multiple clients.

getfattr -d -e hex -m glusterfs.dht /bricks/brick1/ec-b1/foo
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/ec-b1/foo
trusted.glusterfs.dht=0x00000001000000006ffffffc7ffffffa
trusted.glusterfs.dht.mds=0x00000000

getfattr -d -e hex -m glusterfs.dht /bricks/brick1/ec-b1/foo
getfattr: /bricks/brick1/ec-b1/foo: No such file or directory  ---> layout missing on this brick

getfattr -d -e hex -m glusterfs.dht /bricks/brick1/ec-b1/foo
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/ec-b1/foo
trusted.glusterfs.dht=0x00000001000000006ffffffc7ffffffa
trusted.glusterfs.dht.mds=0x00000000

getfattr -d -e hex -m glusterfs.dht /bricks/brick1/ec-b1/foo
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/ec-b1/foo
trusted.glusterfs.dht=0x00000001000000006ffffffc7ffffffa
trusted.glusterfs.dht.mds=0x00000000

getfattr -d -e hex -m glusterfs.dht /bricks/brick1/ec-b1/foo
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/ec-b1/foo
trusted.glusterfs.dht=0x00000001000000006ffffffc7ffffffa
trusted.glusterfs.dht.mds=0x00000000

getfattr -d -e hex -m glusterfs.dht /bricks/brick1/ec-b1/foo
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/ec-b1/foo
trusted.glusterfs.dht=0x00000001000000006ffffffc7ffffffa
trusted.glusterfs.dht.mds=0x00000000
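
For readers comparing the values above, here is a minimal sketch that splits a trusted.glusterfs.dht value into its four 32-bit words. It assumes the last two words are the start and end of the directory's hash range on that sub-vol; the meaning of the first two words is not stated in this report, so they are printed unlabeled. The function name decode_dht_layout is made up for this illustration.

```shell
#!/bin/sh
# decode_dht_layout VALUE: split a 16-byte hex xattr value into 4 words.
# Assumption: words 3 and 4 are the start and end of the hash range.
decode_dht_layout() {
    hex="${1#0x}"                       # drop the leading 0x, if any
    if [ "${#hex}" -ne 32 ]; then
        echo "expected 32 hex digits, got ${#hex}" >&2
        return 1
    fi
    w1=$(printf '%s' "$hex" | cut -c1-8)
    w2=$(printf '%s' "$hex" | cut -c9-16)
    start=$(printf '%s' "$hex" | cut -c17-24)
    stop=$(printf '%s' "$hex" | cut -c25-32)
    echo "word1=0x$w1 word2=0x$w2 start=0x$start stop=0x$stop"
}

decode_dht_layout 0x00000001000000006ffffffc7ffffffa
```

Under that assumption, every brick that returned a value reports the same range, 0x6ffffffc to 0x7ffffffa.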


Version-Release number of selected component (if applicable):
3.12.2-7.el7rhgs.x86_64

How reproducible:
1/1

Steps to Reproduce:
===================
1) Create a Distributed-Disperse volume and start it.
2) FUSE mount it on multiple clients.
3) Create a directory structure as below,
mkdir -p foo/bar/goo
4) Run rm -rf * and mkdir foo at the same time.
Client-1: rm -rf *
Client-2: mkdir foo
Both of the above commands should be started at the same time.
After executing the above commands, run "mkdir foo" repeatedly from the client until it succeeds.
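
The steps above can be sketched as a small script. This is only an approximation: it starts both commands from a single host as background jobs, whereas the report ran them from two separate FUSE clients, so it may not hit the same window. The function name race_once, the retry limit of 100, and the mount point in the usage comment are assumptions made for this illustration.

```shell
#!/bin/sh
# race_once DIR: run steps 3 and 4 inside DIR (in a subshell, so the
# caller's working directory is left unchanged).
race_once() (
    cd "$1" || exit 1
    mkdir -p foo/bar/goo               # step 3
    rm -rf ./* &                       # stands in for Client-1
    mkdir foo 2>/dev/null &            # stands in for Client-2
    wait                               # let both background jobs finish
    # Retry mkdir until it succeeds, or stop early if foo already exists
    # because the racing mkdir won. Give up after 100 attempts.
    tries=0
    until [ -d foo ] || mkdir foo 2>/dev/null; do
        tries=$((tries + 1))
        [ "$tries" -ge 100 ] && exit 1
    done
)

# Usage (hypothetical mount point):
#   race_once /mnt/glustervol
```

After a run, the getfattr command from the description can be repeated on every brick to check whether the layout xattr is present.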

Actual results:
===============
After some iterations:
--> The layout is missing on a few bricks of the disperse sub-vol
--> rm -rf foo fails with an Input/output error:
rm: cannot remove ‘foo’: Input/output error

Expected results:
=================
The layout should be present on all the bricks of the disperse sub-vol.

Comment 5 Prasad Desala 2018-04-12 16:53:56 UTC
Created attachment 1420954
getfattr output of foo directory from all bricks

Comment 12 Ashish Pandey 2018-04-16 11:43:08 UTC
(In reply to Prasad Desala from comment #0)
> Description of problem:
> =======================
> DHT Layout is missing on few bricks of a disperse sub-vol when rm -rf and
> mkdir are run in parallel from multiple clients.
> 
> getfattr -d -e hex -m glusterfs.dht /bricks/brick1/ec-b1/foo
> getfattr: Removing leading '/' from absolute path names
> # file: bricks/brick1/ec-b1/foo
> trusted.glusterfs.dht=0x00000001000000006ffffffc7ffffffa
> trusted.glusterfs.dht.mds=0x00000000
> 
> getfattr -d -e hex -m glusterfs.dht /bricks/brick1/ec-b1/foo
> getfattr: /bricks/brick1/ec-b1/foo: No such file or directory  ---> layout
> missing on this brick

Just a note:

This directory is also present and has a layout.
# file: bricks/brick0/ec1-b1/foo
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.ec.version=0x00000000000001e7000000000000024a
trusted.gfid=0x3749c883f72b4e9ab7ed214729c326ab
trusted.glusterfs.dht=0x00000001000000006ffffffc7ffffffa
trusted.glusterfs.dht.mds=0x00000000

This is the only *brick* in this volume that has a different path:

bricks/brick0/ec1-b1/foo instead of the expected bricks/brick1/ec-b1/foo

That is why it did not show up when we used brick*/ec-b* to get the xattrs.
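
To illustrate the point about the pattern, here is a minimal sketch using shell pattern matching on the two paths. The exact command that was run is not shown in this bug, so the patterns below, the wider ec*-b* variant, and the helper name matches are assumptions. Note that in a case statement * also matches '/', unlike in pathname expansion, which does not change the outcome here.

```shell
#!/bin/sh
# matches PATTERN PATH: succeed if PATH matches the shell pattern.
matches() {
    case "$2" in
        $1) return 0 ;;                # $1 is unquoted so it acts as a pattern
        *)  return 1 ;;
    esac
}

# The usual brick path is found by the narrow pattern.
matches 'bricks/brick*/ec-b*/foo' 'bricks/brick1/ec-b1/foo' && echo "found"

# The odd brick path has "ec1-b1", which "ec-b*" cannot match.
matches 'bricks/brick*/ec-b*/foo' 'bricks/brick0/ec1-b1/foo' || echo "missed"

# A wider pattern covers both spellings.
matches 'bricks/brick*/ec*-b*/foo' 'bricks/brick0/ec1-b1/foo' && echo "found with wider pattern"
```

Collecting the brick paths from the volume definition rather than from a glob would avoid this kind of miss.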







Comment 19 Atin Mukherjee 2018-11-09 11:13:39 UTC
Has this been hit during RHGS 3.4 regression testing? If not, can this be closed, please?

