Description of problem:
=======================
The DHT layout is missing on a few bricks of a disperse sub-vol when rm -rf and mkdir are run in parallel from multiple clients.

getfattr -d -e hex -m glusterfs.dht /bricks/brick1/ec-b1/foo
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/ec-b1/foo
trusted.glusterfs.dht=0x00000001000000006ffffffc7ffffffa
trusted.glusterfs.dht.mds=0x00000000

getfattr -d -e hex -m glusterfs.dht /bricks/brick1/ec-b1/foo
getfattr: /bricks/brick1/ec-b1/foo: No such file or directory ---> layout missing on this brick

getfattr -d -e hex -m glusterfs.dht /bricks/brick1/ec-b1/foo
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/ec-b1/foo
trusted.glusterfs.dht=0x00000001000000006ffffffc7ffffffa
trusted.glusterfs.dht.mds=0x00000000

getfattr -d -e hex -m glusterfs.dht /bricks/brick1/ec-b1/foo
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/ec-b1/foo
trusted.glusterfs.dht=0x00000001000000006ffffffc7ffffffa
trusted.glusterfs.dht.mds=0x00000000

getfattr -d -e hex -m glusterfs.dht /bricks/brick1/ec-b1/foo
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/ec-b1/foo
trusted.glusterfs.dht=0x00000001000000006ffffffc7ffffffa
trusted.glusterfs.dht.mds=0x00000000

getfattr -d -e hex -m glusterfs.dht /bricks/brick1/ec-b1/foo
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/ec-b1/foo
trusted.glusterfs.dht=0x00000001000000006ffffffc7ffffffa
trusted.glusterfs.dht.mds=0x00000000

Version-Release number of selected component (if applicable):
3.12.2-7.el7rhgs.x86_64

How reproducible:
1/1

Steps to Reproduce:
===================
1) Create a Distributed-Disperse volume and start it.
2) FUSE mount it on multiple clients.
3) Create a directory structure as below:
   mkdir -p foo/bar/goo
4) Run rm -rf * and mkdir foo at the same time:
   Client-1: rm -rf *
   Client-2: mkdir foo
   Both commands should be started at once.

After executing the above commands, run "mkdir foo" repeatedly from the client until mkdir foo succeeds.

Actual results:
===============
After some iterations:
--> Layout is missing on a few bricks of the disperse sub-vol
--> rm -rf foo fails with Input/output error
rm: cannot remove ‘foo’: Input/output error

Expected results:
=================
Layout should be present on all the bricks of the disperse sub-vol.
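The steps to reproduce can be approximated from a single host with a small script. This is only a sketch, not the script used in the original test: it assumes the same volume is FUSE-mounted at two mount points on one machine (the two arguments are hypothetical paths) in place of two separate clients, and the retry bound of 50 is arbitrary.

```bash
#!/bin/sh
# Sketch: race "rm -rf *" against "mkdir foo" on two mounts of one volume.
# The mount points are hypothetical; nothing runs unless both are given.

race_once() {
    mnt1="$1"
    mnt2="$2"

    # Step 3: create the directory tree through the first mount.
    mkdir -p "$mnt1/foo/bar/goo" || return 1

    # Step 4: start both operations together, then wait for both to finish.
    ( cd "$mnt1" && rm -rf -- ./* ) 2>/dev/null &
    mkdir "$mnt2/foo" 2>/dev/null &
    wait

    # Keep retrying mkdir until foo exists (bounded, unlike the manual retry).
    for _ in $(seq 1 50); do
        mkdir "$mnt2/foo" 2>/dev/null && return 0
        [ -d "$mnt2/foo" ] && return 0
    done
    return 1
}

# Run only when two mount points are passed on the command line.
if [ "$#" -eq 2 ]; then
    race_once "$1" "$2"
fi
```

Because the failure depends on timing, a single call may not trigger it; the report saw the problem only "after some iterations", so the function would need to be called in a loop.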
Created attachment 1420954
getfattr output of foo directory from all bricks
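When comparing the attached getfattr output across bricks, it can help to pull the hash range out of the trusted.glusterfs.dht value. The helper below is a sketch that assumes the value is four big-endian 32-bit words whose last two words are the start and end of the hash range; the first two words are left uninterpreted. The function name is made up for illustration.

```bash
#!/bin/sh
# Sketch: extract the hash range from a trusted.glusterfs.dht hex value.
# Assumption: 16 bytes, with words 3 and 4 holding the range start and stop.

decode_dht_range() {
    hex="${1#0x}"                       # drop the leading 0x, if any
    if [ "${#hex}" -ne 32 ]; then
        echo "unexpected xattr length: ${#hex} hex digits" >&2
        return 1
    fi
    start=$(printf '%s' "$hex" | cut -c17-24)
    stop=$(printf '%s' "$hex" | cut -c25-32)
    printf 'start=0x%s stop=0x%s\n' "$start" "$stop"
}
```

For the value shown in this report, `decode_dht_range 0x00000001000000006ffffffc7ffffffa` prints `start=0x6ffffffc stop=0x7ffffffa`. All bricks of one disperse sub-vol are expected to report the same range.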
(In reply to Prasad Desala from comment #0)
> Description of problem:
> =======================
> DHT Layout is missing on few bricks of a disperse sub-vol when rm -rf and
> mkdir are run in parallel from multiple clients.
>
> getfattr -d -e hex -m glusterfs.dht /bricks/brick1/ec-b1/foo
> getfattr: Removing leading '/' from absolute path names
> # file: bricks/brick1/ec-b1/foo
> trusted.glusterfs.dht=0x00000001000000006ffffffc7ffffffa
> trusted.glusterfs.dht.mds=0x00000000
>
> getfattr -d -e hex -m glusterfs.dht /bricks/brick1/ec-b1/foo
> getfattr: /bricks/brick1/ec-b1/foo: No such file or directory ---> layout
> missing on this brick

Just a note: this directory is also present and has a layout.

# file: bricks/brick0/ec1-b1/foo
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.ec.version=0x00000000000001e7000000000000024a
trusted.gfid=0x3749c883f72b4e9ab7ed214729c326ab
trusted.glusterfs.dht=0x00000001000000006ffffffc7ffffffa
trusted.glusterfs.dht.mds=0x00000000

This is the only *brick* in this volume that has a different path: it is bricks/brick0/ec1-b1/foo, while it should be bricks/brick1/ec-b1/foo. That is the reason it did not show up when we used brick*/ec-b* to get the xattrs.
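Since a glob such as brick*/ec-b* can miss a brick whose path differs, one option is to pass every brick root explicitly (for example, copied from the brick list in `gluster volume info`) and check each one. The function below is a sketch under that assumption; its name and output format are made up for illustration, and it needs getfattr installed and root privileges to read trusted.* xattrs.

```bash
#!/bin/sh
# Sketch: report, per brick root, whether <dir> exists and carries the
# trusted.glusterfs.dht xattr. Brick roots are given explicitly so that a
# brick with an unusual path is not skipped by a glob.
# Usage: check_layout <dir> <brick-root>...

check_layout() {
    dir="$1"
    shift
    for brick in "$@"; do
        if [ ! -d "$brick/$dir" ]; then
            echo "$brick/$dir: directory missing"
        elif ! getfattr -n trusted.glusterfs.dht -e hex "$brick/$dir" \
                >/dev/null 2>&1; then
            # Also reached if getfattr is not installed or lacks permission.
            echo "$brick/$dir: layout xattr missing"
        else
            echo "$brick/$dir: ok"
        fi
    done
}
```

With the two paths named in this comment, the call would be `check_layout foo /bricks/brick1/ec-b1 /bricks/brick0/ec1-b1`, run on the node that hosts those bricks.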
Has this been hit during RHGS 3.4 regression testing? If not, can this be closed, please?