Bug 1715438 - directories going into split-brain
Summary: directories going into split-brain
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: replicate
Version: rhgs-3.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 3.5.0
Assignee: Karthik U S
QA Contact: Nag Pavan Chilakam
URL:
Whiteboard:
Depends On:
Blocks: 1696809
Reported: 2019-05-30 11:13 UTC by Nag Pavan Chilakam
Modified: 2019-10-30 13:52 UTC (History)

Fixed In Version: glusterfs-6.0-5
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-30 12:21:50 UTC
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2019:3249 0 None None None 2019-10-30 12:22:07 UTC

Description Nag Pavan Chilakam 2019-05-30 11:13:51 UTC
Description of problem:
=======================
Directories are going into split-brain when I reboot a node.
On my system test bed, I was creating the same directory structure from multiple clients. I then rebooted a node, and that caused some directories to go into split-brain.


Version-Release number of selected component (if applicable):
===============
6.0.3 on rhel7.7beta

How reproducible:
============
2/2

Steps to Reproduce:
================
1. Created a 4x3 volume.
2. Mounted the volume on 10 clients.
3. Ran different kinds of I/O from the clients.
4. Under one directory (not touched by the above I/O), started creating a multi-level breadth-and-depth directory structure from multiple clients.
5. Rebooted a node.
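
Step 4 can be approximated with a short script run concurrently on each client. This is only a sketch, not the reporter's actual workload: the helper name `create_tree`, the default depth and breadth, and any mount point passed to it are assumptions. The `levelN.M` naming mirrors the paths that appear in the heal output.

```python
import itertools
import os


def create_tree(root, depth=5, breadth=3):
    """Create root/level1.a/level2.b/... for every index combination.

    Returns the number of leaf directories requested. exist_ok=True lets
    several clients race on the same names without failing on an
    already-existing directory.
    """
    leaves = 0
    for combo in itertools.product(range(1, breadth + 1), repeat=depth):
        parts = [f"level{lvl}.{idx}" for lvl, idx in enumerate(combo, start=1)]
        os.makedirs(os.path.join(root, *parts), exist_ok=True)
        leaves += 1
    return leaves
```

For example, each client could call `create_tree("/mnt/nftvol/IOs/samedir-creates")`, where `/mnt/nftvol` is a hypothetical mount point, while one of the server nodes is rebooted.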

Actual results:
===========
A few directories are in split-brain.

Expected results:
===============
No directories should go into split-brain.


[root@rhs-gp-srv7 ~]# gluster v info
 
Volume Name: nftvol
Type: Distributed-Replicate
Volume ID: bb49cec5-d750-4e2f-a332-8da43efaf2d3
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x 3 = 12
Transport-type: tcp
Bricks:
Brick1: rhs-gp-srv7.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-b1
Brick2: rhs-gp-srv8.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-b1
Brick3: rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-b1
Brick4: rhs-gp-srv8.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-b2
Brick5: rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-b2
Brick6: rhs-gp-srv10.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-b2
Brick7: rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-b3
Brick8: rhs-gp-srv10.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-b3
Brick9: rhs-gp-srv7.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-b3
Brick10: rhs-gp-srv10.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-b4
Brick11: rhs-gp-srv7.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-b4
Brick12: rhs-gp-srv8.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-b4
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
features.quota: on
features.inode-quota: on
features.quota-deem-statfs: on
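
To read the heal output and the `trusted.afr.nftvol-client-N` xattrs, it helps to map each brick to its replica set. The sketch below assumes the usual Gluster conventions that consecutive bricks in the `gluster v info` listing form a replica set and that client index N corresponds to Brick N+1. The function name is illustrative and the hostnames are shortened.

```python
def replica_layout(bricks, replica_count):
    """Group bricks (in 'gluster v info' order) into replica sets.

    Returns a list of sets; each set is a list of (client_index, brick),
    where client_index is the 0-based N in trusted.afr.<vol>-client-N
    (assumed to be brick number minus one).
    """
    if len(bricks) % replica_count:
        raise ValueError("brick count is not a multiple of the replica count")
    indexed = list(enumerate(bricks))
    return [indexed[i:i + replica_count]
            for i in range(0, len(indexed), replica_count)]


# Brick paths taken from the volume info above; domain suffix dropped.
BRICKS = [
    "srv7:/gluster/brick1/nftvol-b1",
    "srv8:/gluster/brick1/nftvol-b1",
    "srv9:/gluster/brick1/nftvol-b1",
    "srv8:/gluster/brick2/nftvol-b2",
    "srv9:/gluster/brick2/nftvol-b2",
    "srv10:/gluster/brick1/nftvol-b2",
    "srv9:/gluster/brick3/nftvol-b3",
    "srv10:/gluster/brick2/nftvol-b3",
    "srv7:/gluster/brick2/nftvol-b3",
    "srv10:/gluster/brick3/nftvol-b4",
    "srv7:/gluster/brick3/nftvol-b4",
    "srv8:/gluster/brick3/nftvol-b4",
]

for n, subvol in enumerate(replica_layout(BRICKS, 3)):
    print(f"replica set {n}:", ", ".join(f"client-{i}={b}" for i, b in subvol))
```

Under these assumptions, client-3, client-4 and client-5 are the three `nftvol-b2` bricks on srv8, srv9 and srv10.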


Note: refer to "/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60", one of the directories reported in split-brain below.




[root@rhs-gp-srv7 ~]# gluster  v heal nftvol info
Brick rhs-gp-srv7.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-b1
/IOs/kernel/dhcp43-46.lab.eng.blr.redhat.com/dir.41/linux-5.1.3/include 
<gfid:eabe957a-69a4-44e8-80bf-91ed7ede4f93> - Is in split-brain
/IOs/kernel/dhcp43-21.lab.eng.blr.redhat.com/dir.37/linux-5.1.3/drivers/oprofile 
/IOs/samedir-creates/level1.1/level2.3/level3.28/level4.13/level5.9 
Status: Connected
Number of entries: 4

Brick rhs-gp-srv8.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-b1
/IOs/samedir-creates/level1.1/level2.3/level3.28/level4.13 
/IOs/samedir-creates/level1.1/level2.3/level3.28/level4.13/level5.9 - Is in split-brain
Status: Connected
Number of entries: 2

Brick rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-b1
/IOs/samedir-creates/level1.1/level2.3/level3.28/level4.13/level5.9 
/IOs/kernel/dhcp43-234.lab.eng.blr.redhat.com/dir.37/linux-5.1.3/drivers/char 
Status: Connected
Number of entries: 2

Brick rhs-gp-srv8.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-b2
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60 - Is in split-brain
Status: Connected
Number of entries: 1

Brick rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-b2
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60 - Is in split-brain
/IOs/kernel/dhcp43-234.lab.eng.blr.redhat.com/dir.37/linux-5.1.3/drivers/char 
Status: Connected
Number of entries: 2

Brick rhs-gp-srv10.lab.eng.blr.redhat.com:/gluster/brick1/nftvol-b2
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60 - Is in split-brain
Status: Connected
Number of entries: 1

Brick rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-b3
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.33/level5.58 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.33 
Status: Connected
Number of entries: 2

Brick rhs-gp-srv10.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-b3
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.35/level5.97 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.35 
Status: Connected
Number of entries: 2

Brick rhs-gp-srv7.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-b3
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.33/level5.58 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.33 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.35/level5.97 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.35 
Status: Connected
Number of entries: 4

Brick rhs-gp-srv10.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-b4
Status: Connected
Number of entries: 0

Brick rhs-gp-srv7.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-b4
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.32/level5.3 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.32 
Status: Connected
Number of entries: 2

Brick rhs-gp-srv8.lab.eng.blr.redhat.com:/gluster/brick3/nftvol-b4
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.32/level5.3 
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.32 
Status: Connected
Number of entries: 2
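
The split-brain entries can be pulled out of the `gluster v heal nftvol info` output mechanically. A minimal sketch, assuming the text format shown above (a `Brick ...` header, one entry per line, and a ` - Is in split-brain` suffix on affected entries); the function name is illustrative and the sample is a shortened copy of the output:

```python
from collections import defaultdict

SUFFIX = " - Is in split-brain"


def split_brain_entries(heal_info_text):
    """Map each split-brain entry to the list of bricks that report it."""
    result = defaultdict(list)
    brick = None
    for raw in heal_info_text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            brick = line[len("Brick "):]
        elif line.endswith(SUFFIX) and brick is not None:
            result[line[:-len(SUFFIX)].rstrip()].append(brick)
    return dict(result)


SAMPLE = """\
Brick rhs-gp-srv8.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-b2
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60 - Is in split-brain
Status: Connected
Number of entries: 1

Brick rhs-gp-srv9.lab.eng.blr.redhat.com:/gluster/brick2/nftvol-b2
/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60 - Is in split-brain
/IOs/kernel/dhcp43-234.lab.eng.blr.redhat.com/dir.37/linux-5.1.3/drivers/char
Status: Connected
Number of entries: 2
"""

for entry, bricks in split_brain_entries(SAMPLE).items():
    print(entry, "->", len(bricks), "brick(s)")
```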

Comment 3 Nag Pavan Chilakam 2019-05-30 12:25:08 UTC
sosreports @ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/nchilaka/bug.1715438/

n1:
[root@rhs-gp-srv7 ~]# getfattr -d -m . -e hex /gluster/brick*/nftvol*/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60
getfattr: Removing leading '/' from absolute path names
# file: gluster/brick1/nftvol-b1/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.nftvol-client-1=0x000000000000000000000000
trusted.gfid=0x05184e8d5e4349dd9b51080444feb799
trusted.glusterfs.dht=0x00000001000000007f8b5a1cbf510729
trusted.glusterfs.mdata=0x010000000000000000000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e
trusted.glusterfs.quota.588a733c-9b13-4824-af86-0a3a037c2bc5.contri.1=0x000000000000000000000000000000000000000000000001
trusted.glusterfs.quota.size.1=0x000000000000000000000000000000000000000000000001

# file: gluster/brick2/nftvol-b3/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.gfid=0x05184e8d5e4349dd9b51080444feb799
trusted.glusterfs.dht=0x0000000100000000000000003fc5ad0d
trusted.glusterfs.mdata=0x010000000000000000000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e
trusted.glusterfs.quota.588a733c-9b13-4824-af86-0a3a037c2bc5.contri.1=0x000000000000000000000000000000000000000000000001
trusted.glusterfs.quota.size.1=0x000000000000000000000000000000000000000000000001

# file: gluster/brick3/nftvol-b4/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.nftvol-client-11=0x000000000000000000000000
trusted.gfid=0x05184e8d5e4349dd9b51080444feb799
trusted.glusterfs.dht=0x00000001000000003fc5ad0e7f8b5a1b
trusted.glusterfs.mdata=0x010000000000000000000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e
trusted.glusterfs.quota.588a733c-9b13-4824-af86-0a3a037c2bc5.contri.1=0x000000000000000000000000000000000000000000000001
trusted.glusterfs.quota.size.1=0x000000000000000000000000000000000000000000000001



n2:
[root@rhs-gp-srv8 ~]# getfattr -d -m . -e hex /gluster/brick*/nftvol*/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60
getfattr: Removing leading '/' from absolute path names
# file: gluster/brick1/nftvol-b1/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.gfid=0x05184e8d5e4349dd9b51080444feb799
trusted.glusterfs.dht=0x00000001000000007f8b5a1cbf510729
trusted.glusterfs.mdata=0x010000000000000000000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e
trusted.glusterfs.quota.588a733c-9b13-4824-af86-0a3a037c2bc5.contri.1=0x000000000000000000000000000000000000000000000001
trusted.glusterfs.quota.size.1=0x000000000000000000000000000000000000000000000001

# file: gluster/brick2/nftvol-b2/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.nftvol-client-4=0x000000000000000100000000
trusted.afr.nftvol-client-5=0x000000000000000100000000
trusted.gfid=0x05184e8d5e4349dd9b51080444feb799
trusted.glusterfs.quota.588a733c-9b13-4824-af86-0a3a037c2bc5.contri.1=0x000000000000000000000000000000000000000000000001
trusted.glusterfs.quota.size.1=0x000000000000000000000000000000000000000000000001

# file: gluster/brick3/nftvol-b4/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.gfid=0x05184e8d5e4349dd9b51080444feb799
trusted.glusterfs.dht=0x00000001000000003fc5ad0e7f8b5a1b
trusted.glusterfs.mdata=0x010000000000000000000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e
trusted.glusterfs.quota.588a733c-9b13-4824-af86-0a3a037c2bc5.contri.1=0x000000000000000000000000000000000000000000000001
trusted.glusterfs.quota.size.1=0x000000000000000000000000000000000000000000000001



n3:
[root@rhs-gp-srv9 ~]# getfattr -d -m . -e hex /gluster/brick*/nftvol*/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60
getfattr: Removing leading '/' from absolute path names
# file: gluster/brick1/nftvol-b1/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.nftvol-client-1=0x000000000000000000000000
trusted.gfid=0x05184e8d5e4349dd9b51080444feb799
trusted.glusterfs.dht=0x00000001000000007f8b5a1cbf510729
trusted.glusterfs.mdata=0x010000000000000000000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e
trusted.glusterfs.quota.588a733c-9b13-4824-af86-0a3a037c2bc5.contri.1=0x000000000000000000000000000000000000000000000001
trusted.glusterfs.quota.size.1=0x000000000000000000000000000000000000000000000001

# file: gluster/brick2/nftvol-b2/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.nftvol-client-3=0x000000000000000200000000
trusted.gfid=0x05184e8d5e4349dd9b51080444feb799
trusted.glusterfs.dht=0x0000000000000000bf51072affffffff
trusted.glusterfs.dht.mds=0x00000000
trusted.glusterfs.mdata=0x010000000000000000000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e
trusted.glusterfs.quota.588a733c-9b13-4824-af86-0a3a037c2bc5.contri.1=0x000000000000000000000000000000000000000000000001
trusted.glusterfs.quota.size.1=0x000000000000000000000000000000000000000000000001

# file: gluster/brick3/nftvol-b3/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.gfid=0x05184e8d5e4349dd9b51080444feb799
trusted.glusterfs.dht=0x0000000100000000000000003fc5ad0d
trusted.glusterfs.mdata=0x010000000000000000000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e
trusted.glusterfs.quota.588a733c-9b13-4824-af86-0a3a037c2bc5.contri.1=0x000000000000000000000000000000000000000000000001
trusted.glusterfs.quota.size.1=0x000000000000000000000000000000000000000000000001



n4:
[root@rhs-gp-srv10 ~]# getfattr -d -m . -e hex /gluster/brick*/nftvol*/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60
getfattr: Removing leading '/' from absolute path names
# file: gluster/brick1/nftvol-b2/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.nftvol-client-3=0x000000000000000200000000
trusted.gfid=0x05184e8d5e4349dd9b51080444feb799
trusted.glusterfs.dht=0x0000000000000000bf51072affffffff
trusted.glusterfs.dht.mds=0x00000000
trusted.glusterfs.mdata=0x010000000000000000000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e
trusted.glusterfs.quota.588a733c-9b13-4824-af86-0a3a037c2bc5.contri.1=0x000000000000000000000000000000000000000000000001
trusted.glusterfs.quota.size.1=0x000000000000000000000000000000000000000000000001

# file: gluster/brick2/nftvol-b3/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.gfid=0x05184e8d5e4349dd9b51080444feb799
trusted.glusterfs.dht=0x0000000100000000000000003fc5ad0d
trusted.glusterfs.mdata=0x010000000000000000000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e
trusted.glusterfs.quota.588a733c-9b13-4824-af86-0a3a037c2bc5.contri.1=0x000000000000000000000000000000000000000000000001
trusted.glusterfs.quota.size.1=0x000000000000000000000000000000000000000000000001

# file: gluster/brick3/nftvol-b4/IOs/samedir-creates/newiter/level1.1/level2.1/level3.1/level4.7/level5.60
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.nftvol-client-11=0x000000000000000000000000
trusted.gfid=0x05184e8d5e4349dd9b51080444feb799
trusted.glusterfs.dht=0x00000001000000003fc5ad0e7f8b5a1b
trusted.glusterfs.mdata=0x010000000000000000000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e000000005cefb41b000000002c07752e
trusted.glusterfs.quota.588a733c-9b13-4824-af86-0a3a037c2bc5.contri.1=0x000000000000000000000000000000000000000000000001
trusted.glusterfs.quota.size.1=0x000000000000000000000000000000000000000000000001

Comment 4 Nag Pavan Chilakam 2019-05-30 12:34:55 UTC
From the above it can be seen that, using the 0-based nftvol-client-N indices from the trusted.afr xattrs,

b3 is blaming b4 and b5, while b4 and b5 are blaming b3.
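
The blame relationship can be read directly from the `trusted.afr.nftvol-client-N` values in Comment 3. A minimal sketch, assuming the usual AFR changelog layout of three big-endian 32-bit counters (pending data, metadata and entry operations) and assuming that the brick holding each xattr has client index equal to its brick number minus one; the function and variable names are illustrative:

```python
import struct


def decode_afr(hex_value):
    """Decode a trusted.afr.* xattr value as printed by getfattr -e hex."""
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


# Values copied from the level5.60 directory on the three nftvol-b2 bricks.
# Keys: (assumed client index of the brick holding the xattr, client named in it).
XATTRS = {
    ("client-3", "client-4"): "0x000000000000000100000000",
    ("client-3", "client-5"): "0x000000000000000100000000",
    ("client-4", "client-3"): "0x000000000000000200000000",
    ("client-5", "client-3"): "0x000000000000000200000000",
}

for (holder, target), value in XATTRS.items():
    pending = {k: v for k, v in decode_afr(value).items() if v}
    if pending:
        print(f"{holder} blames {target}: {pending}")
```

Under these assumptions, every value decodes to a non-zero metadata counter only, which matches the mutual blame described above.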

Comment 21 errata-xmlrpc 2019-10-30 12:21:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:3249

