Bug 1384983 - split-brain observed with arbiter & replica 3 volume.
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: arbiter
Version: 3.2
Hardware: x86_64 Linux
Priority: high   Severity: high
Target Milestone: ---
Target Release: RHGS 3.4.0
Assigned To: Ravishankar N
QA Contact: SATHEESARAN
Depends On: 1506140 1539358 1541458 1542380 1542382 1597120 1597123
Blocks: 1503134
 
Reported: 2016-10-14 09:43 EDT by RamaKasturi
Modified: 2018-09-18 06:29 EDT
CC List: 13 users

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-09-04 02:29:44 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments


External Trackers:
Red Hat Product Errata RHSA-2018:2607 (last updated 2018-09-04 02:31 EDT)

Description RamaKasturi 2016-10-14 09:43:54 EDT
Description of problem:
Did a forced volume start (gluster volume start <vol> force) to bring up one of the bricks, but it had no effect on the stopped brick. I then stopped all my volumes and rebooted my nodes, and files in the engine volume are now in split-brain.
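
For reference, a minimal sketch of the commands involved (the 'engine' volume name is from this setup; the exact invocations are not recorded verbatim in this report):

gluster volume start engine force             # attempt to restart the offline brick
gluster volume status engine                  # check whether the brick came back online
gluster volume heal engine info split-brain   # list files reported in split-brain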

Version-Release number of selected component (if applicable):
glusterfs-3.8.4-2.el7rhgs.x86_64

How reproducible:
seen it once

Steps to Reproduce:
1. 
2.
3.

Actual results:
files in engine volume are in split-brain state

Expected results:
files should not be in split-brain.

Additional info:
Comment 2 RamaKasturi 2016-10-14 09:47:18 EDT
Files in split-brain
===================================================
[root@rhsqa-grafton2 ~]# gluster volume heal engine info split-brain
Brick 10.70.36.79:/rhgs/brick1/engine
/__DIRECT_IO_TEST__
Status: Connected
Number of entries in split-brain: 1

Brick 10.70.36.80:/rhgs/brick1/engine
/53c84f1e-3643-45aa-805e-8c9e92ee3098/ha_agent
/__DIRECT_IO_TEST__
Status: Connected
Number of entries in split-brain: 2

Brick 10.70.36.81:/rhgs/brick1/engine
/__DIRECT_IO_TEST__
/53c84f1e-3643-45aa-805e-8c9e92ee3098/ha_agent
Status: Connected
Number of entries in split-brain: 2

getfattr output for the files which are in split-brain
==================================================
[root@rhsqa-grafton1 ~]# getfattr -d -m . -e hex /rhgs/brick1/engine/__DIRECT_IO_TEST__
getfattr: Removing leading '/' from absolute path names
# file: rhgs/brick1/engine/__DIRECT_IO_TEST__
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-1=0x0000000000000b5e00000000
trusted.afr.engine-client-2=0x000000000000000000000000
trusted.gfid=0x9202d90daed441a69b7538d4d6eae1b1
trusted.glusterfs.shard.block-size=0x0000000020000000
trusted.glusterfs.shard.file-size=0x0000000000000000000000000000000000000000000000000000000000000000

[root@rhsqa-grafton2 ~]# getfattr -d -m . -e hex /rhgs/brick1/engine/53c84f1e-3643-45aa-805e-8c9e92ee3098/ha_agent
getfattr: Removing leading '/' from absolute path names
# file: rhgs/brick1/engine/53c84f1e-3643-45aa-805e-8c9e92ee3098/ha_agent
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000001
trusted.afr.engine-client-0=0x0000000500000000000000e5
trusted.gfid=0x4da13f61cc0b4d46ae303f2676866f06
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

[root@rhsqa-grafton2 ~]# getfattr -d -m . -e hex /rhgs/brick1/engine/__DIRECT_IO_TEST__
getfattr: Removing leading '/' from absolute path names
# file: rhgs/brick1/engine/__DIRECT_IO_TEST__
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-0=0x000000000000000200000000
trusted.afr.engine-client-2=0x000000000000000100000000
trusted.gfid=0x9202d90daed441a69b7538d4d6eae1b1
trusted.glusterfs.shard.block-size=0x0000000020000000
trusted.glusterfs.shard.file-size=0x0000000000000000000000000000000000000000000000000000000000000000

[root@rhsqa-grafton3 ~]# getfattr -d -m . -e hex /rhgs/brick1/engine/53c84f1e-3643-45aa-805e-8c9e92ee3098/ha_agent
getfattr: Removing leading '/' from absolute path names
# file: rhgs/brick1/engine/53c84f1e-3643-45aa-805e-8c9e92ee3098/ha_agent
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x0000000000000000000014f3
trusted.afr.engine-client-0=0x0000000500000000000000b1
trusted.afr.engine-client-1=0x000000000000000000000000
trusted.gfid=0x4da13f61cc0b4d46ae303f2676866f06
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

[root@rhsqa-grafton3 ~]# getfattr -d -m . -e hex /rhgs/brick1/engine/__DIRECT_IO_TEST__
getfattr: Removing leading '/' from absolute path names
# file: rhgs/brick1/engine/__DIRECT_IO_TEST__
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-0=0x000000000000000000000000
trusted.afr.engine-client-1=0x0000000000000b5e00000000
trusted.gfid=0x9202d90daed441a69b7538d4d6eae1b1
trusted.glusterfs.shard.block-size=0x0000000020000000
trusted.glusterfs.shard.file-size=0x0000000000000000000000000000000000000000000000000000000000000000
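
Note on reading the values above: each trusted.afr.<volume>-client-<N> xattr packs three 32-bit counters - the data, metadata and entry pending counts that this brick holds against the brick backing client-<N>. A small sketch to split one of the values (file and xattr names taken from the output above):

x=$(getfattr -n trusted.afr.engine-client-1 -e hex --only-values /rhgs/brick1/engine/__DIRECT_IO_TEST__)
x=${x#0x}                                                     # drop the 0x prefix if present
echo "data=0x${x:0:8} metadata=0x${x:8:8} entry=0x${x:16:8}"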
Comment 3 RamaKasturi 2016-10-14 09:48:43 EDT
sosreports can be found in the link below:
==================================================

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/HC/split_brain/
Comment 4 RamaKasturi 2016-10-14 09:55:50 EDT
volume info for engine vol:
==========================
[root@rhsqa-grafton1 ~]# gluster volume info engine
 
Volume Name: engine
Type: Replicate
Volume ID: 03c68517-4be1-45e3-b788-87e10d73f3ee
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.36.79:/rhgs/brick1/engine
Brick2: 10.70.36.80:/rhgs/brick1/engine
Brick3: 10.70.36.81:/rhgs/brick1/engine (arbiter)
Options Reconfigured:
server.ssl: on
client.ssl: on
auth.ssl-allow: 10.70.36.79,10.70.36.80,10.70.36.81
performance.strict-o-direct: on
user.cifs: off
network.ping-timeout: 30
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
performance.low-prio-threads: 32
features.shard-block-size: 512MB
features.shard: on
storage.owner-gid: 36
storage.owner-uid: 36
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: off
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
cluster.granular-entry-heal: on
Comment 8 SATHEESARAN 2017-06-16 03:37:33 EDT
Resetting all the acks which were previously present on this bug.
Comment 9 SATHEESARAN 2017-06-16 03:40:03 EDT
I could consistently hit this issue with an arbiter volume on RHGS 3.3.0 (interim build)
- glusterfs-3.8.4-28.el7rhgs
Comment 10 SATHEESARAN 2017-06-16 03:56:15 EDT
Tested with the RHGS 3.3.0 interim build (glusterfs-3.8.4-28.el7rhgs) and I could consistently hit this issue, along with the other issue of the arbiter becoming the source of heal (BZ 1401969).

Very simple test is to:
1. Create arbiter volume 1x (2+1) with bricks - brick1, brick2, arbiter
2. Fuse mount it on any RHEL 7 client
3. Run some app (dd, truncate, etc.) on a single file
4. Kill brick2
5. sleep for 3 seconds
6. Bring up brick2, sleep for 3 seconds, kill arbiter
7. sleep for 3 seconds
8. Bring up arbiter, sleep for 3 seconds, kill brick1
9. sleep for 3 seconds
10. continue with step 4

When the above steps are repeated, I observed that I ended up either in a split-brain or with the arbiter becoming the source of heal (BZ 1401969).
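
For completeness, a rough sketch of the volume creation and fuse mount from steps 1-2 above (the 'arbvol' name and brick path are borrowed from the changelog dumps in comment 21; hostnames are placeholders):

gluster volume create arbvol replica 3 arbiter 1 \
    node1:/gluster/brick1/b1 node2:/gluster/brick1/b1 node3:/gluster/brick1/b1
gluster volume start arbvol
mount -t glusterfs node1:/arbvol /mnt/test    # fuse mount on a RHEL 7 client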
Comment 19 SATHEESARAN 2017-07-13 01:59:49 EDT
Additional information: I am able to hit the split-brain issue with a replica 3 volume as well.

Here are the steps to reproduce.

Setup details
--------------
1. 3-node Gluster cluster (node1, node2, node3)
2. Create a replica 3 volume
3. Mount it on node1 (a rough sketch of steps 2-3 follows)
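
Steps 2-3 would mirror the earlier sketch, just without the arbiter keyword (volume name 'repvol3' and brick paths are hypothetical; /mnt/test matches the MOUNTPATH used by script2 below):

gluster volume create repvol3 replica 3 node1:/bricks/b1 node2:/bricks/b1 node3:/bricks/b1
gluster volume start repvol3
mount -t glusterfs node1:/repvol3 /mnt/test   # fuse mount on node1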

Steps
------

There are 2 scripts run in parallel to reproduce this issue.

Script1 kills and then restarts bricks in a cyclic fashion across all the bricks, in such a way that there are always 2 bricks alive at any instant.

# Script1 (the original steps were recorded as pseudocode; ssh, pkill and
# systemctl are used here as one way to realise them - adjust to your setup).
cycle_node() {
    local node=$1
    ssh "$node" "pkill -f glusterfsd"        # kill the brick process on this node (assumes one brick per node)
    ssh "$node" "systemctl stop glusterd"
    sleep 3
    ssh "$node" "systemctl start glusterd"   # this also starts the brick on this node
    sleep 1
}

while true; do
    cycle_node node2    # brick2
    cycle_node node3    # brick3
    cycle_node node1    # brick1
done

Script2 does I/O on the fuse mount while the bricks are killed and started by script1:

# Script2: continuously rewrite, truncate and overwrite a single file on the
# fuse mount, logging each step to the mount log.
MOUNTPATH=/mnt/test
while true; do
    echo "dd if=/dev/urandom of=$MOUNTPATH/FILE bs=128k count=10" >> /var/log/glusterfs/mnt-test.log
    dd if=/dev/urandom of="$MOUNTPATH/FILE" bs=128k count=10
    echo "truncate $MOUNTPATH/FILE --size 5K" >> /var/log/glusterfs/mnt-test.log
    truncate --size 5K "$MOUNTPATH/FILE"
    echo "cat /home/template > $MOUNTPATH/FILE" >> /var/log/glusterfs/mnt-test.log
    cat /home/template > "$MOUNTPATH/FILE"
    echo "truncate $MOUNTPATH/FILE --size 100k" >> /var/log/glusterfs/mnt-test.log
    truncate --size 100k "$MOUNTPATH/FILE"
done


When I ran the above scripts with replica 3, I could still see the file on the fuse mount in a split-brain state.
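
The split-brain state can be confirmed with the same commands used earlier in this bug (volume name is a placeholder; the brick path matches the dumps in comment 21):

gluster volume heal <volname> info split-brain    # FILE shows up under more than one brick
getfattr -d -m. -e hex /gluster/brick1/b1/FILE    # inspect the AFR changelog xattrs on each brick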
Comment 20 SATHEESARAN 2017-07-13 02:19:30 EDT
I could hit this split-brain issue very consistently with the scripts described in comment 19.
Comment 21 SATHEESARAN 2017-07-13 03:57:24 EDT
Tested with Ravi's fix and also with cluster.eager-lock=off, but I could still end up in a split-brain scenario.
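
For reference, disabling eager-lock for such a test would be done with a volume set (exact invocation not recorded here; 'arbvol' is the volume from the dumps below):

gluster volume set arbvol cluster.eager-lock off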

Here are the changelogs from all the bricks:

Brick1
-------
# getfattr -d -m. -ehex /gluster/brick1/b1/FILE 
getfattr: Removing leading '/' from absolute path names
# file: gluster/brick1/b1/FILE
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.arbvol-client-1=0x00006ba80000000000000000
trusted.afr.arbvol-client-2=0x000000010000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0xbadce32eff854b928546c7fff5a63b30

Brick2
-------
# getfattr -d -m. -ehex /gluster/brick1/b1/FILE
getfattr: Removing leading '/' from absolute path names
# file: gluster/brick1/b1/FILE
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.arbvol-client-0=0x000000010000000000000000
trusted.afr.arbvol-client-2=0x000036070000000000000000
trusted.afr.dirty=0x000005c70000000000000000
trusted.gfid=0xbadce32eff854b928546c7fff5a63b30

Brick3 ( arbiter )
-------------------
# getfattr -d -m. -ehex /gluster/brick1/b1/FILE
getfattr: Removing leading '/' from absolute path names
# file: gluster/brick1/b1/FILE
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.arbvol-client-1=0x00015e040000000000000000
trusted.afr.dirty=0x000000010000000000000000
trusted.gfid=0xbadce32eff854b928546c7fff5a63b30
Comment 55 SATHEESARAN 2018-08-23 15:25:32 EDT
Tested with the RHGS 3.4.0 nightly build (glusterfs-3.12.2-16.el7rhgs) using the steps in comment 42. No issues found.
Comment 57 errata-xmlrpc 2018-09-04 02:29:44 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607
