Bug 1384983
| Summary: | split-brain observed with arbiter & replica 3 volume. | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | RamaKasturi <knarra> |
| Component: | arbiter | Assignee: | Ravishankar N <ravishankar> |
| Status: | CLOSED ERRATA | QA Contact: | SATHEESARAN <sasundar> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | rhgs-3.2 | CC: | amukherj, bkunal, ksubrahm, pkarampu, psony, ravishankar, rcyriac, rhinduja, rhs-bugs, sasundar, ssaha, storage-qa-internal, vavuthu |
| Target Milestone: | --- | | |
| Target Release: | RHGS 3.4.0 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-09-04 06:29:44 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1506140, 1539358, 1541458, 1542380, 1542382, 1597120, 1597123 | | |
| Bug Blocks: | 1503134 | | |
Description RamaKasturi 2016-10-14 13:43:54 UTC
Files in split-brain
====================

```
[root@rhsqa-grafton2 ~]# gluster volume heal engine info split-brain
Brick 10.70.36.79:/rhgs/brick1/engine
/__DIRECT_IO_TEST__
Status: Connected
Number of entries in split-brain: 1

Brick 10.70.36.80:/rhgs/brick1/engine
/53c84f1e-3643-45aa-805e-8c9e92ee3098/ha_agent
/__DIRECT_IO_TEST__
Status: Connected
Number of entries in split-brain: 2

Brick 10.70.36.81:/rhgs/brick1/engine
/__DIRECT_IO_TEST__
/53c84f1e-3643-45aa-805e-8c9e92ee3098/ha_agent
Status: Connected
Number of entries in split-brain: 2
```

getfattrs on the files which are in split-brain
===============================================

```
[root@rhsqa-grafton1 ~]# getfattr -d -m . -e hex /rhgs/brick1/engine/__DIRECT_IO_TEST__
getfattr: Removing leading '/' from absolute path names
# file: rhgs/brick1/engine/__DIRECT_IO_TEST__
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-1=0x0000000000000b5e00000000
trusted.afr.engine-client-2=0x000000000000000000000000
trusted.gfid=0x9202d90daed441a69b7538d4d6eae1b1
trusted.glusterfs.shard.block-size=0x0000000020000000
trusted.glusterfs.shard.file-size=0x0000000000000000000000000000000000000000000000000000000000000000

[root@rhsqa-grafton2 ~]# getfattr -d -m . -e hex /rhgs/brick1/engine/53c84f1e-3643-45aa-805e-8c9e92ee3098/ha_agent
getfattr: Removing leading '/' from absolute path names
# file: rhgs/brick1/engine/53c84f1e-3643-45aa-805e-8c9e92ee3098/ha_agent
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000001
trusted.afr.engine-client-0=0x0000000500000000000000e5
trusted.gfid=0x4da13f61cc0b4d46ae303f2676866f06
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

[root@rhsqa-grafton2 ~]# getfattr -d -m . -e hex /rhgs/brick1/engine/__DIRECT_IO_TEST__
getfattr: Removing leading '/' from absolute path names
# file: rhgs/brick1/engine/__DIRECT_IO_TEST__
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-0=0x000000000000000200000000
trusted.afr.engine-client-2=0x000000000000000100000000
trusted.gfid=0x9202d90daed441a69b7538d4d6eae1b1
trusted.glusterfs.shard.block-size=0x0000000020000000
trusted.glusterfs.shard.file-size=0x0000000000000000000000000000000000000000000000000000000000000000

[root@rhsqa-grafton3 ~]# getfattr -d -m . -e hex /rhgs/brick1/engine/53c84f1e-3643-45aa-805e-8c9e92ee3098/ha_agent
getfattr: Removing leading '/' from absolute path names
# file: rhgs/brick1/engine/53c84f1e-3643-45aa-805e-8c9e92ee3098/ha_agent
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x0000000000000000000014f3
trusted.afr.engine-client-0=0x0000000500000000000000b1
trusted.afr.engine-client-1=0x000000000000000000000000
trusted.gfid=0x4da13f61cc0b4d46ae303f2676866f06
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

[root@rhsqa-grafton3 ~]# getfattr -d -m . -e hex /rhgs/brick1/engine/__DIRECT_IO_TEST__
getfattr: Removing leading '/' from absolute path names
# file: rhgs/brick1/engine/__DIRECT_IO_TEST__
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-0=0x000000000000000000000000
trusted.afr.engine-client-1=0x0000000000000b5e00000000
trusted.gfid=0x9202d90daed441a69b7538d4d6eae1b1
trusted.glusterfs.shard.block-size=0x0000000020000000
trusted.glusterfs.shard.file-size=0x0000000000000000000000000000000000000000000000000000000000000000
```
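For reading the outputs above: each trusted.afr.<volume>-client-<N> value is an AFR changelog, a 12-byte value packing three big-endian 32-bit counters of pending data, metadata, and entry operations that the brick holding the xattr believes brick N has missed. A minimal bash sketch for decoding one of these values, assuming that standard AFR layout (the sample values are copied from the outputs above):

```
# Decode an AFR changelog value (as printed by "getfattr -e hex") into its
# three big-endian 32-bit counters: pending data, metadata and entry ops.
decode_afr() {
    local hex=${1#0x}    # strip the 0x prefix
    printf 'data=%d metadata=%d entry=%d\n' \
        "$((16#${hex:0:8}))" "$((16#${hex:8:8}))" "$((16#${hex:16:8}))"
}

decode_afr 0x0000000000000b5e00000000   # brick1's entry for engine-client-1
decode_afr 0x000000000000000200000000   # brick2's entry for engine-client-0
```

For __DIRECT_IO_TEST__ this prints data=0 metadata=2910 entry=0 and data=0 metadata=2 entry=0 respectively: each data brick holds a non-zero pending count against the other, so self-heal cannot pick either as the source, which is exactly the split-brain that heal info reports.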
sosreports can be found in the link below:
==========================================
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/HC/split_brain/

Volume info for the engine volume:
==================================

```
[root@rhsqa-grafton1 ~]# gluster volume info engine

Volume Name: engine
Type: Replicate
Volume ID: 03c68517-4be1-45e3-b788-87e10d73f3ee
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.36.79:/rhgs/brick1/engine
Brick2: 10.70.36.80:/rhgs/brick1/engine
Brick3: 10.70.36.81:/rhgs/brick1/engine (arbiter)
Options Reconfigured:
server.ssl: on
client.ssl: on
auth.ssl-allow: 10.70.36.79,10.70.36.80,10.70.36.81
performance.strict-o-direct: on
user.cifs: off
network.ping-timeout: 30
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
performance.low-prio-threads: 32
features.shard-block-size: 512MB
features.shard: on
storage.owner-gid: 36
storage.owner-uid: 36
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: off
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
cluster.granular-entry-heal: on
```

Resetting all the acks which were previously present on this bug.

I could consistently hit this issue with an arbiter volume on RHGS 3.3.0 (interim build, glusterfs-3.8.4-28.el7rhgs), along with the related issue of the arbiter becoming the source of heal (BZ 1401969). A very simple test is to:

1. Create an arbiter volume 1x (2+1) with bricks brick1, brick2, arbiter
2. Fuse mount it on any RHEL 7 client
3. Run some app (dd, truncate, etc.) on a single file
4. Kill brick2 (one way to kill a specific brick is sketched after this list)
5. Sleep for 3 seconds
6. Bring up brick2, sleep for 3 seconds, kill the arbiter
7. Sleep for 3 seconds
8. Bring up the arbiter, sleep for 3 seconds, kill brick1
9. Sleep for 3 seconds
10. Continue with step 4

When the above steps are repeated, I ended up either in a split-brain or with the arbiter becoming the source of heal (BZ 1401969).
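A sketch of one way to implement the "kill brick2" step: look up the brick's PID from gluster volume status and kill it. The volume name and brick path below are examples taken from this report, and the awk parse assumes the PID is the last column of the matching status line:

```
# Kill a single brick process by looking up its PID in "gluster volume status".
# VOLNAME and BRICK are example values; adjust for the volume under test.
VOLNAME=engine
BRICK=10.70.36.80:/rhgs/brick1/engine
PID=$(gluster volume status "$VOLNAME" | awk -v b="$BRICK" '$0 ~ b { print $NF }')
kill -KILL "$PID"
```

The killed brick can then be brought back with `gluster volume start "$VOLNAME" force`, which respawns only the brick processes that are not running.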
The additional information I have is that I am able to hit the split-brain issue with a replica 3 volume as well. Here are the steps to reproduce.

Setup details
-------------
1. 3-node Gluster cluster (node1, node2, node3)
2. Create a replica 3 volume
3. Mount it on node1

Steps
-----
Two scripts are run in parallel to reproduce this issue. Script1 kills and then starts bricks in a cyclic fashion across all the bricks, such that there are always 2 bricks alive at any instant:

```
while true; do
    kill node2-brick2
    kill node2-glusterd
    sleep 3
    start node2-glusterd   # This also starts the brick on this node
    sleep 1
    kill node3-brick3
    kill node3-glusterd
    sleep 3
    start node3-glusterd   # This also starts the brick on this node
    sleep 1
    kill node1-brick1
    kill node1-glusterd
    sleep 3
    start node1-glusterd   # This also starts the brick on this node
    sleep 1
done
```

Script2 does I/O on the fuse mount while the bricks are killed and started by script1:

```
MOUNTPATH=/mnt/test
while true; do
    echo "dd if=/dev/urandom of=$MOUNTPATH/FILE bs=128k count=10" >> /var/log/glusterfs/mnt-test.log
    dd if=/dev/urandom of=$MOUNTPATH/FILE bs=128k count=10
    echo "truncate $MOUNTPATH/FILE --size 5K" >> /var/log/glusterfs/mnt-test.log
    truncate $MOUNTPATH/FILE --size 5K
    echo "cat /home/template > $MOUNTPATH/FILE" >> /var/log/glusterfs/mnt-test.log
    cat /home/template > $MOUNTPATH/FILE
    echo "truncate $MOUNTPATH/FILE --size 100k" >> /var/log/glusterfs/mnt-test.log
    truncate $MOUNTPATH/FILE --size 100k
done
```

When I ran the above scripts with replica 3, I could still see the file on the fuse mount in a split-brain state. I could hit this split-brain issue very consistently with the scripts described in comment 19.

Tested with Ravi's fix and also with cluster.eager-lock=off, but I still ended up in the split-brain scenario. Here are the changelogs from all the bricks:

Brick1
------
```
# getfattr -d -m. -ehex /gluster/brick1/b1/FILE
getfattr: Removing leading '/' from absolute path names
# file: gluster/brick1/b1/FILE
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.arbvol-client-1=0x00006ba80000000000000000
trusted.afr.arbvol-client-2=0x000000010000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0xbadce32eff854b928546c7fff5a63b30
```

Brick2
------
```
# getfattr -d -m. -ehex /gluster/brick1/b1/FILE
getfattr: Removing leading '/' from absolute path names
# file: gluster/brick1/b1/FILE
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.arbvol-client-0=0x000000010000000000000000
trusted.afr.arbvol-client-2=0x000036070000000000000000
trusted.afr.dirty=0x000005c70000000000000000
trusted.gfid=0xbadce32eff854b928546c7fff5a63b30
```

Brick3 (arbiter)
----------------
```
# getfattr -d -m. -ehex /gluster/brick1/b1/FILE
getfattr: Removing leading '/' from absolute path names
# file: gluster/brick1/b1/FILE
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.arbvol-client-1=0x00015e040000000000000000
trusted.afr.dirty=0x000000010000000000000000
trusted.gfid=0xbadce32eff854b928546c7fff5a63b30
```

Tested with the RHGS 3.4.0 nightly build (glusterfs-3.12.2-16.el7rhgs) with the steps in comment 42. No issues found.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607