Description of problem:
=======================
Able to perform IO to a file which is in split-brain.

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-server-3.7.5-12.el7rhgs.x86_64

How reproducible:

Steps to Reproduce:
===================
1. Create a 1x2 volume, attach a 1x2 hot tier to it, and mount it on a client using NFS.
2. From the mount, create a directory and create a file in it.
3. Bring down the bricks (both hot and cold) on node1, then append content to the file from the mount.
4. Bring the bricks back up using "gluster vol start force", then immediately bring down the bricks (both hot and cold) on the other node (node2). At this point the file got migrated from the hot tier to the cold tier.
5. Bring those bricks back up using "gluster vol start force". After this, node1 shows the actual file on both the hot and cold tiers, while node2 shows the link file on the cold tier and the actual file on the hot tier.
6. Run "gluster vol heal <vol>" and check the heal status using "gluster vol heal <vol> info". The file is shown in split-brain, yet IO to the file still succeeds.

Expected results:
=================
IO to the file should not be allowed while it is in split-brain.

Additional info:
================
[root@tettnang ~]# gluster vol info afr1x2_tier

Volume Name: afr1x2_tier
Type: Tier
Volume ID: 5d6db910-948c-484e-9672-0011ba3b7a09
Status: Started
Number of Bricks: 4
Transport-type: tcp
Hot Tier :
Hot Tier Type : Replicate
Number of Bricks: 1 x 2 = 2
Brick1: rhs-client18.lab.eng.blr.redhat.com:/rhs/brick6/afr1x2_tier_hot
Brick2: rhs-client19.lab.eng.blr.redhat.com:/rhs/brick6/afr1x2_tier_hot
Cold Tier:
Cold Tier Type : Replicate
Number of Bricks: 1 x 2 = 2
Brick3: rhs-client19.lab.eng.blr.redhat.com:/rhs/brick7/afr1x2_tier_cold
Brick4: rhs-client18.lab.eng.blr.redhat.com:/rhs/brick7/afr1x2_tier_cold
Options Reconfigured:
cluster.watermark-hi: 12
cluster.watermark-low: 10
performance.readdir-ahead: on
features.ctr-enabled: on
cluster.tier-mode: cache
cluster.self-heal-daemon: on

Getfattr information from 18
============================
[root@rhs-client18 split]# getfattr -d -m . -e hex /rhs/brick7/afr1x2_tier_cold/new/one.txt
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick7/afr1x2_tier_cold/new/one.txt
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.afr1x2_tier-client-0=0x000000000000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x02000000000000005677f468000a1741
trusted.gfid=0xd9e9a2300e464abe810113910ed25f87

[root@rhs-client18 split]# getfattr -d -m . -e hex /rhs/brick6/afr1x2_tier_hot/new/one.txt
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick6/afr1x2_tier_hot/new/one.txt
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.afr1x2_tier-client-2=0x000000010000000000000000
trusted.afr.dirty=0x000000030000000000000000
trusted.gfid=0xd9e9a2300e464abe810113910ed25f87
trusted.tier.tier-dht.linkto=0x6166723178325f746965722d636f6c642d64687400

Getfattr information from 19
============================
[root@rhs-client19 ~]# getfattr -d -m . -e hex /rhs/brick7/afr1x2_tier_cold/new/one.txt
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick7/afr1x2_tier_cold/new/one.txt
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.bit-rot.version=0x02000000000000005677f4f6000325d0
trusted.gfid=0xd9e9a2300e464abe810113910ed25f87

[root@rhs-client19 ~]# getfattr -d -m . -e hex /rhs/brick6/afr1x2_tier_hot/new/one.txt
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick6/afr1x2_tier_hot/new/one.txt
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.afr1x2_tier-client-3=0x000000010000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x03000000000000005677f3ad0007f386
trusted.gfid=0xd9e9a2300e464abe810113910ed25f87
trusted.tier.tier-dht.linkto=0x6166723178325f746965722d636f6c642d64687400

Mount
=====
[root@vertigo new]# cat one.txt
When all are up and running
Brick from 18 down & file is on hot tier
Brick from 19 down & file is on hot tier
Brick from 19 down & file is on hot tier
new one new
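The hex xattr values above are easier to read once decoded. The trusted.afr.* changelog value packs three big-endian 32-bit pending-operation counters (data, metadata, entry), and trusted.tier.tier-dht.linkto is a NUL-terminated subvolume name. A small standalone helper (not part of GlusterFS) to decode the values shown above:

```python
import struct

def decode_afr(hexval: str) -> dict:
    """Split a trusted.afr.* value into its three pending counters."""
    raw = bytes.fromhex(hexval[2:] if hexval.startswith("0x") else hexval)
    data, metadata, entry = struct.unpack(">III", raw)  # big-endian u32 x 3
    return {"data": data, "metadata": metadata, "entry": entry}

def decode_linkto(hexval: str) -> str:
    """Decode a NUL-terminated linkto xattr to the subvolume name."""
    raw = bytes.fromhex(hexval[2:] if hexval.startswith("0x") else hexval)
    return raw.rstrip(b"\x00").decode()

# Values taken from the client18 hot-tier brick above:
print(decode_afr("0x000000010000000000000000"))
# -> {'data': 1, 'metadata': 0, 'entry': 0}
print(decode_linkto("0x6166723178325f746965722d636f6c642d64687400"))
# -> afr1x2_tier-cold-dht
```

Decoded this way, each hot-tier brick records one pending data operation against the other replica, and the link file points at the afr1x2_tier-cold-dht subvolume.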
sosreport is available @ /home/repo/sosreports/bug.1293349 on rhsqe-repo.lab.eng.blr.redhat.com
Yes, the file on the hot tier is in split-brain state. On one node the cold tier contains the actual file, and on the other node it contains the link file.

[root@rhs-client18 ~]# gluster vol heal afr1x2_tier info split-brain
Brick rhs-client18.lab.eng.blr.redhat.com:/rhs/brick6/afr1x2_tier_hot
<gfid:d9e9a230-0e46-4abe-8101-13910ed25f87>
Number of entries in split-brain: 1

Brick rhs-client19.lab.eng.blr.redhat.com:/rhs/brick6/afr1x2_tier_hot
<gfid:d9e9a230-0e46-4abe-8101-13910ed25f87>
Number of entries in split-brain: 1

Brick rhs-client19.lab.eng.blr.redhat.com:/rhs/brick7/afr1x2_tier_cold
Number of entries in split-brain: 0

Brick rhs-client18.lab.eng.blr.redhat.com:/rhs/brick7/afr1x2_tier_cold
Number of entries in split-brain: 0
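The heal info output above names the split-brain entries by GFID rather than path. On the brick backend, every GFID also exists as a hard link under the brick's .glusterfs directory, bucketed by the first two pairs of hex digits, so such entries can be located directly. A sketch of that mapping (the helper name is mine, not a GlusterFS tool; the brick path is taken from the output above):

```python
import os

def gfid_backend_path(brick: str, gfid: str) -> str:
    """Map a <gfid:...> heal-info entry to its .glusterfs backend path."""
    g = gfid.strip("<>").replace("gfid:", "")
    # Layout: <brick>/.glusterfs/<first 2 hex chars>/<next 2>/<full gfid>
    return os.path.join(brick, ".glusterfs", g[:2], g[2:4], g)

print(gfid_backend_path(
    "/rhs/brick6/afr1x2_tier_hot",
    "<gfid:d9e9a230-0e46-4abe-8101-13910ed25f87>"))
# -> /rhs/brick6/afr1x2_tier_hot/.glusterfs/d9/e9/d9e9a230-0e46-4abe-8101-13910ed25f87
```

Running `stat` on that backend path (it is a hard link to the same inode) is a quick way to confirm which file a GFID-only heal entry refers to.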
"gluster vol heal afr1x2_tier info split-brain" shows the file in split-brain, and moreover the file is not getting promoted. In a replica volume the user expects both nodes to contain the same data; in this case the two nodes do not have the same data.
Rajesh, I think we are on the same page. The file that is shown in split-brain is a link file, not the data file. And yes, the file won't get promoted until the split-brain is resolved on the hot tier.

I didn't understand "In replica volume user expects both nodes should contain the same data in this case two nodes not having same data". Are you saying the two bricks which are in replication don't have the same data?

Pranith
Earlier I was seeing differences between the two bricks which are in replication (one brick contained the actual file and the other one contained the link file), but now both bricks have the same data.
Once a file is in split-brain, promotions will fail, which is expected. But here the files in split-brain are zero-size link files, so AFR could ignore such files while checking for split-brain.

[root@rhs-client18 tier]# gluster vol heal afr1x2_tier info split-brain
Brick rhs-client18.lab.eng.blr.redhat.com:/rhs/brick6/afr1x2_tier_hot
<gfid:d9e9a230-0e46-4abe-8101-13910ed25f87>
Number of entries in split-brain: 1

Brick rhs-client19.lab.eng.blr.redhat.com:/rhs/brick6/afr1x2_tier_hot
<gfid:d9e9a230-0e46-4abe-8101-13910ed25f87>
Number of entries in split-brain: 1

Brick rhs-client19.lab.eng.blr.redhat.com:/rhs/brick7/afr1x2_tier_cold
Number of entries in split-brain: 0

Brick rhs-client18.lab.eng.blr.redhat.com:/rhs/brick7/afr1x2_tier_cold
Number of entries in split-brain: 0

[root@rhs-client18 tier]# cd /rhs/brick6/afr1x2_tier_hot
[root@rhs-client18 afr1x2_tier_hot]# ls
big  new  split  test
[root@rhs-client18 afr1x2_tier_hot]# cd new/
[root@rhs-client18 new]# ls -lrth
total 4.0K
---------T. 2 root root 0 Dec 21 18:18 one.txt
[root@rhs-client18 new]# getfattr -d -m . -e hex one.txt
# file: one.txt
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.afr1x2_tier-client-2=0x000000010000000000000000
trusted.afr.dirty=0x000000030000000000000000
trusted.gfid=0xd9e9a2300e464abe810113910ed25f87
trusted.tier.tier-dht.linkto=0x6166723178325f746965722d636f6c642d64687400

[root@rhs-client19 ~]# cd /rhs/brick6/afr1x2_tier_hot/new
[root@rhs-client19 new]# getfattr -d -m . -e hex one.txt
# file: one.txt
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.afr1x2_tier-client-3=0x000000010000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x03000000000000005677f3ad0007f386
trusted.gfid=0xd9e9a2300e464abe810113910ed25f87
trusted.tier.tier-dht.linkto=0x6166723178325f746965722d636f6c642d64687400
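The suggestion above (ignore zero-size files when checking for split-brain) can be sketched in a few lines: a data split-brain exists when each replica's changelog xattr records a non-zero pending data count against the other brick, but if both copies are zero bytes there is nothing to choose between, so either can serve as the heal source. A minimal illustration under those assumptions (function names are hypothetical, not GlusterFS APIs):

```python
def data_pending(afr_hex: str) -> int:
    """First big-endian 32-bit counter of a trusted.afr.* value = pending data ops."""
    raw = bytes.fromhex(afr_hex[2:] if afr_hex.startswith("0x") else afr_hex)
    return int.from_bytes(raw[0:4], "big")

def is_data_split_brain(xattr_on_b0: str, xattr_on_b1: str) -> bool:
    # Each brick blames the other: both pending data counters are non-zero.
    return data_pending(xattr_on_b0) > 0 and data_pending(xattr_on_b1) > 0

def can_auto_resolve(split_brain: bool, size_b0: int, size_b1: int) -> bool:
    # Proposed enhancement: if both copies are zero bytes, either one is a
    # valid source, so the split-brain can be cleared automatically.
    return split_brain and size_b0 == 0 and size_b1 == 0

# Values from the two hot-tier bricks above; both copies are 0-byte link files.
b0 = "0x000000010000000000000000"  # client18 brick blames client-2
b1 = "0x000000010000000000000000"  # client19 brick blames client-3
sb = is_data_split_brain(b0, b1)
print(sb, can_auto_resolve(sb, 0, 0))  # -> True True
```

This is only a model of the decision, not the actual AFR code path; the real fix landed in the upstream patches referenced later in this bug.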
We can take that as an enhancement: if the files are in data split-brain and both files have zero size, the split-brain will be resolved automatically. Could you change the bug description to reflect the same?

Pranith
Upstream patch https://review.gluster.org/#/c/18283
(In reply to Ravishankar N from comment #14)
> Upstream patch https://review.gluster.org/#/c/18283

There is also a follow-up patch: https://review.gluster.org/#/c/18391/ (so 2 patches in total for this bug). Note that the fixes were sent as part of fixing BZ 1482812.
Update:
=======
Build Used: glusterfs-3.12.2-7.el7rhgs.x86_64

Scenario 1:
1) Create a 1x2 volume and start it
2) Disable the self-heal daemon
3) Write a file (file1) with some content
4) Kill b0
5) Truncate the file to 0
6) Bring b0 up
7) Kill b1
8) Truncate the file to 0
9) Bring b1 up
10) Enable the self-heal daemon
11) Check heal info
12) Read the file from the client

> After enabling the self-heal daemon, below is the heal info:

# gluster vol heal 12 info
Brick 10.70.35.61:/bricks/brick1/b0
/file1
/file_sb - Is in split-brain
Status: Connected
Number of entries: 2

Brick 10.70.35.174:/bricks/brick1/b1
<gfid:9362fdc1-26fa-47b7-b4e6-e066679fbf35>
<gfid:2d343643-19c0-4dda-b37f-1e995a3d1c9d> - Is in split-brain
Status: Connected
Number of entries: 2
#

From node 1:
# getfattr -d -m . -e hex /bricks/brick1/b0/file1
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/b0/file1
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.12-client-1=0x000000010000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0x9362fdc126fa47b7b4e6e066679fbf35
trusted.gfid2path.d16e15bafe6e4256=0x30303030303030302d303030302d303030302d303030302d3030303030303030303030312f66696c6531

From node 2:
# getfattr -d -m . -e hex /bricks/brick1/b1/file1
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/b1/file1
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.12-client-0=0x000000010000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0x9362fdc126fa47b7b4e6e066679fbf35
trusted.gfid2path.d16e15bafe6e4256=0x30303030303030302d303030302d303030302d303030302d3030303030303030303030312f66696c6531

If we observe the above for file1, even though the xattrs blame each other's brick, the file is not shown as split-brain in heal info, which is expected, and file1 was healed after a few minutes.
Scenario 2: validated with a meta-data split-brain having the same meta-data on all the bricks (file name: file_meta1).

> After enabling the self-heal daemon, below is the heal info:

# getfattr -d -m . -e hex /bricks/brick1/b0/file_meta1
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/b0/file_meta1
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.12-client-1=0x000000000000000100000000
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0xd9481da807e34544bb389bf1763b4d91
trusted.gfid2path.76538e835da1a595=0x30303030303030302d303030302d303030302d303030302d3030303030303030303030312f66696c655f6d65746131
#
# getfattr -d -m . -e hex /bricks/brick1/b1/file_meta1
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/b1/file_meta1
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.12-client-0=0x000000000000000100000000
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0xd9481da807e34544bb389bf1763b4d91
trusted.gfid2path.76538e835da1a595=0x30303030303030302d303030302d303030302d303030302d3030303030303030303030312f66696c655f6d65746131
#
# date;gluster vol heal 12 info
Fri Apr 20 04:56:29 EDT 2018
Brick 10.70.35.61:/bricks/brick1/b0
/file_meta1
/file_meta2 - Is in split-brain
Status: Connected
Number of entries: 2

Brick 10.70.35.174:/bricks/brick1/b1
<gfid:d9481da8-07e3-4544-bb38-9bf1763b4d91>
<gfid:86433c76-d1ed-4be2-a7dd-a80d0ab3e80e> - Is in split-brain
Status: Connected
Number of entries: 2
#

> file_meta1 healed after a few minutes:

[root@dhcp35-163 ~]# date;gluster vol heal 12 info
Fri Apr 20 05:48:20 EDT 2018
Brick 10.70.35.61:/bricks/brick1/b0
/file_meta2 - Is in split-brain
Status: Connected
Number of entries: 1

Brick 10.70.35.174:/bricks/brick1/b1
/file_meta2 - Is in split-brain
Status: Connected
Number of entries: 1
[root@dhcp35-163 ~]#

Changing status to verified.
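The two scenarios differ only in which counter of the trusted.afr value carries the blame: Scenario 1's file1 has it in the first 32-bit field (data), while Scenario 2's file_meta1 has it in the second (metadata). A small helper to classify which kind of split-brain a changelog value indicates (the helper is mine, written for illustration against the xattr values shown above):

```python
def classify(afr_hex: str) -> str:
    """Name the non-zero pending counters of a trusted.afr.* changelog value."""
    raw = bytes.fromhex(afr_hex[2:] if afr_hex.startswith("0x") else afr_hex)
    data = int.from_bytes(raw[0:4], "big")
    meta = int.from_bytes(raw[4:8], "big")
    entry = int.from_bytes(raw[8:12], "big")
    kinds = [k for k, v in (("data", data), ("metadata", meta), ("entry", entry)) if v]
    return "+".join(kinds) or "clean"

print(classify("0x000000010000000000000000"))  # Scenario 1, file1     -> data
print(classify("0x000000000000000100000000"))  # Scenario 2, file_meta1 -> metadata
```

Applied to the verification output, this confirms the fix was exercised for both data and metadata split-brains where the replicas' contents are effectively identical.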
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2607