Bug 1759875 - afr: support split-brain CLI for replica 3
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: replicate
Version: rhgs-3.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 3.5.z Batch Update 3
Assignee: Ravishankar N
QA Contact: Arthy Loganathan
URL:
Whiteboard:
Duplicates: 1901154
Depends On: 1756938 1760791 1760792
Blocks:
Reported: 2019-10-09 10:02 UTC by Ravishankar N
Modified: 2024-03-25 15:27 UTC (History)

Fixed In Version: glusterfs-6.0-38
Doc Type: Enhancement
Doc Text:
This enhancement provides CLI-based split-brain resolution for replica 3 volumes. With this update, storage administrators can resolve split-brain from the CLI on replica 3 volumes; previously this was available only for replica 2 volumes.
Clone Of: 1756938
Environment:
Last Closed: 2020-12-17 04:50:17 UTC
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:5603 0 None None None 2020-12-17 04:50:33 UTC

Description Ravishankar N 2019-10-09 10:02:57 UTC
+++ This bug was initially created as a clone of Bug #1756938 +++

Description of problem:

http://post-office.corp.redhat.com/archives/gluster-tech-list/2019-September/msg00137.html

I want to propose this for rhgs-3.5.1, as it would be really valuable for GSS to be able to use the existing CLI commands to fix corner-case split-brains even in replica 3.

--------------------------------------------------------------------------------
Ever since we added quorum checks for lookups in afr via commit
bd44d59741bb8c0f5d7a62c5b1094179dd0ce8a4, the split-brain resolution
commands would not work for replica 3 because there would be no
readables for the lookup fop.

The argument was that split-brains do not occur in replica 3, but we do
see (data/metadata) split-brain cases once in a while, which indicates
that there are a few bugs/corner cases yet to be discovered and fixed.

Fortunately, commit 8016d51a3bbd410b0b927ed66be50a09574b7982 added
GF_CLIENT_PID_GLFS_HEALD as the pid for all fops made by glfsheal. If we
leverage this and allow lookups when pid is GF_CLIENT_PID_GLFS_HEALD,
split-brain resolution commands will work for replica 3 volumes too.

Attempting a patch which does this.

--------------------------------------------------------------------------------
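
The gating idea above can be modelled in a few lines. This is only an illustrative Python sketch, not the AFR code (which is C): the `LookupRequest` type, the `lookup_allowed` function, and the numeric value given to GF_CLIENT_PID_GLFS_HEALD are all hypothetical and chosen just to show the decision.

```python
from dataclasses import dataclass, field
from typing import List

# Placeholder value for illustration only. The real constant lives in the
# GlusterFS C headers; its numeric value is not taken from this report.
GF_CLIENT_PID_GLFS_HEALD = -1000


@dataclass
class LookupRequest:
    """Hypothetical stand-in for a lookup fop arriving at AFR."""
    pid: int                                             # pid stamped on the fop
    readables: List[int] = field(default_factory=list)   # bricks usable as read sources


def lookup_allowed(req: LookupRequest) -> bool:
    """Model of the proposed check: glfsheal lookups bypass the readables test."""
    if req.pid == GF_CLIENT_PID_GLFS_HEALD:
        # glfsheal must be able to look up a split-brained file to resolve it.
        return True
    # Everyone else still needs at least one readable brick.
    return len(req.readables) > 0


# A split-brained file has no readables, so a normal client lookup fails
# while the same lookup issued by glfsheal goes through.
print(lookup_allowed(LookupRequest(pid=4242)))                      # False
print(lookup_allowed(LookupRequest(pid=GF_CLIENT_PID_GLFS_HEALD)))  # True
```

In this model, ordinary clients keep the stricter behaviour and only requests carrying the glfsheal pid are exempted, which is the narrow exception the description proposes.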

Comment 8 Arthy Loganathan 2020-11-05 07:12:47 UTC
[root@dhcp47-141 ~]# gluster vol info
 
Volume Name: testvol_replicated
Type: Replicate
Volume ID: 5a182a14-bf47-42c1-809a-973bf1e133cc
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.47.141:/bricks/brick0/testvol_replicated_brick0
Brick2: 10.70.47.41:/bricks/brick0/testvol_replicated_brick1
Brick3: 10.70.47.178:/bricks/brick0/testvol_replicated_brick2
Options Reconfigured:
cluster.quorum-type: none
cluster.self-heal-daemon: off
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off


========================================

[root@dhcp47-141 ~]# gluster volume heal testvol_replicated split-brain source-brick 10.70.47.141:/bricks/brick0/testvol_replicated_brick0 /file1
Healed /file1.


[root@dhcp47-141 ~]# gluster volume heal testvol_replicated info split-brain
Brick 10.70.47.141:/bricks/brick0/testvol_replicated_brick0
/file2
<gfid:cbd90af1-ff06-4cbf-9e03-369b905c1fa8>
<gfid:814d5f4a-1ba9-42b6-91a5-dc5c2e8a27b0>
<gfid:17dc170d-5eb3-480f-bcac-59e937317066>
Status: Connected
Number of entries in split-brain: 4

Brick 10.70.47.41:/bricks/brick0/testvol_replicated_brick1
/file2
/file4
/file5
/dir
Status: Connected
Number of entries in split-brain: 4

Brick 10.70.47.178:/bricks/brick0/testvol_replicated_brick2
/file2
<gfid:cbd90af1-ff06-4cbf-9e03-369b905c1fa8>
<gfid:814d5f4a-1ba9-42b6-91a5-dc5c2e8a27b0>
<gfid:17dc170d-5eb3-480f-bcac-59e937317066>
Status: Connected
Number of entries in split-brain: 4

[root@dhcp47-141 ~]# gluster volume heal testvol_replicated split-brain source-brick 10.70.47.141:/bricks/brick0/testvol_replicated_brick0 gfid:cbd90af1-ff06-4cbf-9e03-369b905c1fa8
Healed gfid:cbd90af1-ff06-4cbf-9e03-369b905c1fa8.
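
When many entries are listed, the commands above can be generated rather than typed. The following is a rough Python sketch under stated assumptions: it assumes the `info split-brain` output has the layout shown above, the helper names `parse_split_brain_entries` and `build_heal_command` are made up for illustration, and it only builds the argument lists without running anything. Which brick to use as the source remains a decision for the administrator, per entry.

```python
from typing import List


def parse_split_brain_entries(heal_info_output: str) -> List[str]:
    """Collect unique entries from 'heal <vol> info split-brain' output.

    Assumes entries are either paths starting with '/' or '<gfid:...>' lines,
    as in the transcript above. Brick, Status and count lines are skipped.
    """
    entries: List[str] = []
    for raw in heal_info_output.splitlines():
        line = raw.strip()
        if line.startswith("/"):
            entry = line
        elif line.startswith("<gfid:") and line.endswith(">"):
            entry = line[1:-1]  # the heal command takes gfid:... without brackets
        else:
            continue
        if entry not in entries:  # the same entry is reported by several bricks
            entries.append(entry)
    return entries


def build_heal_command(volume: str, source_brick: str, entry: str) -> List[str]:
    """Build the argv for the source-brick form of the command used above."""
    return ["gluster", "volume", "heal", volume,
            "split-brain", "source-brick", source_brick, entry]


sample = """Brick 10.70.47.141:/bricks/brick0/testvol_replicated_brick0
/file2
<gfid:cbd90af1-ff06-4cbf-9e03-369b905c1fa8>
Status: Connected
Number of entries in split-brain: 2
"""
for e in parse_split_brain_entries(sample):
    print(" ".join(build_heal_command(
        "testvol_replicated",
        "10.70.47.141:/bricks/brick0/testvol_replicated_brick0", e)))
```

The resulting lists could be passed to `subprocess.run` on a node of the cluster; that step is deliberately left out here so the sketch has no side effects.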

Verified the fix in:
glusterfs-server-6.0-46.el8rhgs.x86_64
glusterfs-server-6.0-46.el7rhgs.x86_64

Comment 12 errata-xmlrpc 2020-12-17 04:50:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (glusterfs bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:5603

Comment 13 Karthik U S 2020-12-18 05:14:25 UTC
*** Bug 1901154 has been marked as a duplicate of this bug. ***

