Bug 1610743

Summary: Directory is incorrectly reported as in split-brain when dirty marking is there
Product: Red Hat Gluster Storage
Reporter: Vijay Avuthu <vavuthu>
Component: replicate
Assignee: Ravishankar N <ravishankar>
Status: CLOSED ERRATA
QA Contact: Vijay Avuthu <vavuthu>
Severity: medium
Docs Contact:
Priority: unspecified
Version: rhgs-3.4
CC: anepatel, apaladug, chpai, ravishankar, rhs-bugs, sanandpa, sankarshan, sheggodu, storage-qa-internal, vdas
Target Milestone: ---
Keywords: ZStream
Target Release: RHGS 3.4.z Batch Update 1
Hardware: Unspecified
OS: Unspecified
Fixed In Version: glusterfs-3.12.2-20
Doc Type: Bug Fix
Doc Text:
Previously, when directories had dirty markers set on them, either because of AFR transaction failures or because a replace-brick or reset-brick operation was performed, heal-info reported them as being in split-brain. With this fix, heal-info no longer treats the presence of dirty markers as an indication of split-brain and does not display these entries as being in split-brain.
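
The distinction between a dirty marker and a split-brain can be illustrated with a small Python sketch. It is a simplified model written for this report, not the AFR implementation: `classify`, the `pending` map (which brick blames which) and the `dirty` map are hypothetical names, and real AFR tracks data, metadata and entry changelogs separately. The model assumes an entry is in split-brain only when every brick is blamed by some other brick, so that no clean source remains.

```python
from typing import Dict


def classify(pending: Dict[str, Dict[str, int]], dirty: Dict[str, int]) -> str:
    """Classify an entry as 'split-brain', 'needs-heal' or 'clean'.

    pending[a][b] > 0 means brick a holds a pending changelog blaming brick b.
    dirty[a] > 0 means brick a has the dirty marker set for the entry.
    Simplified model: a single changelog type, hypothetical data layout.
    """
    # Every brick that is accused by at least one *other* brick.
    blamed = {
        accused
        for accuser, row in pending.items()
        for accused, count in row.items()
        if count > 0 and accused != accuser
    }
    bricks = set(pending) | set(dirty) | blamed
    sources = bricks - blamed  # bricks that nobody blames

    if bricks and not sources:
        return "split-brain"  # no brick can act as a heal source
    if blamed or any(count > 0 for count in dirty.values()):
        return "needs-heal"  # dirty alone never implies split-brain
    return "clean"
```

In this model a directory with only a dirty marker and no cross-blaming changelogs is classified as `needs-heal`, which matches the behaviour described for the fixed build.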
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-10-31 08:46:14 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments (Description, Flags):
gluster-health-report, none

Description Vijay Avuthu 2018-08-01 11:19:35 UTC
Created attachment 1472059

Description of problem:

Split-brain was observed on the parent directory while verifying bug 1566336.

Version-Release number of selected component (if applicable):

Build used: glusterfs-3.12.2-15.el7rhgs.x86_64

How reproducible: Always

Steps to Reproduce:

1) Create a 1 x 3 replicate volume and start it.
2) Disable all client-side heals and create a directory from the client.
3) Fill two of the bricks (b1 and b2) from the back end.
4) From the mount point, create a file inside the directory. The create should fail with "No Space", but the name entry is still created on b0.
5) Check heal info; it should list the above file.
6) Check the changelogs of the parent directory; the dirty bit should be set.
7) Free up space on b1 and b2 by removing the previously created files from the back end.
8) Trigger a heal; the file created in step 4 should be healed.
9) The dirty bit should be cleared from the directory.
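
For steps 6 and 9, the changelog values printed by `getfattr -e hex` can be decoded with a short Python sketch. It assumes the usual AFR changelog layout of three big-endian 32-bit counters in the order data, metadata, entry; the function name `decode_afr_changelog` and the sample value in the usage note are illustrative and are not taken from this bug's output.

```python
import struct
from typing import Dict


def decode_afr_changelog(hex_value: str) -> Dict[str, int]:
    """Decode a hex AFR changelog value into its three counters.

    Assumes a 12-byte value: data, metadata and entry counters,
    each a big-endian unsigned 32-bit integer.
    """
    text = hex_value.strip()
    if text.lower().startswith("0x"):
        text = text[2:]
    raw = bytes.fromhex(text)
    if len(raw) != 12:
        raise ValueError(f"expected 12 bytes, got {len(raw)}")
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


def is_dirty(hex_value: str) -> bool:
    """Return True if any counter in the changelog value is non-zero."""
    return any(decode_afr_changelog(hex_value).values())
```

As a usage example, a hypothetical value such as `0x000000000000000000000001` decodes to an entry counter of 1 with data and metadata at 0, which is the shape one would expect on the parent directory at step 6; an all-zero value corresponds to the cleared state expected at step 9.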

Actual results:

At step 5, observed split-brain on parent dir

Expected results:

parent dir shouldn't be in split-brain

Additional info:

# gluster vol heal 13 info 
Brick rhsauto025.lab.eng.blr.redhat.com:/bricks/brick0/b0
/test - Is in split-brain

Status: Connected
Number of entries: 2

Brick rhsauto024.lab.eng.blr.redhat.com:/bricks/brick0/b1
Status: Connected
Number of entries: 0

Brick rhsauto026.lab.eng.blr.redhat.com:/bricks/brick0/b2
Status: Connected
Number of entries: 0
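
When repeating this check across builds, the heal info output above can be scanned for split-brain entries with a small Python sketch. It relies only on the line shapes visible in this report (`Brick ...`, `Status: ...`, `Number of entries: ...`, and the ` - Is in split-brain` suffix); the function name `parse_heal_info` and the result layout are hypothetical, and other gluster versions may print additional line types that this sketch does not handle.

```python
from typing import Dict, Optional

SPLIT_BRAIN_SUFFIX = " - Is in split-brain"


def parse_heal_info(text: str) -> Dict[str, dict]:
    """Parse 'gluster vol heal <vol> info' output into a per-brick dict."""
    result: Dict[str, dict] = {}
    current: Optional[str] = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("Brick "):
            current = line[len("Brick "):]
            result[current] = {
                "status": None,
                "count": None,
                "entries": [],
                "split_brain": [],
            }
        elif current is None:
            continue  # ignore anything before the first brick header
        elif line.startswith("Status:"):
            result[current]["status"] = line.split(":", 1)[1].strip()
        elif line.startswith("Number of entries:"):
            value = line.split(":", 1)[1].strip()
            # The count may be non-numeric (for example when a brick is down).
            result[current]["count"] = int(value) if value.isdigit() else None
        else:
            path = line
            if line.endswith(SPLIT_BRAIN_SUFFIX):
                path = line[: -len(SPLIT_BRAIN_SUFFIX)]
                result[current]["split_brain"].append(path)
            result[current]["entries"].append(path)
    return result
```

Applied to the output above, the parser would report `/test` under `split_brain` for brick b0 and empty lists for b1 and b2, so a verification script could simply assert that every `split_brain` list is empty.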

# getfattr -d -m . -e hex /bricks/brick0/b0/test/
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick0/b0/test/

# getfattr -d -m . -e hex /bricks/brick0/b0/test/test1 
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick0/b0/test/test1

# gluster vol info 13
Volume Name: 13
Type: Replicate
Volume ID: 620301ee-9a31-4320-85cd-1beedcd93cdf
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Brick1: rhsauto025.lab.eng.blr.redhat.com:/bricks/brick0/b0
Brick2: rhsauto024.lab.eng.blr.redhat.com:/bricks/brick0/b1
Brick3: rhsauto026.lab.eng.blr.redhat.com:/bricks/brick0/b2
Options Reconfigured:
cluster.entry-self-heal: off
cluster.metadata-self-heal: off
cluster.data-self-heal: off
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

SOS Report:


Comment 5 Ravishankar N 2018-09-10 09:21:20 UTC
Attempted an upstream fix via https://review.gluster.org/21135 (BZ 1626994).

Comment 10 Anees Patel 2018-10-05 12:57:27 UTC
Verified the fix, see below.

Build used glusterfs-3.12.2-21.el7rhgs.x86_64

At step 5 of the bug description, heal info reports no split-brain.

# gluster vol heal replicate_bug info
Status: Connected
Number of entries: 0

Status: Connected
Number of entries: 2

Status: Connected
Number of entries: 0

The dirty bit is also visible at step 6, as expected.

# getfattr -d -m . -e hex /bricks/brick3/day4/dir1
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick3/day4/dir1

# getfattr -d -m . -e hex /bricks/brick3/day4/dir1/300mbfile
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick3/day4/dir1/300mbfile

Moving it to verified

Comment 12 Ravishankar N 2018-10-11 05:38:28 UTC
Looks good to me.

Comment 14 errata-xmlrpc 2018-10-31 08:46:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.