Bug 1332949

Summary: Heal info shows split-brain for .shard directory though only one brick was down
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Bhaskarakiran <byarlaga>
Component: replicate Assignee: Anuradha <atalur>
Status: CLOSED ERRATA QA Contact: Bhaskarakiran <byarlaga>
Severity: unspecified Docs Contact:
Priority: high    
Version: rhgs-3.1 CC: asrivast, byarlaga, mzywusko, pkarampu, ravishankar, rcyriac, rhinduja, rhs-bugs, sabose, sasundar, smohan, storage-qa-internal
Target Milestone: --- Keywords: ZStream
Target Release: RHGS 3.1.3   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.7.9-5 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1335652 Environment:
Last Closed: 2016-06-23 05:21:02 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1258386, 1311817, 1335652, 1335829, 1335836    

Description Bhaskarakiran 2016-05-04 12:12:21 UTC
Description of problem:
----------------------

In a replica 3 distributed-replicate volume, one of the bricks was brought down while I/O continued on the VMs in a ROBO environment. Heal info then reported the .shard directory as being in split-brain, even though only one brick was down.

[root@rhsqa13 .shard]# gluster v heal vmstore info
Brick dhcp43-201.lab.eng.blr.redhat.com:/rhgs/vmstore/vms
/e1f49cf7-3857-4e82-92a3-1a50b365ed94/images/315d6a42-9bd9-4aa4-8df2-52d1561fd379/448e6cf8-c14f-4ce4-9035-746be8eaea80 
/.shard/d9787c75-c5de-4e25-9108-1887e341f8e5.20 
/__DIRECT_IO_TEST__ 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.48 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.14 
/e1f49cf7-3857-4e82-92a3-1a50b365ed94/images/8cb57f7b-e77b-4e9a-974a-df34ad7694dd/c054f6bc-c189-4de4-9bee-91b3f5fc5b6f 
/e1f49cf7-3857-4e82-92a3-1a50b365ed94/dom_md/ids 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.12 
/e1f49cf7-3857-4e82-92a3-1a50b365ed94/master/tasks 
/.shard/7e6c25fd-b370-4309-be3c-46f74cb735a9.199 
/.shard/d9787c75-c5de-4e25-9108-1887e341f8e5.22 
/.shard/592a3396-3303-497d-b021-09526a866985.49 
/e1f49cf7-3857-4e82-92a3-1a50b365ed94/images/911edeb3-50e0-4d0a-9d97-453e83683e77/ddfd0fe2-1743-40c6-8304-5da5a806405f 
/.shard/592a3396-3303-497d-b021-09526a866985.51 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.16 
/.shard/d9787c75-c5de-4e25-9108-1887e341f8e5.23 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.17 
/.shard/d9787c75-c5de-4e25-9108-1887e341f8e5.24 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.18 
/.shard/d9787c75-c5de-4e25-9108-1887e341f8e5.25 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.19 
/.shard/d9787c75-c5de-4e25-9108-1887e341f8e5.26 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.20 
/.shard/d9787c75-c5de-4e25-9108-1887e341f8e5.27 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.21 
/.shard/d9787c75-c5de-4e25-9108-1887e341f8e5.28 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.22 
/.shard/d9787c75-c5de-4e25-9108-1887e341f8e5.29 
/.shard/592a3396-3303-497d-b021-09526a866985.6 
/.shard/592a3396-3303-497d-b021-09526a866985.32 
/.shard/592a3396-3303-497d-b021-09526a866985.8 
/.shard/592a3396-3303-497d-b021-09526a866985.34 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.6 
/.shard/592a3396-3303-497d-b021-09526a866985.1 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.23 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.24 
/.shard/d9787c75-c5de-4e25-9108-1887e341f8e5.32 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.28 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.32 
/e1f49cf7-3857-4e82-92a3-1a50b365ed94/images/336567a6-e16a-4c66-96c4-d71371875cee/d5a24c5c-dcdf-4ad0-ad71-9bb04d1d348f 
/.shard/d9787c75-c5de-4e25-9108-1887e341f8e5.33 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.36 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.88 
/.shard/592a3396-3303-497d-b021-09526a866985.10 
/.shard/d9787c75-c5de-4e25-9108-1887e341f8e5.34 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.52 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.56 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.64 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.68 
/.shard/d9787c75-c5de-4e25-9108-1887e341f8e5.35 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.84 
/.shard/592a3396-3303-497d-b021-09526a866985.38 
/.shard/1b479e81-c0c9-4910-91da-8aac9482f940.37 
/.shard/1b479e81-c0c9-4910-91da-8aac9482f940.38 
/.shard/1b479e81-c0c9-4910-91da-8aac9482f940.39 
/.shard/1b479e81-c0c9-4910-91da-8aac9482f940.42 
/.shard/1b479e81-c0c9-4910-91da-8aac9482f940.43 
/.shard/1b479e81-c0c9-4910-91da-8aac9482f940.45 
/.shard/1b479e81-c0c9-4910-91da-8aac9482f940.49 
/.shard/e0f4ed43-a243-400f-8b0d-70609ac989a6.25 
/e1f49cf7-3857-4e82-92a3-1a50b365ed94/images/658f93a1-a6fb-41b5-a291-007c2509c15f 
/e1f49cf7-3857-4e82-92a3-1a50b365ed94/images 
/e1f49cf7-3857-4e82-92a3-1a50b365ed94/images/216015c4-34f7-48f5-86ab-308443667b35 
/.shard/592a3396-3303-497d-b021-09526a866985.14 
/.shard/1b479e81-c0c9-4910-91da-8aac9482f940.46 
/.shard/d9787c75-c5de-4e25-9108-1887e341f8e5.50 
/e1f49cf7-3857-4e82-92a3-1a50b365ed94/images/216015c4-34f7-48f5-86ab-308443667b35/3d3f6d3d-c85c-4f2f-987f-eb677b15f5a6 
/e1f49cf7-3857-4e82-92a3-1a50b365ed94/images/216015c4-34f7-48f5-86ab-308443667b35/3d3f6d3d-c85c-4f2f-987f-eb677b15f5a6.meta 
/e1f49cf7-3857-4e82-92a3-1a50b365ed94/images/216015c4-34f7-48f5-86ab-308443667b35/3d3f6d3d-c85c-4f2f-987f-eb677b15f5a6.lease 
/e1f49cf7-3857-4e82-92a3-1a50b365ed94/images/290f1664-99b6-406f-b468-7de8aea5335e 
/e1f49cf7-3857-4e82-92a3-1a50b365ed94/images/290f1664-99b6-406f-b468-7de8aea5335e/6477f964-e064-4ea0-8d63-807f7c8a63c8 
/e1f49cf7-3857-4e82-92a3-1a50b365ed94/images/290f1664-99b6-406f-b468-7de8aea5335e/6477f964-e064-4ea0-8d63-807f7c8a63c8.meta 
/.shard/d9787c75-c5de-4e25-9108-1887e341f8e5.21 
/.shard - Is in split-brain
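
For long listings like the one above, the flagged entries can be pulled out with a small script. This is only a sketch: it assumes the text format shown in this report (a `Brick <host>:<path>` header followed by one entry per line, with ` - Is in split-brain` appended to flagged entries), and the function name `find_split_brain` is made up for illustration.

```python
def find_split_brain(heal_info_text):
    """Return {brick: [entries flagged as split-brain]} from heal info text.

    Assumes the plain-text layout shown in this report; other glusterfs
    versions may print the listing differently.
    """
    suffix = " - Is in split-brain"
    result = {}
    brick = None
    for raw in heal_info_text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            # Start of a new per-brick section.
            brick = line[len("Brick "):]
            result.setdefault(brick, [])
        elif brick is not None and line.endswith(suffix):
            # Keep only the path, dropping the split-brain marker.
            result[brick].append(line[:-len(suffix)])
    return result


# Shortened copy of the output from this report.
sample = """Brick dhcp43-201.lab.eng.blr.redhat.com:/rhgs/vmstore/vms
/__DIRECT_IO_TEST__
/.shard/d9787c75-c5de-4e25-9108-1887e341f8e5.21
/.shard - Is in split-brain
"""

for brick_name, entries in find_split_brain(sample).items():
    print(brick_name, entries)
```

On the sample, only `/.shard` is reported; the individual shard files are listed as needing heal but are not flagged.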


Version-Release number of selected component (if applicable):
------------------------------------------------------------
3.7.9-2

How reproducible:
-----------------
100%

Steps to Reproduce:
As in description.


Actual results:


Expected results:


Additional info:
----------------
Sos reports will be attached.

Comment 2 Krutika Dhananjay 2016-05-05 03:36:15 UTC
Bhaskar, Rajesh Reddy, Kasturi and I were there when this bug was seen.
Couple of observations:
1) There was no split-brain as far as the AFR changelogs are concerned.
2) The dirty xattr value on /.shard was constantly increasing on the source bricks, as IO kept progressing.
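
To watch the dirty counter while IO is running, the xattr can be dumped on a brick with `getfattr -d -m . -e hex <brick-path>/.shard` and decoded. The sketch below assumes the usual AFR layout for `trusted.afr.dirty`: 12 bytes holding three big-endian 32-bit counters for data, metadata and entry operations, in that order. The hex value used here is invented for illustration and was not captured from this setup.

```python
import struct


def decode_afr_xattr(hex_value):
    """Decode an AFR xattr value as printed by `getfattr -e hex`.

    Assumption: the value is 12 bytes, i.e. three big-endian unsigned
    32-bit counters (data, metadata, entry).
    """
    if hex_value.startswith("0x"):
        hex_value = hex_value[2:]
    raw = bytes.fromhex(hex_value)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


# Hypothetical trusted.afr.dirty value: entry counter at 5.
example = "0x" "00000000" "00000000" "00000005"
print(decode_afr_xattr(example))
```

Sampling the value a few times during IO and comparing the entry counter is one way to confirm the "constantly increasing" behaviour noted in observation 2.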

Comment 3 Pranith Kumar K 2016-05-05 04:32:24 UTC
Krutika,
    Thanks for this information. If there is a dirty marker, the directory will be treated as needing a conservative merge. Since no sources will be set in that case, heal info will assume it is in split-brain.

Bhaskar,
   Do you have sos-reports for this issue? We need to find out why the dirty xattr keeps increasing.

Pranith
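
The reasoning in comment 3 can be written down as a toy model. This is not the AFR code: the function names, the brick labels, and the status strings are all made up for illustration, and the model only covers a directory that is already listed as needing heal. It shows why a dirty marker alone is enough for heal info to print split-brain even when the changelogs point at healthy bricks.

```python
def select_sources(candidate_sources, dirty):
    """Toy model: which bricks are trusted as heal sources for a directory.

    Assumption (from comment 3): a dirty marker forces a conservative
    merge, and a conservative merge sets no sources at all.
    """
    if dirty:
        return []
    return list(candidate_sources)


def heal_info_status(candidate_sources, dirty):
    """Toy model: heal info treats 'no sources' as split-brain."""
    sources = select_sources(candidate_sources, dirty)
    return "needs heal" if sources else "Is in split-brain"


# Two bricks stayed up and would normally be sources for the third.
up_bricks = ["brick-0", "brick-1"]  # hypothetical labels

print(heal_info_status(up_bricks, dirty=False))
print(heal_info_status(up_bricks, dirty=True))
```

With `dirty=True` the model reports split-brain although both surviving bricks agree, which matches observation 1 in comment 2 that the changelogs themselves showed no split-brain.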

Comment 5 Anuradha 2016-05-13 10:23:56 UTC
Patch link : https://code.engineering.redhat.com/gerrit/#/c/74280/

Comment 9 Bhaskarakiran 2016-05-23 10:41:49 UTC
Marking this as fixed in glusterfs-3.7.9-5. Verified: heal info no longer shows split-brain for the .shard directory.

Comment 11 errata-xmlrpc 2016-06-23 05:21:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1240