Bug 1411617 - Spurious split-brain error messages are seen in rebalance logs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: replicate
Version: rhgs-3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: RHGS 3.2.0
Assignee: Krutika Dhananjay
QA Contact: Prasad Desala
URL:
Whiteboard:
Depends On:
Blocks: 1351528 1411625 1412914 1412915
Reported: 2017-01-10 06:50 UTC by Prasad Desala
Modified: 2017-03-23 06:02 UTC (History)

Fixed In Version: glusterfs-3.8.4-12
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1411625
Environment:
Last Closed: 2017-03-23 06:02:21 UTC




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2017:0486 normal SHIPPED_LIVE Moderate: Red Hat Gluster Storage 3.2.0 security, bug fix, and enhancement update 2017-03-23 09:18:45 UTC

Description Prasad Desala 2017-01-10 06:50:48 UTC
Description of problem:
=======================
On an nfs-ganesha setup, while rm -rf and a remove-brick operation are in progress, spurious "split-brain observed" error messages appear in the rebalance logs.

Rebalance logs error snippet:
=============================
[2017-01-09 06:50:36.232738] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-distrep-replicate-6: Failing GETXATTR on gfid 5ab6a290-3127-4662-86e7-c52d32949c67: split-brain observed. [Input/output error]
[2017-01-09 06:50:36.244473] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-distrep-replicate-6: Failing STAT on gfid 5ab6a290-3127-4662-86e7-c52d32949c67: split-brain observed. [Input/output error]
[2017-01-09 06:50:38.930970] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-distrep-replicate-8: Failing GETXATTR on gfid 000feb2a-2a8f-40f1-ae9e-926f0d0ae323: split-brain observed. [Input/output error]
[2017-01-09 06:50:38.944043] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-distrep-replicate-8: Failing STAT on gfid 000feb2a-2a8f-40f1-ae9e-926f0d0ae323: split-brain observed. [Input/output error]
[2017-01-09 06:50:43.595767] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-distrep-replicate-8: Failing GETXATTR on gfid a6f9d15e-969b-4630-867d-d7a402f242b2: split-brain observed. [Input/output error]
[2017-01-09 06:50:43.611669] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-distrep-replicate-8: Failing STAT on gfid a6f9d15e-969b-4630-867d-d7a402f242b2: split-brain observed. [Input/output error]
[2017-01-09 06:50:46.798033] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-distrep-replicate-6: Failing GETXATTR on gfid b0a4fef7-bd4c-472f-9027-eb6aef268e29: split-brain observed. [Input/output error]
[2017-01-09 06:50:46.810447] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-distrep-replicate-6: Failing STAT on gfid b0a4fef7-bd4c-472f-9027-eb6aef268e29: split-brain observed. [Input/output error]
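
To triage a rebalance log for these messages, a small script can group them by replicate subvolume and gfid. The following is only a sketch: the regular expression is modeled on the lines quoted above, and the names SPLIT_BRAIN_RE and summarize_split_brain are hypothetical helpers, not part of glusterfs. Other AFR messages may use a different layout.

```python
import re
from collections import Counter

# Assumption: pattern derived only from the log lines quoted in this report.
SPLIT_BRAIN_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] E \[MSGID: 108008\] "
    r"\[[^\]]+\] (?P<subvol>\S+): "
    r"Failing (?P<fop>\w+) on gfid (?P<gfid>[0-9a-f-]{36}): "
    r"split-brain observed"
)


def summarize_split_brain(lines):
    """Count 'split-brain observed' errors per (subvolume, gfid) pair."""
    counts = Counter()
    for line in lines:
        match = SPLIT_BRAIN_RE.match(line)
        if match:
            counts[(match["subvol"], match["gfid"])] += 1
    return counts


# Two lines copied from the snippet above, plus one unrelated line.
sample = [
    "[2017-01-09 06:50:36.232738] E [MSGID: 108008] "
    "[afr-read-txn.c:80:afr_read_txn_refresh_done] 0-distrep-replicate-6: "
    "Failing GETXATTR on gfid 5ab6a290-3127-4662-86e7-c52d32949c67: "
    "split-brain observed. [Input/output error]",
    "[2017-01-09 06:50:36.244473] E [MSGID: 108008] "
    "[afr-read-txn.c:80:afr_read_txn_refresh_done] 0-distrep-replicate-6: "
    "Failing STAT on gfid 5ab6a290-3127-4662-86e7-c52d32949c67: "
    "split-brain observed. [Input/output error]",
    "a line that does not match the pattern",
]

summary = summarize_split_brain(sample)
for (subvol, gfid), count in summary.items():
    print(subvol, gfid, count)
```

An empty result for a log captured during rm -rf plus remove-brick would match the expected behavior described in this report; it does not by itself prove that no split-brain exists.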


Version-Release number of selected component (if applicable):
3.8.4-10.el7rhgs.x86_64

Steps to Reproduce:
===================
1) Create a ganesha cluster and create a distributed-replicate volume.
2) Enable nfs-ganesha on the volume with mdcache settings.
3) Mount the volume.
4) Create files and folders.
5) From the mount point, issue rm -rf * and, while it is running, start removing bricks.

We can see split-brain error messages in rebalance logs.

Actual results:
===============
During rebalance, spurious split-brain error messages are seen in rebalance logs.

Expected results:
=================
There should be no split-brain error messages, since no split-brain has actually occurred.

Comment 4 Atin Mukherjee 2017-01-11 04:08:57 UTC
Upstream mainline patch http://review.gluster.org/16362 posted for review.

Comment 5 Krutika Dhananjay 2017-01-13 05:35:40 UTC
https://code.engineering.redhat.com/gerrit/#/c/94936/ <-- downstream patch

Comment 7 Prasad Desala 2017-02-01 06:51:44 UTC
Verified this BZ on glusterfs version 3.8.4-13.el7rhgs.x86_64.

Steps:
1) Created a ganesha cluster and created a distributed-replicate volume.
2) Enabled nfs-ganesha on the volume with mdcache settings.
3) Mounted the volume on multiple clients.
4) Created files and folders.
5) From mount point, issued rm -rf * and started removing bricks.

I didn't see any split-brain messages in rebalance logs during rm -rf * + remove-brick. Hence, moving this BZ to Verified.

Comment 9 errata-xmlrpc 2017-03-23 06:02:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html

