Description of problem:
=======================
On an nfs-ganesha setup, while rm -rf and a remove-brick operation are in progress, spurious "split-brain observed" error messages appear in the rebalance logs.

Rebalance logs error snippet:
=============================
[2017-01-09 06:50:36.232738] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-distrep-replicate-6: Failing GETXATTR on gfid 5ab6a290-3127-4662-86e7-c52d32949c67: split-brain observed. [Input/output error]
[2017-01-09 06:50:36.244473] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-distrep-replicate-6: Failing STAT on gfid 5ab6a290-3127-4662-86e7-c52d32949c67: split-brain observed. [Input/output error]
[2017-01-09 06:50:38.930970] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-distrep-replicate-8: Failing GETXATTR on gfid 000feb2a-2a8f-40f1-ae9e-926f0d0ae323: split-brain observed. [Input/output error]
[2017-01-09 06:50:38.944043] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-distrep-replicate-8: Failing STAT on gfid 000feb2a-2a8f-40f1-ae9e-926f0d0ae323: split-brain observed. [Input/output error]
[2017-01-09 06:50:43.595767] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-distrep-replicate-8: Failing GETXATTR on gfid a6f9d15e-969b-4630-867d-d7a402f242b2: split-brain observed. [Input/output error]
[2017-01-09 06:50:43.611669] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-distrep-replicate-8: Failing STAT on gfid a6f9d15e-969b-4630-867d-d7a402f242b2: split-brain observed. [Input/output error]
[2017-01-09 06:50:46.798033] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-distrep-replicate-6: Failing GETXATTR on gfid b0a4fef7-bd4c-472f-9027-eb6aef268e29: split-brain observed. [Input/output error]
[2017-01-09 06:50:46.810447] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-distrep-replicate-6: Failing STAT on gfid b0a4fef7-bd4c-472f-9027-eb6aef268e29: split-brain observed. [Input/output error]

Version-Release number of selected component (if applicable):
3.8.4-10.el7rhgs.x86_64

Steps to Reproduce:
===================
1) Create a ganesha cluster and create a distributed-replicate volume.
2) Enable nfs-ganesha on the volume with mdcache settings.
3) Mount the volume.
4) Create files and folders.
5) From the mount point, issue rm -rf * and start removing bricks. Split-brain error messages appear in the rebalance logs.

Actual results:
===============
During rebalance, spurious split-brain error messages are seen in the rebalance logs.

Expected results:
=================
There should not be any split-brain error messages, since no split-brain has actually occurred.
Upstream mainline patch http://review.gluster.org/16362 has been posted for review.
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/94936/
Verified this BZ on glusterfs version 3.8.4-13.el7rhgs.x86_64.

Steps:
1) Created a ganesha cluster and created a distributed-replicate volume.
2) Enabled nfs-ganesha on the volume with mdcache settings.
3) Mounted the volume on multiple clients.
4) Created files and folders.
5) From mount point, issued rm -rf * and started removing bricks.

I didn't see any split-brain messages in rebalance logs during rm -rf * + remove-brick. Hence, moving this BZ to Verified.
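
For anyone repeating this check, scanning the rebalance log by eye can be replaced with a small script. The following is only a sketch: the function name find_split_brain is hypothetical, and the pattern is modelled on the MSGID 108008 lines quoted in this report, so other glusterfs versions may format the message differently.

```python
import re
from collections import defaultdict

# Assumption: the line layout matches the rebalance log snippet in this report.
SPLIT_BRAIN_RE = re.compile(
    r"E \[MSGID: 108008\].*?\s(?P<subvol>\S+): "
    r"Failing (?P<fop>\w+) on gfid (?P<gfid>[0-9a-f-]{36}): "
    r"split-brain observed"
)


def find_split_brain(lines):
    """Group split-brain errors by (subvolume, gfid).

    Returns a dict mapping (subvol, gfid) to the list of failed
    operations (e.g. GETXATTR, STAT) in the order they were logged.
    """
    hits = defaultdict(list)
    for line in lines:
        match = SPLIT_BRAIN_RE.search(line)
        if match:
            key = (match.group("subvol"), match.group("gfid"))
            hits[key].append(match.group("fop"))
    return dict(hits)
```

Passing an open rebalance log file to find_split_brain (a file object iterates line by line) returns an empty dict when no such messages were logged, which is the expected outcome after the fix.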
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2017-0486.html