Bug 1460661

Summary: "split-brain observed [Input/output error]" error messages in samba logs during parallel rm -rf
Product: [Community] GlusterFS Reporter: Ravishankar N <ravishankar>
Component: replicateAssignee: Ravishankar N <ravishankar>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: medium Docs Contact:
Priority: unspecified    
Version: 3.8CC: bugs
Target Milestone: ---Keywords: Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.8.13 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1456582 Environment:
Last Closed: 2017-06-29 09:55:42 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1449988, 1454689, 1456582    
Bug Blocks: 1457616, 1457732    

Description Ravishankar N 2017-06-12 11:04:03 UTC
+++ This bug was initially created as a clone of Bug #1456582 +++

+++ This bug was initially created as a clone of Bug #1454689 +++

Description of problem:
=======================
"split-brain observed [Input/output error]" error messages in samba logs during parallel rm -rf.

Version-Release number of selected component (if applicable):
3.8.4-25.el7rhgs.x86_64

How reproducible:
1/1

Steps to Reproduce:
===================
1) Create a distrbuted-replicate volume and start it.
2) Set nl-cache, parallel readdirp, md-cache settings on the volume. (Please see gluster volume distrep info below output for more info)
3) cifs mount on multiple clients.
4) Create a very large data set which contains small files, empty directories, directory with files.
5) Simultaneously issue rm -rf * from multiple clients.

Check samba logs and rm -rf * terminal.

Actual results:
===============
When issued rm -rf * command simultaneously from multiple clients, seeing "split-brain observed [Input/output error]" error messages in samba logs and rm -rf * throws permission denied errors as below,

rm: cannot remove ‘file_dstdir/10.70.47.52/thrd_02/d_005/d_005’: Permission denied
rm: cannot remove ‘file_dstdir/10.70.47.52/thrd_02/d_005/d_000’: Permission denied
rm: cannot remove ‘file_dstdir/10.70.47.52/thrd_00/d_002/d_005’: Permission denied
rm: cannot remove ‘file_dstdir/10.70.47.52/thrd_00/d_002/d_000’: Permission denied
rm: cannot remove ‘file_dstdir/10.70.47.52/thrd_04/d_001/d_000’: Permission denied
rm: cannot remove ‘file_srcdir/10.70.47.52/thrd_01/d_002/d_000’: Permission denied
rm: cannot remove ‘file_srcdir/10.70.47.15/thrd_05/d_003/d_000’: Permission denied
rm: cannot remove ‘new/file_dstdir/10.70.47.15/thrd_04/d_004/d_008’: Permission denied

Expected results:
=================
No split-brain errors in samba logs and rm -rf * should not throw any errors.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2017-05-23 06:42:36 EDT ---

Samba logs split-brain output snippet:
======================================
[2017-05-23 10:20:06.930523] W [MSGID: 108008] [afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-2: Unreadable subvolume -1 found with event generation 2 for gfid 5c268304-8143-46c0-b4e4-561555fc003f. (Possible split-brain)
[2017-05-23 10:20:06.930704] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-2: Failing FSTAT on gfid 5c268304-8143-46c0-b4e4-561555fc003f: split-brain observed. [Input/output error]
[2017-05-23 10:20:06.931078] W [MSGID: 108008] [afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-3: Unreadable subvolume -1 found with event generation 2 for gfid 5c268304-8143-46c0-b4e4-561555fc003f. (Possible split-brain)
[2017-05-23 10:20:06.931114] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-3: Failing FSTAT on gfid 5c268304-8143-46c0-b4e4-561555fc003f: split-brain observed. [Input/output error]
[2017-05-23 10:20:06.932575] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-7: Failing FSTAT on gfid 5c268304-8143-46c0-b4e4-561555fc003f: split-brain observed. [Input/output error]
[2017-05-23 10:20:06.932753] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-4: Failing FSTAT on gfid 5c268304-8143-46c0-b4e4-561555fc003f: split-brain observed. [Input/output error]
[2017-05-23 10:20:06.933165] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-6: Failing FSTAT on gfid 5c268304-8143-46c0-b4e4-561555fc003f: split-brain observed. [Input/output error]
[2017-05-23 10:20:06.933356] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-5: Failing FSTAT on gfid 5c268304-8143-46c0-b4e4-561555fc003f: split-brain observed. [Input/output error]
[2017-05-23 10:20:06.934668] W [MSGID: 108008] [afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-5: Unreadable subvolume -1 found with event generation 2 for gfid 5c268304-8143-46c0-b4e4-561555fc003f. (Possible split-brain)
[2017-05-23 10:20:06.936800] W [MSGID: 108008] [afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-6: Unreadable subvolume -1 found with event generation 2 for gfid 5c268304-8143-46c0-b4e4-561555fc003f. (Possible split-brain)
[2017-05-23 10:20:06.938764] W [MSGID: 108008] [afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-7: Unreadable subvolume -1 found with event generation 2 for gfid 5c268304-8143-46c0-b4e4-561555fc003f. (Possible split-brain)
[2017-05-23 10:20:36.839637] W [MSGID: 108008] [afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-2: Unreadable subvolume -1 found with event generation 2 for gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace. (Possible split-brain)
[2017-05-23 10:20:36.839703] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-2: Failing FSTAT on gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace: split-brain observed. [Input/output error]
[2017-05-23 10:20:36.840962] W [MSGID: 108008] [afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-0: Unreadable subvolume -1 found with event generation 2 for gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace. (Possible split-brain)
[2017-05-23 10:20:36.841024] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-0: Failing FSTAT on gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace: split-brain observed. [Input/output error]
[2017-05-23 10:20:36.841372] W [MSGID: 108008] [afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-1: Unreadable subvolume -1 found with event generation 2 for gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace. (Possible split-brain)
[2017-05-23 10:20:36.841425] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-1: Failing FSTAT on gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace: split-brain observed. [Input/output error]
[2017-05-23 10:20:36.843287] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-5: Failing FSTAT on gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace: split-brain observed. [Input/output error]
[2017-05-23 10:20:36.843445] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-3: Failing FSTAT on gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace: split-brain observed. [Input/output error]
[2017-05-23 10:20:36.843561] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-7: Failing FSTAT on gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace: split-brain observed. [Input/output error]
[2017-05-23 10:20:36.844088] W [MSGID: 108008] [afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-5: Unreadable subvolume -1 found with event generation 2 for gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace. (Possible split-brain)
[2017-05-23 10:20:36.844355] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-6: Failing FSTAT on gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace: split-brain observed. [Input/output error]
[2017-05-23 10:20:36.853341] W [MSGID: 108008] [afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-6: Unreadable subvolume -1 found with event generation 2 for gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace. (Possible split-brain)
[2017-05-23 10:20:36.864483] W [MSGID: 108008] [afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-7: Unreadable subvolume -1 found with event generation 2 for gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace. (Possible split-brain)

[root@dhcp47-127 samba]# gluster volume heal distrep info split-brain
Brick 10.70.47.127:/bricks/brick3/b3
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.46.181:/bricks/brick3/b3
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.46.47:/bricks/brick3/b3
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.47.140:/bricks/brick3/b3
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.47.127:/bricks/brick4/b4
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.46.181:/bricks/brick4/b4
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.46.47:/bricks/brick4/b4
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.47.140:/bricks/brick4/b4
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.47.127:/bricks/brick5/b5
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.46.181:/bricks/brick5/b5
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.46.47:/bricks/brick5/b5
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.47.140:/bricks/brick5/b5
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.47.127:/bricks/brick6/b6
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.46.181:/bricks/brick6/b6
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.46.47:/bricks/brick6/b6
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.47.140:/bricks/brick6/b6
Status: Connected
Number of entries in split-brain: 0

[root@dhcp47-127 samba]# gluster v info distrep
 
Volume Name: distrep
Type: Distributed-Replicate
Volume ID: 251fa596-4fdd-4e44-9b8c-4382b7139751
Status: Started
Snapshot Count: 0
Number of Bricks: 8 x 2 = 16
Transport-type: tcp
Bricks:
Brick1: 10.70.47.127:/bricks/brick3/b3
Brick2: 10.70.46.181:/bricks/brick3/b3
Brick3: 10.70.46.47:/bricks/brick3/b3
Brick4: 10.70.47.140:/bricks/brick3/b3
Brick5: 10.70.47.127:/bricks/brick4/b4
Brick6: 10.70.46.181:/bricks/brick4/b4
Brick7: 10.70.46.47:/bricks/brick4/b4
Brick8: 10.70.47.140:/bricks/brick4/b4
Brick9: 10.70.47.127:/bricks/brick5/b5
Brick10: 10.70.46.181:/bricks/brick5/b5
Brick11: 10.70.46.47:/bricks/brick5/b5
Brick12: 10.70.47.140:/bricks/brick5/b5
Brick13: 10.70.47.127:/bricks/brick6/b6
Brick14: 10.70.46.181:/bricks/brick6/b6
Brick15: 10.70.46.47:/bricks/brick6/b6
Brick16: 10.70.47.140:/bricks/brick6/b6
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
performance.parallel-readdir: on
performance.rda-cache-limit: 10MB
features.uss: on
features.quota: on
features.inode-quota: on
features.quota-deem-statfs: on
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.stat-prefetch: on
performance.cache-invalidation: on
performance.md-cache-timeout: 600
network.inode-lru-limit: 50000
performance.nl-cache: on

[root@dhcp47-127 samba]# gluster v status distrep
Status of volume: distrep
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.47.127:/bricks/brick3/b3        49165     0          Y       9881 
Brick 10.70.46.181:/bricks/brick3/b3        49164     0          Y       4970 
Brick 10.70.46.47:/bricks/brick3/b3         49164     0          Y       10968
Brick 10.70.47.140:/bricks/brick3/b3        49163     0          Y       29273
Brick 10.70.47.127:/bricks/brick4/b4        49166     0          Y       9901 
Brick 10.70.46.181:/bricks/brick4/b4        49165     0          Y       5003 
Brick 10.70.46.47:/bricks/brick4/b4         49165     0          Y       10990
Brick 10.70.47.140:/bricks/brick4/b4        49164     0          Y       29294
Brick 10.70.47.127:/bricks/brick5/b5        49168     0          Y       4735 
Brick 10.70.46.181:/bricks/brick5/b5        49167     0          Y       26844
Brick 10.70.46.47:/bricks/brick5/b5         49167     0          Y       832  
Brick 10.70.47.140:/bricks/brick5/b5        49166     0          Y       19171
Brick 10.70.47.127:/bricks/brick6/b6        49169     0          Y       11755
Brick 10.70.46.181:/bricks/brick6/b6        49168     0          Y       16967
Brick 10.70.46.47:/bricks/brick6/b6         49168     0          Y       23623
Brick 10.70.47.140:/bricks/brick6/b6        49167     0          Y       9394 
Snapshot Daemon on localhost                49167     0          Y       12337
Self-heal Daemon on localhost               N/A       N/A        Y       7828 
Quota Daemon on localhost                   N/A       N/A        Y       7847 
Snapshot Daemon on dhcp46-47.lab.eng.blr.re
dhat.com                                    49166     0          Y       12714
Self-heal Daemon on dhcp46-47.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       5839 
Quota Daemon on dhcp46-47.lab.eng.blr.redha
t.com                                       N/A       N/A        Y       5861 
Snapshot Daemon on dhcp46-181.lab.eng.blr.r
edhat.com                                   49166     0          Y       6657 
Self-heal Daemon on dhcp46-181.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       31907
Quota Daemon on dhcp46-181.lab.eng.blr.redh
at.com                                      N/A       N/A        Y       31931
Snapshot Daemon on dhcp47-140.lab.eng.blr.r
edhat.com                                   49165     0          Y       31098
Self-heal Daemon on dhcp47-140.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       23989
Quota Daemon on dhcp47-140.lab.eng.blr.redh
at.com                                      N/A       N/A        Y       24002
 
Task Status of Volume distrep
------------------------------------------------------------------------------
There are no active volume tasks

--- Additional comment from Ravishankar N on 2017-05-23 08:48:21 EDT ---

From a quick look at both samba logs, I see that fstat_cbk has failed with EACCESS, example:
19059:[2017-05-23 09:13:13.441530] W [MSGID: 114031] [client-rpc-fops.c:1495:client3_3_fstat_cbk] 4-distrep-client-8: remote operation failed [Operation not permitted]
19061:[2017-05-23 09:13:13.441635] W [MSGID: 114031] [client-rpc-fops.c:1495:client3_3_fstat_cbk] 4-distrep-client-12: remote operation failed [Operation not permitted]
19066:[2017-05-23 09:13:13.453797] W [MSGID: 114031] [client-rpc-fops.c:1495:client3_3_fstat_cbk] 4-distrep-client-4: remote operation failed [Operation not permitted]


The corresponding brick logs also have the error.
1290:[2017-05-23 09:12:33.023868] W [MSGID: 113006] [posix.c:6280:posix_do_readdir] 0-distrep-posix: pfd is NULL, fd=0x7f1ea0201ad0 [Operation not permitted]
1308:[2017-05-23 09:13:06.032149] E [MSGID: 115081] [server-rpc-fops.c:1203:server_fstat_cbk] 0-distrep-server: 614392: FSTAT -2 (dd285c20-ab48-4d17-88f0-50db96c9affc) ==> (Operation not permitted) [Operation not permitted]
1309:[2017-05-23 09:13:06.038226] W [MSGID: 113006] [posix.c:6280:posix_do_readdir] 0-distrep-posix: pfd is NULL, fd=0x7f1ea02d3b00 [Operation not permitted]
1316:[2017-05-23 09:13:13.435376] E [MSGID: 115081] [server-rpc-fops.c:1203:server_fstat_cbk] 0-distrep-server: 576531: FSTAT -2 (232e37a8-30ea-4efe-962f-bbb413fc31de) ==> (Operation not permitted) [Op

1. These seem to be the cause of the errors seen by the application. We need to find why that is.

2. Though possibly unrelated, we also need to see why AFR throws EIO after inode refresh without any files in split-brain.

--- Additional comment from Worker Ant on 2017-05-29 12:46:21 EDT ---

REVIEW: https://review.gluster.org/17413 (afr: add errno to afr_inode_refresh_done()) posted (#1) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Worker Ant on 2017-05-29 12:47:05 EDT ---

REVIEW: https://review.gluster.org/17414 (posix: use the correct op_errno) posted (#1) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Worker Ant on 2017-05-30 21:50:39 EDT ---

COMMIT: https://review.gluster.org/17413 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit feaea7fa541b81a4988b8f394037bfedb5017f4c
Author: Ravishankar N <ravishankar>
Date:   Mon May 29 21:56:12 2017 +0530

    afr: add errno to afr_inode_refresh_done()
    
    Problem:
    When parellel `rm -rf`s were being done from cifs clients, opendir might
    fail on some replicas with ENOENT. DHT ignores partial opendir failures
    in dht_fd_cbk() and winds readdirs on those replicas. Afr inode refresh
    (as a part of readdirp read_txn) sees in its fd context that the state
    of the fds is *not* AFR_FD_OPENED and bails out to
    afr_inode_refresh_done() without doing a refresh. When this happens, the
    errno is set as EIO due to lack of readable subvols, logging split-brain
    messages in the logs.
    
    Fix:
    Introduce an errno argument to afr_inode_refresh_do() to bail out with
    the right error value when inode refresh is not performed.
    
    Change-Id: I075707fbb73fd93a923b77b923a96aac79e847f9
    BUG: 1456582
    Signed-off-by: Ravishankar N <ravishankar>
    Reviewed-on: https://review.gluster.org/17413
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>

--- Additional comment from Worker Ant on 2017-05-31 02:54:44 EDT ---

REVIEW: https://review.gluster.org/17414 (posix: use the correct op_errno) posted (#2) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Worker Ant on 2017-05-31 10:34:50 EDT ---

COMMIT: https://review.gluster.org/17414 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit de92c363c95d16966dbcc9d8763fd4448dd84d13
Author: Ravishankar N <ravishankar>
Date:   Mon May 29 21:38:14 2017 +0530

    posix: use the correct op_errno
    
    Problem:
    If readdir/fstat was performed on a directory that was removed,
    posix_fd_ctx_get() fails with ENOENT but we incorrectly use the ret
    value (-1 in this case) as op_errno, logging "Operation not permitted"
    messages in the brick logs. Also in case of fstat, the -1 op_errno was
    also propagated to the client via stack unwind, causing the message to
    appear in protocol/client logs as well.
    
    Fix:
    Use the right op_errno in readdir, fstat and writev. Also, if
    posix_fd_ctx_get() failed with ENOENT, convert it into EBADF because
    ENOENT is not a valid error for an fd operation.
    
    Change-Id: Ie43c0789d5040ec73b7cf885d015a183b8c64d70
    BUG: 1456582
    Signed-off-by: Ravishankar N <ravishankar>
    Reviewed-on: https://review.gluster.org/17414
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Tested-by: Pranith Kumar Karampuri <pkarampu>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Amar Tumballi <amarts>
    CentOS-regression: Gluster Build System <jenkins.org>

--- Additional comment from Ravishankar N on 2017-06-01 04:59:53 EDT ---

One more patch being sent.

--- Additional comment from Worker Ant on 2017-06-01 05:00:24 EDT ---

REVIEW: https://review.gluster.org/17436 (afr: update errno check in afr_inode_refresh_do) posted (#1) for review on master by Ravishankar N (ravishankar)

--- Additional comment from Worker Ant on 2017-06-04 22:34:00 EDT ---

COMMIT: https://review.gluster.org/17436 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit 9aa81c15a429dc2817f0149a33b6a9e88ead8110
Author: Ravishankar N <ravishankar>
Date:   Thu Jun 1 14:20:03 2017 +0530

    afr: update errno check in afr_inode_refresh_do
    
    Addresses review comment in https://review.gluster.org/#/c/17413
    
    Change-Id: Ic247729e5e92a5bb0148543764e0b30790444004
    BUG: 1456582
    Signed-off-by: Ravishankar N <ravishankar>
    Reviewed-on: https://review.gluster.org/17436
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>

Comment 1 Worker Ant 2017-06-12 11:05:15 UTC
REVIEW: https://review.gluster.org/17517 (posix: use the correct op_errno) posted (#1) for review on release-3.8 by Ravishankar N (ravishankar)

Comment 2 Worker Ant 2017-06-12 11:06:16 UTC
REVIEW: https://review.gluster.org/17518 (afr: add errno to afr_inode_refresh_done()) posted (#1) for review on release-3.8 by Ravishankar N (ravishankar)

Comment 3 Worker Ant 2017-06-19 04:54:49 UTC
COMMIT: https://review.gluster.org/17518 committed in release-3.8 by jiffin tony Thottan (jthottan) 
------
commit 5d13784dc6aa406f69061f2608a19ef0c8a80581
Author: Ravishankar N <ravishankar>
Date:   Mon Jun 5 09:40:51 2017 +0530

    afr: add errno to afr_inode_refresh_done()
    
    Backport of https://review.gluster.org/17413 and
    https://review.gluster.org/17436
    
    Problem:
    When parellel `rm -rf`s were being done from cifs clients, opendir might
    fail on some replicas with ENOENT. DHT ignores partial opendir failures
    in dht_fd_cbk() and winds readdirs on those replicas. Afr inode refresh
    (as a part of readdirp read_txn) sees in its fd context that the state
    of the fds is *not* AFR_FD_OPENED and bails out to
    afr_inode_refresh_done() without doing a refresh. When this happens, the
    errno is set as EIO due to lack of readable subvols, logging split-brain
    messages in the logs.
    
    Fix:
    Introduce an errno argument to afr_inode_refresh_do() to bail out with
    the right error value when inode refresh is not performed.
    
    Change-Id: I8eed4d6e6c85332c1f5813c74cb54ae73693a369
    BUG: 1460661
    Signed-off-by: Ravishankar N <ravishankar>
    Reviewed-on: https://review.gluster.org/17518
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>

Comment 4 Worker Ant 2017-06-19 10:35:14 UTC
REVIEW: https://review.gluster.org/17517 (posix: use the correct op_errno) posted (#2) for review on release-3.8 by Ravishankar N (ravishankar)

Comment 5 Worker Ant 2017-06-20 04:47:56 UTC
COMMIT: https://review.gluster.org/17517 committed in release-3.8 by jiffin tony Thottan (jthottan) 
------
commit 0903b76c88ed47d819372763fdccbe1486bf4943
Author: Ravishankar N <ravishankar>
Date:   Mon May 29 21:38:14 2017 +0530

    posix: use the correct op_errno
    
    Problem:
    If readdir/fstat was performed on a directory that was removed,
    posix_fd_ctx_get() fails with ENOENT but we incorrectly use the ret
    value (-1 in this case) as op_errno, logging "Operation not permitted"
    messages in the brick logs. Also in case of fstat, the -1 op_errno was
    also propagated to the client via stack unwind, causing the message to
    appear in protocol/client logs as well.
    
    Fix:
    Use the right op_errno in readdir, fstat and writev.
    A̶l̶s̶o̶,̶ ̶i̶f̶ p̶o̶s̶i̶x̶_̶f̶d̶_̶c̶t̶x̶_̶g̶e̶t̶(̶)̶ ̶f̶a̶i̶l̶e̶d̶ ̶w̶i̶t̶h̶ ̶E̶N̶O̶E̶N̶T̶,̶ ̶c̶o̶n̶v̶e̶r̶t̶ ̶i̶t̶ ̶i̶n̶t̶o̶ ̶E̶B̶A̶D̶F̶ ̶b̶e̶c̶a̶u̶s̶e̶ ̶E̶N̶O̶E̶N̶T̶
    i̶s̶ ̶n̶o̶t̶ ̶a̶ ̶v̶a̶l̶i̶d̶ ̶e̶r̶r̶o̶r̶ ̶f̶o̶r̶ ̶a̶n̶ ̶f̶d̶ ̶o̶p̶e̶r̶a̶t̶i̶o̶n̶.̶ Don't do this as it breaks
    DHT.See 17565
    
    > Reviewed-on: https://review.gluster.org/17414
    > Smoke: Gluster Build System <jenkins.org>
    > Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    > Tested-by: Pranith Kumar Karampuri <pkarampu>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > Reviewed-by: Amar Tumballi <amarts>
    > CentOS-regression: Gluster Build System <jenkins.org>
    (cherry picked from commit de92c363c95d16966dbcc9d8763fd4448dd84d13)
    
    Change-Id: Ie43c0789d5040ec73b7cf885d015a183b8c64d70
    BUG: 1460661
    Signed-off-by: Ravishankar N <ravishankar>
    Reviewed-on: https://review.gluster.org/17517
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>

Comment 6 Niels de Vos 2017-06-29 09:55:42 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.13, please open a new bug report.

glusterfs-3.8.13 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2017-June/000075.html
[2] https://www.gluster.org/pipermail/gluster-users/