Bug 1409572

Summary: In fuse mount logs: seeing input/output error with split-brain observed logs and failing GETXATTR and STAT
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Nag Pavan Chilakam <nchilaka>
Component: replicateAssignee: Mohammed Rafi KC <rkavunga>
Status: CLOSED WORKSFORME QA Contact: Nag Pavan Chilakam <nchilaka>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: rhgs-3.2CC: amukherj, asrivast, nchilaka, ravishankar, rcyriac, rhinduja, rhs-bugs, storage-qa-internal
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-04-28 07:01:46 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Nag Pavan Chilakam 2017-01-02 13:52:23 UTC
Description of problem:
======================

On my systemic setup, I am creating the same directory path simultaneously from 3 different clients.
Each client used a different server IP to mount the volume over the fuse protocol.
Also, each client was dumping sosreports into the volume mount every 5 minutes in a screen session, along with top output being appended to a file every minute.
The dir creations were happening as different users; a rough sketch of the per-client workload is given after the example below.
Eg:
client1 (el 7.2) was running the dir creation as pavan@rhs-client23
client2 (el 6.7) as root@rhs-client24
client3 (el 7.3) as cli21@rhs-client21
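
Rough sketch of the per-client workload (the mount point, directory names and loop intervals below are illustrative; the exact scripts are not attached to this bug):

# mount the volume over fuse (each client used a different server IP)
mount -t glusterfs 10.70.35.20:/sysvol /mnt/sysvol

# screen session 1: create the same directory path in a loop, logging to a
# per-client file on the mount
while true; do
    mkdir -p /mnt/sysvol/data/same-dir-creation/testdir \
        >> /mnt/sysvol/data/same-dir-creation/dir.$(hostname).log 2>&1
done

# screen session 2: dump a sosreport into the mount every 5 minutes
while true; do
    sosreport --batch --tmp-dir /mnt/sysvol/sosreports
    sleep 300
done

# screen session 3: append top output to a file on the mount every minute
while true; do
    top -b -n 1 >> /mnt/sysvol/top.$(hostname).out
    sleep 60
done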


Note: these logs are with respect to client1, i.e. rhs-client23.
However, note that I am still able to access the mount.

I am seeing the below logs:
[2017-01-02 12:49:04.209279] E [MSGID: 108008] [afr-transaction.c:2602:afr_write_txn_refresh_done] 0-sysvol-replicate-3: Failing WRITE on gfid 2fa5d60d-9536-4d50-a1e9-5e512b9bee08: split-brain observed.
[2017-01-02 12:49:04.209330] W [fuse-bridge.c:1291:fuse_err_cbk] 0-glusterfs-fuse: 97459985: FLUSH() ERR => -1 (Input/output error)
[2017-01-02 12:49:04.211225] W [MSGID: 108008] [afr-read-txn.c:239:afr_read_txn] 0-sysvol-replicate-3: Unreadable subvolume -1 found with event generation 376 for gfid 2fa5d60d-9536-4d50-a1e9-5e512b9bee08. (Possible split-brain)
[2017-01-02 12:49:04.213489] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-sysvol-replicate-3: Failing STAT on gfid 2fa5d60d-9536-4d50-a1e9-5e512b9bee08: split-brain observed. [Input/output error]
[2017-01-02 12:49:04.214165] W [MSGID: 108008] [afr-read-txn.c:239:afr_read_txn] 0-sysvol-replicate-3: Unreadable subvolume -1 found with event generation 376 for gfid 2fa5d60d-9536-4d50-a1e9-5e512b9bee08. (Possible split-brain)
[2017-01-02 12:49:04.221112] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-sysvol-replicate-3: Failing GETXATTR on gfid 2fa5d60d-9536-4d50-a1e9-5e512b9bee08: split-brain observed. [Input/output error]
[2017-01-02 12:49:04.810823] W [MSGID: 108027] [afr-common.c:2589:afr_discover_done] 0-sysvol-replicate-3: no read subvols for /data/same-dir-creation/dir.rhs-client23.lab.eng.blr.redhat.com.log
[2017-01-02 12:49:04.810934] W [MSGID: 108008] [afr-read-txn.c:239:afr_read_txn] 0-sysvol-replicate-3: Unreadable subvolume -1 found with event generation 376 for gfid 2fa5d60d-9536-4d50-a1e9-5e512b9bee08. (Possible split-brain)
[2017-01-02 12:49:04.812838] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-sysvol-replicate-3: Failing STAT on gfid 2fa5d60d-9536-4d50-a1e9-5e512b9bee08: split-brain observed. [Input/output error]
[2017-01-02 12:49:04.814947] W [MSGID: 108008] [afr-read-txn.c:239:afr_read_txn] 0-sysvol-replicate-3: Unreadable subvolume -1 found with event generation 376 for gfid 2fa5d60d-9536-4d50-a1e9-5e512b9bee08. (Possible split-brain)
[2017-01-02 12:49:04.826529] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-sysvol-replicate-3: Failing STAT on gfid 2fa5d60d-9536-4d50-a1e9-5e512b9bee08: split-brain observed. [Input/output error]
[2017-01-02 12:49:04.827121] W [MSGID: 108008] [afr-read-txn.c:239:afr_read_txn] 0-sysvol-replicate-3: Unreadable subvolume -1 found with event generation 376 for gfid 2fa5d60d-9536-4d50-a1e9-5e512b9bee08. (Possible split-brain)
[2017-01-02 12:49:04.828029] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-sysvol-replicate-3: Failing GETXATTR on gfid 2fa5d60d-9536-4d50-a1e9-5e512b9bee08: split-brain observed. [Input/output error]
[2017-01-02 12:49:04.830158] W [MSGID: 108027] [afr-common.c:2589:afr_discover_done] 0-sysvol-replicate-3: no read subvols for /data/same-dir-creation/dir.rhs-client23.lab.eng.blr.redhat.com.log
[2017-01-02 12:49:04.830299] W [MSGID: 108008] [afr-read-txn.c:239:afr_read_txn] 0-sysvol-replicate-3: Unreadable subvolume -1 found with event generation 376 for gfid 2fa5d60d-9536-4d50-a1e9-5e512b9bee08. (Possible split-brain)
[2017-01-02 12:49:04.831159] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-sysvol-replicate-3: Failing STAT on gfid 2fa5d60d-9536-4d50-a1e9-5e512b9bee08: split-brain observed. [Input/output error]
[2017-01-02 12:49:04.833027] W [MSGID: 108008] [afr-read-txn.c:239:afr_read_txn] 0-sysvol-replicate-3: Unreadable subvolume -1 found with event generation 376 for gfid 2fa5d60d-9536-4d50-a1e9-5e512b9bee08. (Possible split-brain)
[2017-01-02 12:49:04.833890] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-sysvol-replicate-3: Failing STAT on gfid 2fa5d60d-9536-4d50-a1e9-5e512b9bee08: split-brain observed. [Input/output error]
[2017-01-02 12:49:04.834378] W [MSGID: 108008] [afr-read-txn.c:239:afr_read_txn] 0-sysvol-replicate-3: Unreadable subvolume -1 found with event generation 376 for gfid 2fa5d60d-9536-4d50-a1e9-5e512b9bee08. (Possible split-brain)
[2017-01-02 12:49:04.876508] E [MSGID: 108008] [afr-read-txn.c:80:afr_read_txn_refresh_done] 0-sysvol-replicate-3: Failing GETXATTR on gfid 2fa5d60d-9536-4d50-a1e9-5e512b9bee08: split-brain observed. [Input/output error]
[2017-01-02 12:49:04.886720] W [MSGID: 108027] [afr-common.c:2589:afr_discover_done] 0-sysvol-replicate-3: no read subvols for /data/same-dir-creation/dir.rhs-client23.lab.eng.blr.redhat.com.log
[2017-01-02 12:49:04.886865] W [MSGID: 108008] [afr-read-txn.c:239:afr_read_txn] 0-sysvol-replicate-3: Unreadable subvolume -1 found with event generation 376 for gfid 2fa5d60d-9536-4d50-a1e9-5e512b9bee08. (Possible split-brain)
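
For reference, a typical way to cross-check the split-brain state reported above is to list split-brain entries from any server and inspect the AFR changelog xattrs of the affected gfid directly on the bricks of the replicate-3 pair (from the brick ordering this should be the brick2 bricks on 10.70.35.156 and 10.70.37.154; the gfid path below assumes the standard .glusterfs backend layout):

# list entries gluster currently considers to be in split-brain
gluster volume heal sysvol info split-brain

# on each brick of the affected replica pair, dump the xattrs of the gfid
# seen in the client log
getfattr -d -m . -e hex /rhs/brick2/sysvol/.glusterfs/2f/a5/2fa5d60d-9536-4d50-a1e9-5e512b9bee08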

Sosreports are available at
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/nchilaka/3.2_logs/systemic_testing_logs/regression_cycle/same_dir_create_clients/rhs-client23.lab.eng.blr.redhat.com/

Test execution details are available at https://docs.google.com/spreadsheets/d/1iP5Mi1TewBFVh8HTmlcBm9072Bgsbgkr3CLcGmawDys/edit#gid=632186609

Version-Release number of selected component (if applicable):
============
3.8.4-10





Note that 2 bricks are down, for which separate BZs (related to the same setup) have been raised as below:
1409472 - brick crashed on systemic setup
1397907 - seeing frequent kernel hangs when doing operations both on fuse client and gluster nodes on replica volumes [NEEDINFO]
1409568 - seeing socket disconnects and transport endpoint not connected frequently on systemic setup 

But I am seeing disconnects even for bricks that are not down:

[root@dhcp35-20 ~]# gluster v status
Status of volume: sysvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.20:/rhs/brick1/sysvol        49154     0          Y       2404 
Brick 10.70.37.86:/rhs/brick1/sysvol        49154     0          Y       26789
Brick 10.70.35.156:/rhs/brick1/sysvol       N/A       N/A        N       N/A  
Brick 10.70.37.154:/rhs/brick1/sysvol       49154     0          Y       2192 
Brick 10.70.35.20:/rhs/brick2/sysvol        49155     0          Y       2424 
Brick 10.70.37.86:/rhs/brick2/sysvol        N/A       N/A        N       N/A  
Brick 10.70.35.156:/rhs/brick2/sysvol       49155     0          Y       26793
Brick 10.70.37.154:/rhs/brick2/sysvol       49155     0          Y       2212 
Snapshot Daemon on localhost                49156     0          Y       2449 
Self-heal Daemon on localhost               N/A       N/A        Y       3131 
Quota Daemon on localhost                   N/A       N/A        Y       3139 
Snapshot Daemon on 10.70.37.86              49156     0          Y       26832
Self-heal Daemon on 10.70.37.86             N/A       N/A        Y       2187 
Quota Daemon on 10.70.37.86                 N/A       N/A        Y       2195 
Snapshot Daemon on 10.70.35.156             49156     0          Y       26816
Self-heal Daemon on 10.70.35.156            N/A       N/A        Y       1376 
Quota Daemon on 10.70.35.156                N/A       N/A        Y       1384 
Snapshot Daemon on 10.70.37.154             49156     0          Y       2235 
Self-heal Daemon on 10.70.37.154            N/A       N/A        Y       9321 
Quota Daemon on 10.70.37.154                N/A       N/A        Y       9329 
 
Task Status of Volume sysvol
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@dhcp35-20 ~]# gluster v info
 
Volume Name: sysvol
Type: Distributed-Replicate
Volume ID: 4efd4f77-85c7-4eb9-b958-6769d31d84c8
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: 10.70.35.20:/rhs/brick1/sysvol
Brick2: 10.70.37.86:/rhs/brick1/sysvol
Brick3: 10.70.35.156:/rhs/brick1/sysvol
Brick4: 10.70.37.154:/rhs/brick1/sysvol
Brick5: 10.70.35.20:/rhs/brick2/sysvol
Brick6: 10.70.37.86:/rhs/brick2/sysvol
Brick7: 10.70.35.156:/rhs/brick2/sysvol
Brick8: 10.70.37.154:/rhs/brick2/sysvol
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
features.uss: enable
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
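
For reference, the non-default options above would typically have been applied with gluster volume set, e.g. (a few representative examples, not the full list):

gluster volume set sysvol performance.md-cache-timeout 600
gluster volume set sysvol features.cache-invalidation on
gluster volume set sysvol features.uss enable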

Comment 2 Nag Pavan Chilakam 2017-01-04 07:13:15 UTC
Client sosreports have been copied to rhsqe-repo using: scp -r /var/tmp/$HOSTNAME qe@rhsqe-repo:/var/www/html/sosreports/nchilaka/3.2_logs/systemic_testing_logs/regression_cycle/same_dir_create_clients/

Comment 4 Nag Pavan Chilakam 2017-01-20 10:24:14 UTC
Hitting this even on 3.8.4-12.

Comment 5 Mohammed Rafi KC 2017-01-30 07:33:59 UTC
I went through the attached logs, and I could see a number of split-brain related messages. But there are a lot of logs related to brick unavailability as well. For further analysis, is it possible to get the brick logs? Also, the time when the tests started, or the time up to which everything was fine?

Comment 7 Mohammed Rafi KC 2017-04-03 07:21:42 UTC
Putting the needinfo back for the same question as in comment 5.

Comment 8 Nag Pavan Chilakam 2017-04-03 07:26:41 UTC
See these server logs if they help; they were collected for another BZ around the same time:
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/nchilaka/bug.1409472/sosreports/

Comment 10 Mohammed Rafi KC 2017-04-21 07:06:57 UTC
I tried to reproduce this bug locally, though I couldn't recreate the full systemic setup and the same workload. I tried with normal I/O on a 2x2 volume with 2 clients.
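
For reference, a minimal sketch of that kind of 2x2 + 2-client attempt (the hostnames, brick paths and mkdir loop below are illustrative, not the exact commands used):

# create and start a 2x2 distributed-replicate volume
gluster volume create testvol replica 2 \
    server1:/bricks/b1 server2:/bricks/b1 \
    server1:/bricks/b2 server2:/bricks/b2 force
gluster volume start testvol

# fuse-mount from two different clients, each against a different server
mount -t glusterfs server1:/testvol /mnt/testvol    # on client1
mount -t glusterfs server2:/testvol /mnt/testvol    # on client2

# run normal I/O from both clients, e.g. creating the same directory path in a loop
for i in $(seq 1 10000); do
    mkdir -p /mnt/testvol/same-dir-creation/testdir
done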

I was not successful in reproducing this bug. Since we don't have a proper reproducer, I'm not planning to invest much more time in this.

So I think the best option for now is to close this bug; if you see the problem again on the latest systemic or non-systemic setup, we can reopen it.

Let me know your thoughts.

Comment 11 Nag Pavan Chilakam 2017-04-28 06:59:38 UTC
(In reply to Mohammed Rafi KC from comment #10)
> I tried to reproduce this bug locally, though I couldn't recreate the full
> systemic setup and the same workload. I tried with normal I/O on a 2x2
> volume with 2 clients.
> 
> I was not successful in reproducing this bug. Since we don't have a proper
> reproducer, I'm not planning to invest much more time in this.
> 
> So I think the best option for now is to close this bug; if you see the
> problem again on the latest systemic or non-systemic setup, we can reopen it.
> 
> Let me know your thoughts.

Agreed; I know you tried and spent enough time on a reproducer. I will reopen this bug if I hit it again. You can go ahead and close it.