Bug 1222409

Summary:	nfs-ganesha: HA failover happens but I/O does not move ahead when volume has two mounts and I/O going on both mounts
Product:	[Community] GlusterFS	Reporter:	Saurabh <saujain>
Component:	ganesha-nfs	Assignee:	Soumya Koduri <skoduri>
Status:	CLOSED EOL	QA Contact:
Severity:	high	Docs Contact:
Priority:	unspecified
Version:	3.7.0	CC:	kkeithle, mzywusko, ndevos, pasik, sankarshan, skoduri, smohan
Target Milestone:	---
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2017-03-08 11:01:40 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Saurabh 2015-05-18 07:26:02 UTC

Description of problem:
The problem is that the I/O does not move ahead even though as "pcs status" command the nfs-ganesha process has failed over to some other node.

The two mounts for the same volume(say vol2) were done using the vers=3 on a client. The mounts were done using the virtual IP. Now, iozone was started on both mount points and on one of the server the nfs-ganehsa process was stopped. So, after a certain grace period on cluster the I/O should have started moving but that has not happened in this case.

[root@nfs1 ~]# gluster volume status
Status of volume: gluster_shared_storage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.148:/rhs/brick1/d1r1-share   49156     0          Y       3549 
Brick 10.70.37.77:/rhs/brick1/d1r2-share    49155     0          Y       3329 
Brick 10.70.37.76:/rhs/brick1/d2r1-share    49155     0          Y       3081 
Brick 10.70.37.69:/rhs/brick1/d2r2-share    49155     0          Y       3346 
Brick 10.70.37.148:/rhs/brick1/d3r1-share   49157     0          Y       3566 
Brick 10.70.37.77:/rhs/brick1/d3r2-share    49156     0          Y       3346 
Brick 10.70.37.76:/rhs/brick1/d4r1-share    49156     0          Y       3098 
Brick 10.70.37.69:/rhs/brick1/d4r2-share    49156     0          Y       3363 
Brick 10.70.37.148:/rhs/brick1/d5r1-share   49158     0          Y       3583 
Brick 10.70.37.77:/rhs/brick1/d5r2-share    49157     0          Y       3363 
Brick 10.70.37.76:/rhs/brick1/d6r1-share    49157     0          Y       3115 
Brick 10.70.37.69:/rhs/brick1/d6r2-share    49157     0          Y       3380 
Self-heal Daemon on localhost               N/A       N/A        Y       28128
Self-heal Daemon on 10.70.37.69             N/A       N/A        Y       30533
Self-heal Daemon on 10.70.37.77             N/A       N/A        Y       16037
Self-heal Daemon on 10.70.37.76             N/A       N/A        Y       6128 
 
Task Status of Volume gluster_shared_storage
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: vol2
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.148:/rhs/brick1/d1r1         49153     0          Y       28060
Brick 10.70.37.77:/rhs/brick1/d1r2          49152     0          Y       15975
Brick 10.70.37.76:/rhs/brick1/d2r1          49152     0          Y       6068 
Brick 10.70.37.69:/rhs/brick1/d2r2          49152     0          Y       30472
Brick 10.70.37.148:/rhs/brick1/d3r1         49154     0          Y       28077
Brick 10.70.37.77:/rhs/brick1/d3r2          49153     0          Y       15992
Brick 10.70.37.76:/rhs/brick1/d4r1          49153     0          Y       6085 
Brick 10.70.37.69:/rhs/brick1/d4r2          49153     0          Y       30489
Brick 10.70.37.148:/rhs/brick1/d5r1         49155     0          Y       28094
Brick 10.70.37.77:/rhs/brick1/d5r2          49154     0          Y       16009
Brick 10.70.37.76:/rhs/brick1/d6r1          49154     0          Y       6102 
Brick 10.70.37.69:/rhs/brick1/d6r2          49154     0          Y       30506
Self-heal Daemon on localhost               N/A       N/A        Y       28128
Self-heal Daemon on 10.70.37.69             N/A       N/A        Y       30533
Self-heal Daemon on 10.70.37.77             N/A       N/A        Y       16037
Self-heal Daemon on 10.70.37.76             N/A       N/A        Y       6128 
 
Task Status of Volume vol2
------------------------------------------------------------------------------
There are no active volume tasks


status of nfs-ganesha on all four nodes,
nfs1
====
root      3790     1  0 May13 ?        00:00:09 /usr/sbin/glusterfs --volfile-server=nfs1 --volfile-id=/gluster_shared_storage /var/run/gluster/shared_storage
---
nfs2
====
root      3300     1  0 May13 ?        00:00:09 /usr/sbin/glusterfs --volfile-server=nfs1 --volfile-id=/gluster_shared_storage /var/run/gluster/shared_storage
root     11003     1  0 May15 ?        00:01:08 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT -p /var/run/ganesha.nfsd.pid
---
nfs3
====
root      3577     1  0 May13 ?        00:00:08 /usr/sbin/glusterfs --volfile-server=nfs1 --volfile-id=/gluster_shared_storage /var/run/gluster/shared_storage
root      4195     1  0 May15 ?        00:01:08 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT -p /var/run/ganesha.nfsd.pid
---
nfs4
====
root     14760     1  0 May15 ?        00:00:04 /usr/sbin/glusterfs --volfile-server=nfs1 --volfile-id=/gluster_shared_storage /var/run/gluster/shared_storage
root     23970     1  0 May15 ?        00:02:17 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT -p /var/run/ganesha.nfsd.pid


pcs status; this clearly shows that the failover of nfs-ganesha process running on nfs1 happened on nfs4

[root@nfs1 ~]# pcs status
Cluster name: ganesha-ha-360
Last updated: Mon May 18 12:54:12 2015
Last change: Fri May 15 19:25:20 2015
Stack: cman
Current DC: nfs1 - partition with quorum
Version: 1.1.11-97629de
4 Nodes configured
17 Resources configured


Online: [ nfs1 nfs2 nfs3 nfs4 ]

Full list of resources:

 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ nfs1 nfs2 nfs3 nfs4 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ nfs1 nfs2 nfs3 nfs4 ]
 nfs1-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs4 
 nfs1-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs4 
 nfs2-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs2 
 nfs2-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs2 
 nfs3-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs3 
 nfs3-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs3 
 nfs4-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs4 
 nfs4-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs4 
 nfs1-dead_ip-1	(ocf::heartbeat:Dummy):	Started nfs1 


Version-Release number of selected component (if applicable):
glusterfs-3.7.0beta2-0.0.el6.x86_64
nfs-ganesha-2.2.0-0.el6.x86_64

How reproducible:
Happens in first attempt itself

Steps to Reproduce:
1. create a volume of type 6x2, start it
2. start nfs-ganesha on all nodes in consideration, after doing the pre-requisites
3. mount the volume on a client on two mount-points, using different virtual IPs.
4. on one server, execute servce nfs-ganesha stop
5. wait for grace_period to finish and let I/O resume

Actual results:
step 4 result,
I/O does not resume as expected,


Expected results:
In this case I/O should resume

Additional info:

Comment 1 Meghana 2015-05-18 09:09:17 UTC

Hi Saurabh,
This is clearly a multi-head case, and the required patch is not merged in NFS-Ganesha upstream yet.
https://review.gerrithub.io/#/c/228614/

Please try this case again as soon as the patch goes in. Nonetheless, please attach the sos reports to capture the current state for future reference.

Comment 2 Saurabh 2015-05-19 04:16:43 UTC

Well, I tried to give a re-run to this test with a slight modification, earlier I had two different directories and I was executing iozone in both of them separately from different mounts as "iozone -a". Now, the modification is that I am giving different file names for iozone as option. 

The result with this modification is that iozone has on both mount-point has finished but nfs-ganesha process has got killed on two other nodes by itself, which is also a cause of worry.

The present pcs status,

[root@nfs1 ~]# pcs status
Cluster name: ganesha-ha-360
Last updated: Tue May 19 09:39:54 2015
Last change: Mon May 18 20:29:02 2015
Stack: cman
Current DC: nfs1 - partition with quorum
Version: 1.1.11-97629de
4 Nodes configured
19 Resources configured


Online: [ nfs1 nfs2 nfs3 nfs4 ]

Full list of resources:

 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ nfs1 nfs2 nfs3 nfs4 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ nfs1 nfs2 nfs3 nfs4 ]
 nfs1-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs2 
 nfs1-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs2 
 nfs2-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs2 
 nfs2-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs2 
 nfs3-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs2 
 nfs3-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs2 
 nfs4-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs2 
 nfs4-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs2 
 nfs1-dead_ip-1	(ocf::heartbeat:Dummy):	Started nfs1 
 nfs4-dead_ip-1	(ocf::heartbeat:Dummy):	Started nfs4 
 nfs3-dead_ip-1	(ocf::heartbeat:Dummy):	Started nfs3 


nfs-ganesha process status on four nodes,
nfs1

stopped using "kill -s TERM <pid>"
---
nfs2
root     20620     1  0 May18 ?        00:00:50 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT -p /var/run/ganesha.nfsd.pid
---
nfs3
Got killed itself
---
nfs4
Got killed itself
---

Comment 3 Saurabh 2015-05-19 04:46:53 UTC

As per the /var/log/messages/

on nfs4,
May 18 20:23:55 nfs4 lrmd[1640]:   notice: operation_finished: nfs-mon_monitor_10000:28779:stderr [ Error: Resource does not exist. ]
May 18 20:24:06 nfs4 lrmd[1640]:   notice: operation_finished: nfs-mon_monitor_10000:28927:stderr [ Error: Resource does not exist. ]
May 18 20:24:06 nfs4 kernel: ganesha.nfsd[754]: segfault at f8 ip 00007f6606ef31f6 sp 00007f65e19745c0 error 4 in libgfapi.so.0.0.0[7f6606ee9000+1e000]
May 18 20:24:10 nfs4 abrtd: Directory 'ccpp-2015-05-18-20:24:06-737' creation detected
May 18 20:24:10 nfs4 abrt[28944]: Saved core dump of pid 737 (/usr/bin/ganesha.nfsd) to /var/spool/abrt/ccpp-2015-05-18-20:24:06-737 (1279594496 bytes)
May 18 20:24:10 nfs4 abrtd: Package 'nfs-ganesha' isn't signed with proper key
May 18 20:24:10 nfs4 abrtd: 'post-create' on '/var/spool/abrt/ccpp-2015-05-18-20:24:06-737' exited with 1
May 18 20:24:10 nfs4 abrtd: Deleting problem directory '/var/spool/abrt/ccpp-2015-05-18-20:24:06-737'
May 18 20:24:16 nfs4 cibadmin[29085]:   notice: crm_log_args: Invoked: /usr/sbin/cibadmin --replace -o configuration -V --xml-pipe 
May 18 20:24:16 nfs4 crmd[1643]:   notice: process_lrm_event: Operation nfs4-dead_ip-1_monitor_0: not running (node=nfs4, call=194, rc=7, cib-update=128, confirmed=true)
May 18 20:24:17 nfs4 attrd[1641]:   notice: attrd_trigger_update: Sending flush op to all hosts for: ganesha-active (<null>)
May 18 20:24:17 nfs4 attrd[1641]:   notice: attrd_perform_update: Sent delete 62: node=nfs4, attr=ganesha-active, id=<n/a>, set=(null), section=status
May 18 20:24:17 nfs4 IPaddr(nfs1-cluster_ip-1)[29092]: INFO: IP status = ok, IP_CIP=
May 18 20:24:17 nfs4 crmd[1643]:   notice: process_lrm_event: Operation nfs4-dead_ip-1_start_0: ok (node=nfs4, call=199, rc=0, cib-update=129, confirmed=true)
May 18 20:24:17 nfs4 crmd[1643]:   notice: process_lrm_event: Operation nfs1-cluster_ip-1_stop_0: ok (node=nfs4, call=196, rc=0, cib-update=130, confirmed=true)
May 18 20:24:17 nfs4 crmd[1643]:   notice: process_lrm_event: Operation nfs4-dead_ip-1_monitor_10000: ok (node=nfs4, call=200, rc=0, cib-update=131, confirmed=false)
May 18 20:24:17 nfs4 IPaddr(nfs4-cluster_ip-1)[29093]: INFO: IP status = ok, IP_CIP=
May 18 20:24:17 nfs4 crmd[1643]:   notice: process_lrm_event: Operation nfs4-cluster_ip-1_stop_0: ok (node=nfs4, call=198, rc=0, cib-update=132, confirmed=true)
May 18 20:24:17 nfs4 crmd[1643]:   notice: process_lrm_event: Operation nfs-grace_stop_0: ok (node=nfs4, call=202, rc=0, cib-update=133, confirmed=true)
May 18 20:24:17 nfs4 crmd[1643]:   notice: process_lrm_event: Operation nfs1-trigger_ip-1_stop_0: ok (node=nfs4, call=204, rc=0, cib-update=134, confirmed=true)
May 18 20:24:17 nfs4 crmd[1643]:   notice: process_lrm_event: Operation nfs4-trigger_ip-1_stop_0: ok (node=nfs4, call=206, rc=0, cib-update=135, confirmed=true)
May 18 20:24:17 nfs4 crmd[1643]:   notice: process_lrm_event: Operation nfs-grace_start_0: ok (node=nfs4, call=207, rc=0, cib-update=136, confirmed=true)
May 18 20:24:17 nfs4 crmd[1643]:   notice: process_lrm_event: Operation nfs-grace_monitor_5000: ok (node=nfs4, call=208, rc=0, cib-update=137, confirmed=false)
May 18 20:24:19 nfs4 ntpd[1707]: Deleting interface #41 eth0, 10.70.36.217#123, interface stats: received=0, sent=0, dropped=0, active_time=2551 secs
May 18 20:24:19 nfs4 ntpd[1707]: Deleting interface #40 eth0, 10.70.36.220#123, interface stats: received=0, sent=0, dropped=0, active_time=2731 secs
May 18 20:24:19 nfs4 ntpd[1707]: peers refreshed




and on nfs3,
May 18 20:28:57 nfs3 kernel: ganesha.nfsd[3490]: segfault at 5c0000c0 ip 0000003b278093a0 sp 00007f3a08e88598 error 4 in libpthread-2.12.so[3b27800000+17000]
May 18 20:29:00 nfs3 abrtd: Directory 'ccpp-2015-05-18-20:28:57-3477' creation detected
May 18 20:29:00 nfs3 abrt[27506]: Saved core dump of pid 3477 (/usr/bin/ganesha.nfsd) to /var/spool/abrt/ccpp-2015-05-18-20:28:57-3477 (837488640 bytes)
May 18 20:29:00 nfs3 abrtd: Package 'nfs-ganesha' isn't signed with proper key
May 18 20:29:00 nfs3 abrtd: 'post-create' on '/var/spool/abrt/ccpp-2015-05-18-20:28:57-3477' exited with 1
May 18 20:29:00 nfs3 abrtd: Deleting problem directory '/var/spool/abrt/ccpp-2015-05-18-20:28:57-3477'
May 18 20:29:02 nfs3 cibadmin[27553]:   notice: crm_log_args: Invoked: /usr/sbin/cibadmin --replace -o configuration -V --xml-pipe 
May 18 20:29:02 nfs3 crmd[5607]:   notice: process_lrm_event: Operation nfs3-dead_ip-1_monitor_0: not running (node=nfs3, call=186, rc=7, cib-update=121, confirmed=true)
May 18 20:29:03 nfs3 attrd[5605]:   notice: attrd_trigger_update: Sending flush op to all hosts for: ganesha-active (<null>)
May 18 20:29:03 nfs3 attrd[5605]:   notice: attrd_perform_update: Sent delete 77: node=nfs3, attr=ganesha-active, id=<n/a>, set=(null), section=status
May 18 20:29:03 nfs3 crmd[5607]:   notice: process_lrm_event: Operation nfs3-dead_ip-1_start_0: ok (node=nfs3, call=193, rc=0, cib-update=122, confirmed=true)
May 18 20:29:03 nfs3 crmd[5607]:   notice: process_lrm_event: Operation nfs3-dead_ip-1_monitor_10000: ok (node=nfs3, call=194, rc=0, cib-update=123, confirmed=false)
May 18 20:29:03 nfs3 IPaddr(nfs3-cluster_ip-1)[27561]: INFO: IP status = ok, IP_CIP=
May 18 20:29:03 nfs3 crmd[5607]:   notice: process_lrm_event: Operation nfs3-cluster_ip-1_stop_0: ok (node=nfs3, call=190, rc=0, cib-update=124, confirmed=true)
May 18 20:29:03 nfs3 IPaddr(nfs1-cluster_ip-1)[27560]: INFO: IP status = ok, IP_CIP=
May 18 20:29:03 nfs3 crmd[5607]:   notice: process_lrm_event: Operation nfs1-cluster_ip-1_stop_0: ok (node=nfs3, call=188, rc=0, cib-update=125, confirmed=true)
May 18 20:29:03 nfs3 IPaddr(nfs4-cluster_ip-1)[27562]: INFO: IP status = ok, IP_CIP=
May 18 20:29:03 nfs3 crmd[5607]:   notice: process_lrm_event: Operation nfs4-cluster_ip-1_stop_0: ok (node=nfs3, call=192, rc=0, cib-update=126, confirmed=true)
May 18 20:29:03 nfs3 crmd[5607]:   notice: process_lrm_event: Operation nfs-grace_stop_0: ok (node=nfs3, call=196, rc=0, cib-update=127, confirmed=true)
May 18 20:29:03 nfs3 crmd[5607]:   notice: process_lrm_event: Operation nfs4-trigger_ip-1_stop_0: ok (node=nfs3, call=202, rc=0, cib-update=128, confirmed=true)
May 18 20:29:03 nfs3 crmd[5607]:   notice: process_lrm_event: Operation nfs3-trigger_ip-1_stop_0: ok (node=nfs3, call=200, rc=0, cib-update=129, confirmed=true)
May 18 20:29:04 nfs3 crmd[5607]:   notice: process_lrm_event: Operation nfs1-trigger_ip-1_stop_0: ok (node=nfs3, call=198, rc=0, cib-update=130, confirmed=true)
May 18 20:29:04 nfs3 crmd[5607]:   notice: process_lrm_event: Operation nfs-grace_start_0: ok (node=nfs3, call=203, rc=0, cib-update=131, confirmed=true)
May 18 20:29:04 nfs3 crmd[5607]:   notice: process_lrm_event: Operation nfs-grace_monitor_5000: ok (node=nfs3, call=204, rc=0, cib-update=132, confirmed=false)
May 18 20:29:04 nfs3 ntpd[1749]: Deleting interface #34 eth0, 10.70.36.217#123, interface stats: received=0, sent=0, dropped=0, active_time=285 secs
May 18 20:29:04 nfs3 ntpd[1749]: Deleting interface #33 eth0, 10.70.36.220#123, interface stats: received=0, sent=0, dropped=0, active_time=285 secs
May 18 20:29:04 nfs3 ntpd[1749]: Deleting interface #31 eth0, 10.70.36.219#123, interface stats: received=0, sent=0, dropped=0, active_time=3027 secs
May 18 20:29:04 nfs3 ntpd[1749]: peers refreshed

Comment 4 Meghana 2015-05-19 04:52:32 UTC

Saurabh, please paste the last few lines of gfapi.log that we saw.

Comment 5 Saurabh 2015-05-19 05:48:10 UTC

as per gfapi.log from nfs3,

[2015-05-18 14:08:22.308125] I [client-handshake.c:187:client_set_lk_version_cbk] 0-vol2-client-7: Server lk version = 1
[2015-05-18 14:08:22.319205] I [afr-common.c:1673:afr_local_discovery_cbk] 0-vol2-replicate-3: selecting local read_child vol2-client-7
[2015-05-18 14:08:22.319249] I [afr-common.c:1673:afr_local_discovery_cbk] 0-vol2-replicate-1: selecting local read_child vol2-client-3
[2015-05-18 14:08:22.319280] I [afr-common.c:1673:afr_local_discovery_cbk] 0-vol2-replicate-5: selecting local read_child vol2-client-11
[2015-05-18 14:08:22.319378] I [MSGID: 104041] [glfs-resolve.c:843:__glfs_active_subvol] 0-vol2: switched to graph 6e667333-2d33-3437-372d-323031352d30 (0)
[2015-05-18 14:54:55.598244] W [client-callback.c:65:client_cbk_cache_invalidation] 0-vol2-client-11: XDR decode of cache_invalidation failed.
[2015-05-18 14:54:59.302948] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-11: remote operation failed: Stale file handle
[2015-05-18 14:54:59.303540] W [MSGID: 108008] [afr-read-txn.c:237:afr_read_txn] 0-vol2-replicate-5: Unreadable subvolume -1 found with event generation 2. (Possible split-brain)
[2015-05-18 14:55:19.427492] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-11: remote operation failed: Stale file handle
[2015-05-18 14:56:00.410174] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-11: remote operation failed: Stale file handle
[2015-05-18 14:56:17.207031] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-11: remote operation failed: Stale file handle
The message "W [MSGID: 108008] [afr-read-txn.c:237:afr_read_txn] 0-vol2-replicate-5: Unreadable subvolume -1 found with event generation 2. (Possible split-brain)" repeated 3 times between [2015-05-18 14:54:59.303540] and [2015-05-18 14:56:17.219316]
[2015-05-18 14:56:54.106281] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-11: remote operation failed: Stale file handle
[2015-05-18 14:56:54.126517] W [MSGID: 108008] [afr-read-txn.c:237:afr_read_txn] 0-vol2-replicate-5: Unreadable subvolume -1 found with event generation 2. (Possible split-brain)
[2015-05-18 14:56:54.283049] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-11: remote operation failed: Stale file handle
[2015-05-18 14:57:54.191362] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-11: remote operation failed: Stale file handle
[2015-05-18 14:58:08.095779] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-11: remote operation failed: Stale file handle
[2015-05-18 14:58:26.778498] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-11: remote operation failed: Stale file handle
The message "W [MSGID: 108008] [afr-read-txn.c:237:afr_read_txn] 0-vol2-replicate-5: Unreadable subvolume -1 found with event generation 2. (Possible split-brain)" repeated 4 times between [2015-05-18 14:56:54.126517] and [2015-05-18 14:58:26.780169]
[2015-05-18 14:58:39.474909] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-11: remote operation failed: Stale file handle
[2015-05-18 14:58:39.475858] W [MSGID: 108008] [afr-read-txn.c:237:afr_read_txn] 0-vol2-replicate-5: Unreadable subvolume -1 found with event generation 2. (Possible split-brain)


from gfapi.log on nfs4,
The message "W [MSGID: 108008] [afr-read-txn.c:237:afr_read_txn] 0-vol2-replicate-5: Unreadable subvolume -1 found with event generation 12. (Possible split-brain)" repeated 5 times between [2015-05-18 14:50:58.054198] and [2015-05-18 14:52:26.797085]
[2015-05-18 14:53:13.243341] W [client-callback.c:65:client_cbk_cache_invalidation] 0-vol2-client-10: XDR decode of cache_invalidation failed.
[2015-05-18 14:53:17.390477] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-10: remote operation failed: Stale file handle
[2015-05-18 14:53:17.395473] W [MSGID: 108008] [afr-read-txn.c:237:afr_read_txn] 0-vol2-replicate-5: Unreadable subvolume -1 found with event generation 12. (Possible split-brain)
[2015-05-18 14:53:17.481876] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-11: remote operation failed: Stale file handle
[2015-05-18 14:53:51.528836] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-11: remote operation failed: Stale file handle
[2015-05-18 14:53:56.245641] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-10: remote operation failed: Stale file handle

Comment 6 Meghana 2015-05-19 06:45:23 UTC

I saw cache_invalidation related errors in gfapi.log. This is what I did on my 
4 node set up.

1. Cloned gluster master ( no fix has gone except for Disable_ACL) and installed it on all the machines.
2. Cloned NFS-Ganesha-2.3-dev-3 and applied Soumya's patch that is waiting to merged on NFS-Ganesha upstream,
https://review.gerrithub.io/#/c/228614/
and installed NFS-Ganesha on all the nodes.

My four node set up has a volume testvol, that looks like this,
Volume Name: testvol
Type: Distribute
Volume ID: b021c59b-16ad-4eaf-9fc3-d0eb563a9ab0
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: rhs1:/brick/brick1
Brick2: rhs2:/brick/brick1
Brick3: rhs3:/brick/brick1
Brick4: rhs4:/brick/brick1
Options Reconfigured:
ganesha.enable: on
features.cache-invalidation: on
nfs.disable: on
performance.readdir-ahead: on
nfs-ganesha: enable


VIP of rhs1 : 10.70.40.173
VIP of rhs2 : 10.70.40. 174

I mounted the volume "testvol" on two different clients.

Client 1 :

10.70.40.173:/testvol on /mnt type nfs (rw,vers=4,addr=10.70.40.173,clientaddr=10.70.43.78)

Client 2:
10.70.40.174:/testvol on /mnt type nfs (rw,vers=4,addr=10.70.40.174,clientaddr=10.70.42.117)


And I created two directories dir and dir1 on the mount point.

On client1, I ran iozone by executing the following command,
cd /mnt/dir 
time iozone -a -f rhs1.ioz

On client2, I ran iozone by executing the following command,
time iozone -a -f rhs3.ioz

After a few seconds, I killed ganesha on rhs1 ( VIP 10.70.40.173 ),
I saw a delay for a minute and I/Os resumed after that.

pcs status looked like this,

Online: [ rhs1 rhs2 rhs3 rhs4 ]

Full list of resources:

 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ rhs1 rhs2 rhs3 rhs4 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ rhs1 rhs2 rhs3 rhs4 ]
 rhs1-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started rhs4 
 rhs1-trigger_ip-1	(ocf::heartbeat:Dummy):	Started rhs4 
 rhs2-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started rhs2 
 rhs2-trigger_ip-1	(ocf::heartbeat:Dummy):	Started rhs2 
 rhs3-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started rhs3 
 rhs3-trigger_ip-1	(ocf::heartbeat:Dummy):	Started rhs3 
 rhs4-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started rhs4 
 rhs4-trigger_ip-1	(ocf::heartbeat:Dummy):	Started rhs4 
 rhs1-dead_ip-1	(ocf::heartbeat:Dummy):	Started rhs1 


rhs1 had failed over to rhs4.

After around 9 minutes, iozone was successfully completed on both the clients.

Client 1 :
iozone test complete.

real	9m46.894s
user	0m3.679s
sys	2m3.828s


Client 2 :
iozone test complete.

real	7m9.126s
user	0m2.482s
sys	0m56.361s


After the tests finished, I checked the pcs status again,
Online: [ rhs1 rhs2 rhs3 rhs4 ]

Full list of resources:

 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ rhs1 rhs2 rhs3 rhs4 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ rhs1 rhs2 rhs3 rhs4 ]
 rhs1-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started rhs4 
 rhs1-trigger_ip-1	(ocf::heartbeat:Dummy):	Started rhs4 
 rhs2-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started rhs2 
 rhs2-trigger_ip-1	(ocf::heartbeat:Dummy):	Started rhs2 
 rhs3-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started rhs3 
 rhs3-trigger_ip-1	(ocf::heartbeat:Dummy):	Started rhs3 
 rhs4-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started rhs4 
 rhs4-trigger_ip-1	(ocf::heartbeat:Dummy):	Started rhs4 
 rhs1-dead_ip-1	(ocf::heartbeat:Dummy):	Started rhs1 

NFS-Ganesha was running on all the other three nodes.

Comment 7 Meghana 2015-07-13 05:09:01 UTC

As I was able to run it successfully, I request you to try it with the new builds.

Comment 9 Soumya Koduri 2015-09-04 12:14:33 UTC

Saurabh,

Could you confirm if you are still hitting this issue.

Comment 11 Kaushal 2017-03-08 11:01:40 UTC

This bug is getting closed because GlusteFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.

Comment 12 Red Hat Bugzilla 2023-09-14 02:59:16 UTC

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days