Description of problem:
I had a test setup with nfs-ganesha running on 4 nodes with HA capabilities. I started I/O from one of the nodes and found that the I/O stopped moving ahead after some time. This is because the nfs-ganesha process stops responding while the server process itself is still running, so failover does not happen either. As a result, the I/O gets stuck.

Version-Release number of selected component (if applicable):
glusterfs-3.7.5-1.el7.x86_64
nfs-ganesha-2.3-0.rc6.el7.centos.x86_64

How reproducible:
Happens on the first instance of execution.

Steps to Reproduce:
1. Set up a 4-node glusterfs cluster with a 4-node nfs-ganesha front end.
2. Mount the volume over nfs-ganesha with vers=4 on a client.
3. Start executing the arequal tool on the mount point.

Actual results:
At step 3, the I/O is stuck and nfs-ganesha is not responding. Even a showmount on the nfs-ganesha server results in an RPC timeout:

# showmount -e localhost
rpc mount export: RPC: Timed out

strace on the nfs-ganesha process shows no calls beyond a single futex wait:

# strace -p 19773
Process 19773 attached
futex(0x7efde8b819d0, FUTEX_WAIT, 19803, NULL^CProcess 19773 detached
 <detached ...>

The ganesha.log says:

17/10/2015 01:48:58 : epoch 56215bb1 : vm1 : ganesha.nfsd-19773[main] nfs_Start_threads :THREAD :EVENT :General fridge was started successfully
17/10/2015 01:48:58 : epoch 56215bb1 : vm1 : ganesha.nfsd-19773[main] nfs_start :NFS STARTUP :EVENT :-------------------------------------------------
17/10/2015 01:48:58 : epoch 56215bb1 : vm1 : ganesha.nfsd-19773[main] nfs_start :NFS STARTUP :EVENT : NFS SERVER INITIALIZED
17/10/2015 01:48:58 : epoch 56215bb1 : vm1 : ganesha.nfsd-19773[main] nfs_start :NFS STARTUP :EVENT :-------------------------------------------------
17/10/2015 01:49:58 : epoch 56215bb1 : vm1 : ganesha.nfsd-19773[reaper] nfs_in_grace :STATE :EVENT :NFS Server Now NOT IN GRACE
17/10/2015 01:52:01 : epoch 56215bb1 : vm1 : ganesha.nfsd-19773[dbus_heartbeat] glusterfs_create_export :FSAL :EVENT :Volume vol2 exported at : '/'
17/10/2015 03:49:17 : epoch 56215bb1 : vm1 : ganesha.nfsd-19773[dbus_heartbeat] dbus_heartbeat_cb :DBUS :WARN :Health status is unhealthy. Not sending heartbeat
17/10/2015 03:50:33 : epoch 56215bb1 : vm1 : ganesha.nfsd-19773[dbus_heartbeat] dbus_heartbeat_cb :DBUS :WARN :Health status is unhealthy. Not sending heartbeat
17/10/2015 03:53:48 : epoch 56215bb1 : vm1 : ganesha.nfsd-19773[dbus_heartbeat] dbus_heartbeat_cb :DBUS :WARN :Health status is unhealthy. Not sending heartbeat
17/10/2015 03:55:03 : epoch 56215bb1 : vm1 : ganesha.nfsd-19773[dbus_heartbeat] dbus_heartbeat_cb :DBUS :WARN :Health status is unhealthy. Not sending heartbeat

In the end, failover does not happen and the I/O does not move ahead.

Expected results:
nfs-ganesha should either respond or get killed. Once the process is no longer running, the HA capabilities can take over and the I/O can move ahead.

Additional info:
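To illustrate the expected behaviour, here is a minimal sketch of an external liveness probe. It is not part of the original report and is not how nfs-ganesha or its HA scripts actually work. It treats the server as unresponsive when a probe command such as `showmount -e localhost` fails or does not finish within a deadline. The function name `is_responsive` and the 10-second default are assumptions made for this example.

```python
import subprocess


def is_responsive(cmd, timeout_s=10.0):
    """Return True if cmd exits with status 0 within timeout_s seconds.

    Hypothetical helper: a hung server makes the probe time out or fail,
    which is the signal an HA layer could use to trigger failover.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # the child is killed if it exceeds the deadline
        )
    except subprocess.TimeoutExpired:
        return False  # probe hung, as showmount did in this report
    except OSError:
        return False  # probe command missing or not executable
    return result.returncode == 0
```

A watchdog could call `is_responsive(["showmount", "-e", "localhost"])` periodically and, after a few consecutive failures, stop the hung ganesha.nfsd process so that the HA layer sees the node as down. The number of retries and the stop action are left open here, since they depend on the HA setup.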
This bug is getting closed because GlusterFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS. If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.