This bug addresses a subset of what was requested above. Verifying that the VIPs are assigned to their respective nodes and are in the Started state confirms that the services (nfs-ganesha, pacemaker/corosync, etc.) are running and the node is healthy. As part of this BZ, this validation will be added to the '--status' option so that gdeploy can use it for cluster health checks.
Upstream mainline patch http://review.gluster.org/15882 has been posted for review.
Even when all the nodes are healthy, ganesha-ha.sh --status shows the Cluster HA Status as Bad: http://pastebin.test.redhat.com/433564 Also, are we checking that the pcs status of all three processes (nfs-block, cluster-ip, nfs-unblock) on each node is Started?
Please try http://review.gluster.org/#/c/15882/5/extras/ganesha/scripts/ganesha-ha.sh
If nodes are in the failover state, the status output shows the HA status as BAD instead of FAILOVER, and the failover node and VIP are not printed in the output.

[root@dhcp46-42 ~]# /usr/libexec/ganesha/ganesha-ha.sh --status
Online: [ dhcp46-101.lab.eng.blr.redhat.com dhcp46-42.lab.eng.blr.redhat.com dhcp47-155.lab.eng.blr.redhat.com ]
dhcp46-42.lab.eng.blr.redhat.com-cluster_ip-1 dhcp46-42.lab.eng.blr.redhat.com
dhcp46-101.lab.eng.blr.redhat.com-cluster_ip-1 dhcp46-101.lab.eng.blr.redhat.com
dhcp47-155.lab.eng.blr.redhat.com-cluster_ip-1 dhcp47-155.lab.eng.blr.redhat.com
Cluster HA Status: BAD

Updated the review with the same comments.
upstream mainline : http://review.gluster.org/15882
release-3.9 : http://review.gluster.org/15991
release-3.8 : http://review.gluster.org/15992
downstream : https://code.engineering.redhat.com/gerrit/#/c/91878/
Verified the fix in build:
glusterfs-ganesha-3.8.4-7.el7rhgs.x86_64
nfs-ganesha-2.4.1-2.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.1-2.el7rhgs.x86_64

ganesha-ha.sh --status output:
-------------------------------
[root@dhcp46-111 ~]# /usr/libexec/ganesha/ganesha-ha.sh --status /run/gluster/shared_storage/nfs-ganesha/
Online: [ dhcp46-111.lab.eng.blr.redhat.com dhcp46-115.lab.eng.blr.redhat.com dhcp46-124.lab.eng.blr.redhat.com ]
dhcp46-111.lab.eng.blr.redhat.com-cluster_ip-1 dhcp46-111.lab.eng.blr.redhat.com
dhcp46-115.lab.eng.blr.redhat.com-cluster_ip-1 dhcp46-115.lab.eng.blr.redhat.com
dhcp46-139.lab.eng.blr.redhat.com-cluster_ip-1 dhcp46-115.lab.eng.blr.redhat.com
dhcp46-124.lab.eng.blr.redhat.com-cluster_ip-1 dhcp46-124.lab.eng.blr.redhat.com
Cluster HA Status: FAILOVER

[root@dhcp46-111 ~]# /usr/libexec/ganesha/ganesha-ha.sh --status /run/gluster/shared_storage/nfs-ganesha/
Online: [ dhcp46-111.lab.eng.blr.redhat.com dhcp46-115.lab.eng.blr.redhat.com ]
dhcp46-111.lab.eng.blr.redhat.com-cluster_ip-1
dhcp46-115.lab.eng.blr.redhat.com-cluster_ip-1
dhcp46-139.lab.eng.blr.redhat.com-cluster_ip-1
dhcp46-124.lab.eng.blr.redhat.com-cluster_ip-1
Cluster HA Status: BAD

[root@dhcp46-115 ~]# /usr/libexec/ganesha/ganesha-ha.sh --status /run/gluster/shared_storage/nfs-ganesha/
Online: [ dhcp46-111.lab.eng.blr.redhat.com dhcp46-115.lab.eng.blr.redhat.com dhcp46-124.lab.eng.blr.redhat.com dhcp46-139.lab.eng.blr.redhat.com ]
dhcp46-111.lab.eng.blr.redhat.com-cluster_ip-1 dhcp46-111.lab.eng.blr.redhat.com
dhcp46-115.lab.eng.blr.redhat.com-cluster_ip-1 dhcp46-115.lab.eng.blr.redhat.com
dhcp46-139.lab.eng.blr.redhat.com-cluster_ip-1 dhcp46-139.lab.eng.blr.redhat.com
dhcp46-124.lab.eng.blr.redhat.com-cluster_ip-1 dhcp46-124.lab.eng.blr.redhat.com
Cluster HA Status: HEALTHY
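For anyone consuming this output from a tool such as gdeploy, the three results above appear consistent with a simple rule: HEALTHY when every VIP is hosted on its own node, FAILOVER when every VIP is hosted somewhere but at least one is on a different node, and BAD when any VIP has no hosting node. The following is only a rough Python sketch of that rule, inferred from the output pasted in this comment. The helper name classify_ha_status is hypothetical and is not part of ganesha-ha.sh, and the actual logic in the script may differ.

```python
VIP_SUFFIX = "-cluster_ip-1"  # resource-name suffix as seen in the output above


def classify_ha_status(status_text):
    """Classify '--status'-style output as HEALTHY, FAILOVER, or BAD.

    Assumes each VIP line looks like '<owner>-cluster_ip-1 [<hosting node>]',
    which matches the pasted output but is not a documented format.
    """
    placements = {}
    for line in status_text.splitlines():
        fields = line.split()
        # Skip blank lines, the 'Online:' line, and the final status line.
        if not fields or not fields[0].endswith(VIP_SUFFIX):
            continue
        owner = fields[0][: -len(VIP_SUFFIX)]
        # A VIP line without a second field means the VIP is not hosted anywhere.
        host = fields[1] if len(fields) > 1 else None
        placements[owner] = host

    if not placements or any(host is None for host in placements.values()):
        return "BAD"
    if all(owner == host for owner, host in placements.items()):
        return "HEALTHY"
    return "FAILOVER"


# First output from this comment: the VIP of dhcp46-139 is hosted on dhcp46-115.
sample = """\
Online: [ dhcp46-111.lab.eng.blr.redhat.com dhcp46-115.lab.eng.blr.redhat.com dhcp46-124.lab.eng.blr.redhat.com ]
dhcp46-111.lab.eng.blr.redhat.com-cluster_ip-1 dhcp46-111.lab.eng.blr.redhat.com
dhcp46-115.lab.eng.blr.redhat.com-cluster_ip-1 dhcp46-115.lab.eng.blr.redhat.com
dhcp46-139.lab.eng.blr.redhat.com-cluster_ip-1 dhcp46-115.lab.eng.blr.redhat.com
dhcp46-124.lab.eng.blr.redhat.com-cluster_ip-1 dhcp46-124.lab.eng.blr.redhat.com
"""
print(classify_ha_status(sample))  # FAILOVER
```

In practice a caller would more likely read the final "Cluster HA Status:" line directly; the sketch only shows how the VIP-to-node lines relate to that result.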
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2017-0486.html