Bug 1823706
| Summary: | [Ganesha] HA cluster status shows "FAILOVER" even when all nodes are up and running | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Manisha Saini <msaini> |
| Component: | common-ha | Assignee: | Kaleb KEITHLEY <kkeithle> |
| Status: | CLOSED ERRATA | QA Contact: | Manisha Saini <msaini> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rhgs-3.5 | CC: | dang, grajoria, jthottan, kkeithle, mbenjamin, pasik, pprakash, puebele, rhs-bugs, rkothiya, sheggodu, skoduri, storage-qa-internal |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | RHGS 3.5.z Batch Update 2 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-6.0-34 | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-06-16 06:19:39 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Fixed in https://review.gluster.org/24333, commit 0abdd69636c42ec410a0615763f5c2ca4dca8f75 (Change-Id: If2aa1e7b53c766c625d7b4cc222a83ea2c0bd72d).
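For context, a minimal sketch of the kind of check involved (an illustration, not the actual ganesha-ha.sh code): it assumes the HEALTHY/FAILOVER verdict comes from parsing "pcs status" and comparing each <node>-cluster_ip-1 VIP resource with the node it is started on, so a change in the pcs/pacemaker output format can flip the verdict even when nothing has actually failed over.

#!/bin/bash
# Illustration only -- not the actual ganesha-ha.sh logic. Field positions below
# assume the pcs status output format shown in this report, e.g.:
#   * <node>-cluster_ip-1 (ocf::heartbeat:IPaddr): Started <node>
healthy=yes
while read -r resource _agent _state started_on; do
    home_node=${resource%-cluster_ip-1}        # home node is encoded in the resource name
    if [ "${home_node}" != "${started_on}" ]; then
        healthy=no                             # this VIP is not running on its home node
    fi
done < <(pcs status | grep -- '-cluster_ip-1' | sed -e 's/^[*[:space:]]*//')
if [ "${healthy}" = "yes" ]; then
    echo "Cluster HA Status: HEALTHY"
else
    echo "Cluster HA Status: FAILOVER"
fi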
HA status still showing as FAILOVER
# rpm -qa | grep ganesha
nfs-ganesha-debugsource-2.7.3-15.el8rhgs.x86_64
nfs-ganesha-gluster-debuginfo-2.7.3-15.el8rhgs.x86_64
glusterfs-ganesha-6.0-33.el8rhgs.x86_64
nfs-ganesha-debuginfo-2.7.3-15.el8rhgs.x86_64
nfs-ganesha-2.7.3-15.el8rhgs.x86_64
nfs-ganesha-selinux-2.7.3-15.el8rhgs.noarch
nfs-ganesha-gluster-2.7.3-15.el8rhgs.x86_64
# /usr/libexec/ganesha/ganesha-ha.sh --status /var/run/gluster/shared_storage/nfs-ganesha
* Online: [ dhcp35-21.lab.eng.blr.redhat.com dhcp35-63.lab.eng.blr.redhat.com dhcp35-76.lab.eng.blr.redhat.com dhcp35-134.lab.eng.blr.redhat.com ]
Cluster HA Status: FAILOVER
# pcs status
Cluster name: ganesha-ha-360
Cluster Summary:
* Stack: corosync
* Current DC: dhcp35-76.lab.eng.blr.redhat.com (version 2.0.3-5.el8-4b1f869f0f) - partition with quorum
* Last updated: Mon May 4 04:49:15 2020
* Last change: Mon May 4 04:47:36 2020 by root via cibadmin on dhcp35-76.lab.eng.blr.redhat.com
* 4 nodes configured
* 24 resource instances configured
Node List:
* Online: [ dhcp35-21.lab.eng.blr.redhat.com dhcp35-63.lab.eng.blr.redhat.com dhcp35-76.lab.eng.blr.redhat.com dhcp35-134.lab.eng.blr.redhat.com ]
Full List of Resources:
* Clone Set: nfs_setup-clone [nfs_setup]:
* Started: [ dhcp35-21.lab.eng.blr.redhat.com dhcp35-63.lab.eng.blr.redhat.com dhcp35-76.lab.eng.blr.redhat.com dhcp35-134.lab.eng.blr.redhat.com ]
* Clone Set: nfs-mon-clone [nfs-mon]:
* Started: [ dhcp35-21.lab.eng.blr.redhat.com dhcp35-63.lab.eng.blr.redhat.com dhcp35-76.lab.eng.blr.redhat.com dhcp35-134.lab.eng.blr.redhat.com ]
* Clone Set: nfs-grace-clone [nfs-grace]:
* Started: [ dhcp35-21.lab.eng.blr.redhat.com dhcp35-63.lab.eng.blr.redhat.com dhcp35-76.lab.eng.blr.redhat.com dhcp35-134.lab.eng.blr.redhat.com ]
* Resource Group: dhcp35-76.lab.eng.blr.redhat.com-group:
* dhcp35-76.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp35-76.lab.eng.blr.redhat.com
* dhcp35-76.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp35-76.lab.eng.blr.redhat.com
* dhcp35-76.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp35-76.lab.eng.blr.redhat.com
* Resource Group: dhcp35-21.lab.eng.blr.redhat.com-group:
* dhcp35-21.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp35-21.lab.eng.blr.redhat.com
* dhcp35-21.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp35-21.lab.eng.blr.redhat.com
* dhcp35-21.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp35-21.lab.eng.blr.redhat.com
* Resource Group: dhcp35-63.lab.eng.blr.redhat.com-group:
* dhcp35-63.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp35-63.lab.eng.blr.redhat.com
* dhcp35-63.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp35-63.lab.eng.blr.redhat.com
* dhcp35-63.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp35-63.lab.eng.blr.redhat.com
* Resource Group: dhcp35-134.lab.eng.blr.redhat.com-group:
* dhcp35-134.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp35-134.lab.eng.blr.redhat.com
* dhcp35-134.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp35-134.lab.eng.blr.redhat.com
* dhcp35-134.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp35-134.lab.eng.blr.redhat.com
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
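Note: the glusterfs build shown in this run (glusterfs-ganesha-6.0-33.el8rhgs) is older than the Fixed In Version recorded above (glusterfs-6.0-34), so it may not yet contain the fix from the review linked above. A quick package check before re-testing, for example:

# rpm -q glusterfs-ganesha nfs-ganesha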
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2572
Description of problem:
======================
HA cluster status shows "FAILOVER" even when all nodes are up and running in "pcs status" and none of the VIPs are in a failover state.

------
# /usr/libexec/ganesha/ganesha-ha.sh --status /var/run/gluster/shared_storage/nfs-ganesha
Cluster HA Status: FAILOVER
-------
# pcs status
Cluster name: ganesha-ha-360
Cluster Summary:
* Stack: corosync
* Current DC: dhcp35-76.lab.eng.blr.redhat.com (version 2.0.3-5.el8-4b1f869f0f) - partition with quorum
* Last updated: Tue Apr 14 06:11:09 2020
* Last change: Mon Apr 13 11:59:22 2020 by root via cibadmin on dhcp35-76.lab.eng.blr.redhat.com
* 4 nodes configured
* 24 resource instances configured
Node List:
* Online: [ dhcp35-21.lab.eng.blr.redhat.com dhcp35-63.lab.eng.blr.redhat.com dhcp35-76.lab.eng.blr.redhat.com dhcp35-134.lab.eng.blr.redhat.com ]
Full List of Resources:
* Clone Set: nfs_setup-clone [nfs_setup]:
* Started: [ dhcp35-21.lab.eng.blr.redhat.com dhcp35-63.lab.eng.blr.redhat.com dhcp35-76.lab.eng.blr.redhat.com dhcp35-134.lab.eng.blr.redhat.com ]
* Clone Set: nfs-mon-clone [nfs-mon]:
* Started: [ dhcp35-21.lab.eng.blr.redhat.com dhcp35-63.lab.eng.blr.redhat.com dhcp35-76.lab.eng.blr.redhat.com dhcp35-134.lab.eng.blr.redhat.com ]
* Clone Set: nfs-grace-clone [nfs-grace]:
* Started: [ dhcp35-21.lab.eng.blr.redhat.com dhcp35-63.lab.eng.blr.redhat.com dhcp35-76.lab.eng.blr.redhat.com dhcp35-134.lab.eng.blr.redhat.com ]
* Resource Group: dhcp35-76.lab.eng.blr.redhat.com-group:
* dhcp35-76.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp35-76.lab.eng.blr.redhat.com
* dhcp35-76.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp35-76.lab.eng.blr.redhat.com
* dhcp35-76.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp35-76.lab.eng.blr.redhat.com
* Resource Group: dhcp35-21.lab.eng.blr.redhat.com-group:
* dhcp35-21.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp35-21.lab.eng.blr.redhat.com
* dhcp35-21.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp35-21.lab.eng.blr.redhat.com
* dhcp35-21.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp35-21.lab.eng.blr.redhat.com
* Resource Group: dhcp35-63.lab.eng.blr.redhat.com-group:
* dhcp35-63.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp35-63.lab.eng.blr.redhat.com
* dhcp35-63.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp35-63.lab.eng.blr.redhat.com
* dhcp35-63.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp35-63.lab.eng.blr.redhat.com
* Resource Group: dhcp35-134.lab.eng.blr.redhat.com-group:
* dhcp35-134.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp35-134.lab.eng.blr.redhat.com
* dhcp35-134.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp35-134.lab.eng.blr.redhat.com
* dhcp35-134.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp35-134.lab.eng.blr.redhat.com
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
------------

Version-Release number of selected component (if applicable):
=============================================================
# rpm -qa | grep ganesha
nfs-ganesha-gluster-2.7.3-10.el8rhgs.x86_64
nfs-ganesha-debuginfo-2.7.3-10.el8rhgs.x86_64
nfs-ganesha-2.7.3-10.el8rhgs.x86_64
nfs-ganesha-selinux-2.7.3-10.el8rhgs.noarch
nfs-ganesha-debugsource-2.7.3-10.el8rhgs.x86_64
nfs-ganesha-gluster-debuginfo-2.7.3-10.el8rhgs.x86_64
glusterfs-ganesha-6.0-32.el8rhgs.x86_64

How reproducible:
================
2/2

Steps to Reproduce:
==================
1. Set up a 4-node ganesha cluster via gdeploy.

Actual results:
==============
"gluster nfs-ganesha enable" completed successfully. All nodes came up and are running; the pacemaker, corosync, pcsd, and nfs-ganesha services are also running on all nodes. But the HA status shows the cluster in the "FAILOVER" state at the end of the gdeploy deployment.

Expected results:
================
The cluster should be reported as "HEALTHY" when all nodes are up and running.

Additional info:
===============
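A minimal sketch of the verification flow implied by the steps and results above (illustrative; the commands and the shared-storage path are the ones already quoted in this report):

# gluster nfs-ganesha enable
# systemctl status corosync pacemaker pcsd nfs-ganesha
# pcs status
# /usr/libexec/ganesha/ganesha-ha.sh --status /var/run/gluster/shared_storage/nfs-ganesha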