Bug 2026833 - [Ganesha][RHEL 8.5] HA status is in FAILOVER when configuring NFS ganesha with RHEL 8.5 platform
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: common-ha
Version: rhgs-3.5
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.5.z Batch Update 7
Assignee: Kaleb KEITHLEY
QA Contact: Manisha Saini
URL:
Whiteboard:
Duplicates: 2033272
Depends On:
Blocks: 2033272
Reported: 2021-11-26 05:27 UTC by SATHEESARAN
Modified: 2022-05-31 12:37 UTC

Fixed In Version: glusterfs-6.0-62
Doc Type: Bug Fix
Doc Text:
Previously, the `crmadmin` command waited forever or for 83 mins instead of timing out after 5 s, and glusterd waited 2 mins for the setup command to complete before hitting its own timeout. This happened because `pacemaker-2.1.x` changed the semantics of the `--timeout` command-line parameter of the `crmadmin` utility. In earlier versions, the value was an integer that specified a timeout in milliseconds. In `pacemaker-2.1.x`, the value is a time specification, for example, 5 s, and is interpreted as seconds if it is a bare integer. With this update, the `crmadmin` command times out after 5 s, as it did with the previous version of pacemaker.
Clone Of:
Environment:
Last Closed: 2022-05-31 12:37:31 UTC
Embargoed:
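
The change in `--timeout` semantics described in the Doc Text can be illustrated with a small sketch. The two helper functions below are hypothetical and are not taken from pacemaker or glusterfs; they only model the two readings of the value. The argument `5000` is an assumption: it is what a 5 s timeout would look like in milliseconds, and it is consistent with the 83-minute wait mentioned above, but this report does not state the exact value that was passed.

```python
# Illustrative only: these helpers are hypothetical and are not pacemaker code.


def timeout_seconds_old(value: str) -> float:
    """Older reading per the Doc Text: a bare integer, in milliseconds."""
    return int(value) / 1000.0


def timeout_seconds_new(value: str) -> float:
    """pacemaker-2.1.x reading per the Doc Text: a time specification such
    as "5s"; a bare integer is taken as seconds.

    Only the "ms" and "s" suffixes are handled in this sketch.
    """
    text = value.strip().lower()
    if text.endswith("ms"):
        return float(text[:-2]) / 1000.0
    if text.endswith("s"):
        return float(text[:-1])
    return float(int(text))


if __name__ == "__main__":
    assumed_old_argument = "5000"  # assumption, not quoted from this report
    print(timeout_seconds_old(assumed_old_argument))       # seconds, old reading
    print(timeout_seconds_new(assumed_old_argument) / 60)  # minutes, new reading
    print(timeout_seconds_new("5s"))                       # seconds, explicit unit
```

Under these assumptions, the same argument means 5 seconds with the old reading and roughly 83 minutes with the new one, while an explicit unit such as `5s` keeps the intended 5-second timeout.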


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | gluster glusterfs issues 2997 | 0 | None | open | HA status is in FAILOVER when configuring NFS ganesha with pacemaker 2.1 | 2021-12-01 12:51:26 UTC
Red Hat Product Errata | RHBA-2022:4840 | 0 | None | None | None | 2022-05-31 12:37:46 UTC

Description SATHEESARAN 2021-11-26 05:27:58 UTC
Description of problem:
-----------------------
Three RHGS 3.5.5 nodes are installed with the RHGS 3.5.5 ISO based on RHEL 8.4. The nodes are subscribed to the baseos, appstream, and high-availability repos, and are then upgraded to RHEL 8.5.

The nfs-ganesha deployment fails at the step 'gluster nfs-ganesha enable'
and the cluster HA status is FAILOVER
<snip>
TASK [Enable nfs-ganesha] ********************************************************************************************************************************************************************
fatal: [dhcp35-137.lab.eng.blr.redhat.com]: FAILED! => {"ansible_facts": {"discovered_interpreter_python": "/usr/libexec/platform-python"}, "changed": true, "cmd": "gluster nfs-ganesha enable --mode=script", "delta": "0:10:00.111919", "end": "2021-11-26 00:15:17.630279", "failed_when_result": true, "msg": "non-zero return code", "rc": 1, "start": "2021-11-26 00:05:17.518360", "stderr": "", "stderr_lines": [], "stdout": "This will take a few minutes to complete. Please wait ..\nError : Request timed out", "stdout_lines": ["This will take a few minutes to complete. Please wait ..", "Error : Request timed out"]}
...ignoring

</snip>

Version-Release number of selected component (if applicable):
---------------------------------------------------------------
RHGS 3.5.5 ( glusterfs-6.0-59.el8rhgs )
RHEL 8.5 ( 4.18.0-348.2.1.el8_5.x86_64 )

pacemaker-cli-2.1.0-8.el8.x86_64
pacemaker-schemas-2.1.0-8.el8.noarch
pacemaker-2.1.0-8.el8.x86_64
pacemaker-cluster-libs-2.1.0-8.el8.x86_64
pacemaker-libs-2.1.0-8.el8.x86_64

corosynclib-3.1.5-1.el8.x86_64
corosync-3.1.5-1.el8.x86_64

pcs-0.10.10-4.el8.x86_64

nfs-ganesha-3.4-8.el8rhgs.x86_64
nfs-ganesha-gluster-3.4-8.el8rhgs.x86_64
nfs-ganesha-selinux-3.4-8.el8rhgs.noarch
resource-agents-4.1.1-98.el8.x86_64

How reproducible:
------------------
Always

Steps to Reproduce:
-------------------
1. Create a 3-node cluster with RHGS 3.5.5 on the RHEL 8.5 platform
2. Create a volume
3. Deploy NFS ganesha using gdeploy

Actual results:
---------------
NFS ganesha deployment fails and the HA status is FAILOVER

Expected results:
-----------------
NFS ganesha deployment should succeed with the HA status HEALTHY

Additional info:
-----------------
I have tested the same setup with RHEL 8.4 and RHGS 3.5.5, and everything works fine.
It fails with RHEL 8.5, which indicates a regression that is either platform-specific or related to the HA RPMs, so I am adding the keyword 'Regression'.

Another observation: the error appears during the execution of 'gluster nfs-ganesha enable', and from that point on, all gluster commands on the node where the ganesha deployment is attempted are stuck until they time out.

Comment 8 Kaleb KEITHLEY 2021-12-01 15:22:19 UTC
https://github.com/gluster/glusterfs/pull/2999

Comment 9 Kaleb KEITHLEY 2021-12-21 12:36:00 UTC
*** Bug 2033272 has been marked as a duplicate of this bug. ***

Comment 23 errata-xmlrpc 2022-05-31 12:37:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (glusterfs bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:4840