1809413 – host activation causes RHHI nodes to lose the quorum

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	rhhi
Sub Component:
Version:	rhhiv-1.8
Hardware:	x86_64
OS:	Linux
Priority:	urgent
Severity:	urgent
Target Milestone:	---
Target Release:	RHHI-V 1.8
Assignee:	Gobinda Das
QA Contact:	SATHEESARAN
Docs Contact:
URL:
Whiteboard:
Depends On:	1741102
Blocks:	RHHI-V-1.8-Engineering-Backlog-BZs

Reported:	2020-03-03 05:47 UTC by SATHEESARAN
Modified:	2020-08-04 14:51 UTC
CC List:	13 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	Previously, activating the host from the Administrator Portal restarted the glusterd service which led to quorum loss when the glusterd process ID changed. With this release, the glusterd service does not restart if it is already up and running during the activation of the host, so the glusterd process ID does not change and there is no quorum loss.
Clone Of:	1741102
Environment:
Last Closed:	2020-08-04 14:51:33 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHEA-2020:3314	0	None	None	None	2020-08-04 14:51:55 UTC

Description SATHEESARAN 2020-03-03 05:47:06 UTC

+++ This bug was initially created as a clone of Bug #1741102 +++

Description of problem:
When a host that provides gluster services for RHHI is activated, RHV-M initiates a restart of the gluster services. This causes the gluster volumes to lose quorum and stop working.


Version-Release number of selected component (if applicable):
rhvm 4.3.3

How reproducible:
100%

Steps to Reproduce:
1. Put a gluster node into maintenance
2. Set debug level
    vdsm-client Host setLogLevel level=DEBUG name=jsonrpc
3. Activate the node again

Actual results:
2019-08-14 09:30:00,866+0000 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling 'GlusterService.action' in bridge with {u'action': u'restart', u'serviceNames': [u'glusterd']} (__init__:329)
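
To spot this call in vdsm.log without reading every DEBUG line by hand, the arguments can be extracted from the line above. This is a minimal sketch that assumes the log format matches this single sample; `find_service_action` and `CALL_RE` are hypothetical names, not part of vdsm.

```python
import ast
import re

# Matches the bridge call logged by jsonrpc at DEBUG level, as in the line above.
# The pattern is an assumption based on that one sample line.
CALL_RE = re.compile(
    r"Calling 'GlusterService\.action' in bridge with (\{.*\}) \(__init__:\d+\)"
)


def find_service_action(line):
    """Return (action, service_names) for a GlusterService.action call, else None."""
    match = CALL_RE.search(line)
    if match is None:
        return None
    # The arguments are logged as a Python dict repr, so literal_eval can read them.
    params = ast.literal_eval(match.group(1))
    return params.get("action"), list(params.get("serviceNames", []))


sample = (
    "2019-08-14 09:30:00,866+0000 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] "
    "Calling 'GlusterService.action' in bridge with "
    "{u'action': u'restart', u'serviceNames': [u'glusterd']} (__init__:329)"
)
print(find_service_action(sample))  # ('restart', ['glusterd'])
```

A line that returns `('restart', ['glusterd'])` right after host activation is the symptom reported here.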

Expected results:
No restart happens. Keep in mind that the node can be put into maintenance mode without stopping the gluster services.
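
The expected behaviour amounts to a guard: leave glusterd alone if it is already running, and start it only otherwise. The sketch below illustrates that idea only and is not the RHV-M or vdsm implementation; `ensure_glusterd_running`, `is_active` and `start` are hypothetical names, and the two callables stand in for whatever mechanism actually queries and starts the service.

```python
def ensure_glusterd_running(is_active, start):
    """Start glusterd only when it is not already active.

    is_active and start are caller-supplied callables (hypothetical hooks),
    so the sketch makes no claim about how the service is really queried.
    Returns True if a start was issued, False if the service was left alone.
    """
    if is_active():
        # Already up: leave the process alone so its PID does not change.
        return False
    start()
    return True


# Usage with stand-in callables that record what happened.
calls = []
print(ensure_glusterd_running(lambda: True, lambda: calls.append("start")))   # False
print(ensure_glusterd_running(lambda: False, lambda: calls.append("start")))  # True
print(calls)  # ['start']
```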

Comment 1 SATHEESARAN 2020-05-07 02:11:39 UTC

Verified with RHV Version 4.4.0-0.33.master.el8ev
0. Note the process ID (PID) of the glusterd process
1. Move the host to maintenance without stopping gluster services
2. Activate the host again
3. Note the PID of the glusterd process again

The PID of glusterd remains the same, which means that the glusterd service is not restarted
when a host that was put into maintenance without stopping the glusterd service is activated

Before the host was put into maintenance (without stopping gluster services)
[root@ ~]# pidof glusterd
650894

After the host is activated
[root@ ~]# pidof glusterd
650894
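
The same before/after comparison can be scripted. This is a minimal sketch that assumes `pidof` is available on the host; the helper names `parse_pids`, `service_restarted` and `read_glusterd_pids` are hypothetical and not part of any RHV or gluster tooling.

```python
import subprocess


def parse_pids(pidof_output):
    """Turn the whitespace-separated output of pidof into a set of ints."""
    return {int(token) for token in pidof_output.split()}


def service_restarted(before, after):
    """True when the PID sets differ, i.e. the process was replaced or is gone."""
    return parse_pids(before) != parse_pids(after)


def read_glusterd_pids():
    """Run pidof glusterd on the local host; empty string if no process is found."""
    result = subprocess.run(
        ["pidof", "glusterd"], capture_output=True, text=True, check=False
    )
    return result.stdout


# Values from the verification above.
print(service_restarted("650894\n", "650894\n"))  # False
```

On a real host, call `read_glusterd_pids()` once before moving the host to maintenance and once after activation, then pass both results to `service_restarted`.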

Comment 6 errata-xmlrpc 2020-08-04 14:51:33 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RHHI for Virtualization 1.8 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:3314
