Bug 1751717 - Host activation causes RHHI nodes to lose the quorum
Summary: Host activation causes RHHI nodes to lose the quorum
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: rhhi
Version: rhgs-3.4
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHHI-V 1.6.z Async Update
Assignee: Gobinda Das
QA Contact: SATHEESARAN
URL:
Whiteboard:
Duplicates: 1746789
Depends On: 1751142
Blocks:
 
Reported: 2019-09-12 13:01 UTC by SATHEESARAN
Modified: 2019-12-23 17:03 UTC
CC List: 5 users

Fixed In Version: ovirt-engine-4.3.6.6
Doc Type: Bug Fix
Doc Text:
Previously, the gluster service was started during host activation even when it had not previously been stopped. This restart caused volumes to lose quorum and stop working. The state of the gluster service is now checked during host activation, and the service is started only if it was previously stopped, which avoids this issue.
Clone Of:
Environment:
Last Closed: 2019-10-03 12:24:16 UTC
Embargoed:




Links
Red Hat Product Errata RHBA-2019:2963 (last updated 2019-10-03 12:24:21 UTC)

Description SATHEESARAN 2019-09-12 13:01:20 UTC
Description of problem:
-----------------------
When a host providing gluster services for RHHI is activated, RHV-M initiates a restart of the gluster services. This restart causes gluster volumes to lose quorum and stop working.


Version-Release number of selected component (if applicable):
-------------------------------------------------------------
rhvm 4.3.3

How reproducible:
-----------------
100%

Steps to Reproduce:
--------------------
1. Put a gluster node into maintenance mode.
2. Set the debug log level:
    vdsm-client Host setLogLevel level=DEBUG name=jsonrpc
3. Activate the node again and watch the vdsm log (see below).
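
To catch the restart request as it happens, the vdsm log can be tailed on the node during activation (assuming the default log path /var/log/vdsm/vdsm.log):

    tail -f /var/log/vdsm/vdsm.log | grep --line-buffered 'GlusterService.action'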

Actual results:
----------------
2019-08-14 09:30:00,866+0000 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling 'GlusterService.action' in bridge with {u'action': u'restart', u'serviceNames': [u'glusterd']} (__init__:329)

Expected results:
-------------------
No restart happens. Keep in mind that the node can be put into maintenance mode without stopping the gluster services.
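
For reference, the fix described in the Doc Text above replaces the unconditional restart with a conditional start. A minimal shell sketch of the equivalent check, assuming glusterd is managed as a systemd unit (an illustration of the intended behavior, not the actual engine code):

    # Start glusterd only if it is not already running; never restart a
    # healthy daemon, so brick PIDs are preserved and quorum is kept.
    if ! systemctl is-active --quiet glusterd; then
        systemctl start glusterd
    fi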

Comment 2 SATHEESARAN 2019-09-25 14:38:16 UTC
The dependent bug is already ON_QA

Comment 3 SATHEESARAN 2019-09-25 16:49:10 UTC
Verified with RHV 4.3.6.6 using the following steps:

1. Created an HC cluster.
2. Moved the HC node into MAINTENANCE without stopping the gluster services.
3. Noted the PIDs of the glusterd process and the brick processes (see the commands below).
4. Activated the host again.

Observed that the gluster process PIDs remained the same after the host was activated.
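
A quick way to capture the PIDs in step 3 and compare them after step 4, run on the node itself (both commands are standard):

    pgrep -x glusterd        # PID of the glusterd daemon
    gluster volume status    # brick PIDs are listed in the Pid column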

Comment 4 Gobinda Das 2019-09-27 06:51:08 UTC
*** Bug 1746789 has been marked as a duplicate of this bug. ***

Comment 6 errata-xmlrpc 2019-10-03 12:24:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2963

