Bug 1741102

Summary: host activation causes RHHI nodes to lose the quorum
Product: Red Hat Enterprise Virtualization Manager
Reporter: Roman Hodain <rhodain>
Component: ovirt-engine
Assignee: Gobinda Das <godas>
Status: CLOSED ERRATA
QA Contact: SATHEESARAN <sasundar>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 4.3.4
CC: aoconnor, daniel.milewski, lsantann, mkalinin, obockows, pelauter, rcyriac, rhinduja, sabose
Target Milestone: ovirt-4.4.0
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: rhv-4.4.0-29
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Clones: 1751142 1809413
Environment:
Last Closed: 2020-08-04 13:20:00 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Gluster
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1547768, 1751142, 1809413

Description Roman Hodain 2019-08-14 09:32:36 UTC

Description of problem:
When a host providing gluster services for RHHI is activated, RHV-M initiates a restart of the gluster services. This causes the gluster volumes to lose quorum and stop working.


Version-Release number of selected component (if applicable):
rhvm 4.3.3

How reproducible:
100%

Steps to Reproduce:
1. Put a gluster node into maintenance
2. Set debug level
    vdsm-client Host setLogLevel level=DEBUG name=jsonrpc
3. Activate the node again

Actual results:
2019-08-14 09:30:00,866+0000 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling 'GlusterService.action' in bridge with {u'action': u'restart', u'serviceNames': [u'glusterd']} (__init__:329)

Expected results:
No restart happens. Keep in mind that the node can be put into maintenance mode without stopping the gluster services.
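
To check a vdsm log for the restart call shown under "Actual results", something like the following could be used. This is only a sketch: the function name find_gluster_restarts is made up for illustration, and the pattern is derived from the single log line quoted above, so it may need adjusting for other vdsm versions or log formats.

```python
import re

# Pattern derived from the one DEBUG line quoted in this report (assumption:
# other vdsm versions log the call in the same shape). The optional "u"
# prefixes allow for both Python 2 and Python 3 style dict reprs.
RESTART_CALL = re.compile(
    r"Calling 'GlusterService\.action' in bridge with .*"
    r"u?'action': u?'restart'"
)


def find_gluster_restarts(lines):
    """Return the log lines that record a GlusterService.action restart call."""
    return [line.rstrip("\n") for line in lines if RESTART_CALL.search(line)]
```

It accepts any iterable of lines, for example an open file object for the vdsm log on the host, and an empty result after activating the host would match the expected behaviour. The jsonrpc DEBUG level from step 2 has to be set first, otherwise the call is not logged.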

Comment 4 Daniel Gur 2019-08-28 13:11:32 UTC

sync2jira

Comment 5 Daniel Gur 2019-08-28 13:15:44 UTC

sync2jira

Comment 9 Sahina Bose 2019-09-18 07:35:14 UTC

*** Bug 1751299 has been marked as a duplicate of this bug. ***

Comment 10 Gobinda Das 2019-11-21 09:29:10 UTC

This is already fixed in 4.3.6, do we still need this for 4.4.0?

Comment 12 SATHEESARAN 2020-03-03 05:34:13 UTC

(In reply to Gobinda Das from comment #10)
> This is already fixed in 4.3.6, do we still need this for 4.4.0?

I think the qualification from the QE side, at least, should happen in RHV 4.4.0.

Comment 14 SATHEESARAN 2020-05-07 02:12:18 UTC

Verified with RHV version 4.4.0-0.33.master.el8ev:
0. Note the process ID (PID) of the glusterd process
1. Move the host to maintenance without stopping gluster services
2. Activate the host again
3. Note the PID of the glusterd process again

The PID of glusterd remains the same, which means that the glusterd service is not restarted
when a host that was put into maintenance without stopping gluster services is activated.

Before the host was put into maintenance (without stopping gluster services):
[root@ ~]# pidof glusterd
650894

After the host was activated:
[root@ ~]# pidof glusterd
650894
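
The PID comparison above could be scripted roughly as follows. This is a sketch, not the procedure QE actually ran: the helper names parse_pidof, glusterd_pids and was_restarted are hypothetical, and it assumes pidof is available on the host and Python 3.7 or later.

```python
import subprocess


def parse_pidof(output):
    """Turn the whitespace-separated output of pidof into a set of integer PIDs."""
    return {int(token) for token in output.split()}


def glusterd_pids():
    """Run `pidof glusterd` and return its PIDs (empty set if glusterd is not running)."""
    result = subprocess.run(
        ["pidof", "glusterd"], capture_output=True, text=True
    )
    return parse_pidof(result.stdout)


def was_restarted(pids_before, pids_after):
    """Treat any change in the PID set as evidence that glusterd was restarted."""
    return pids_before != pids_after
```

Calling glusterd_pids() once before moving the host to maintenance and once after activation, then passing both results to was_restarted(), mirrors steps 0 to 3; a result of False corresponds to the unchanged PID shown above.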

Comment 17 errata-xmlrpc 2020-08-04 13:20:00 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: RHV Manager (ovirt-engine) 4.4 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:3247