Bug 1205724

Summary:	[RFE][HC] Host in maintenance mode should stop glusterd and glusterfsd processes
Product:	[oVirt] ovirt-engine	Reporter:	Sahina Bose <sabose>
Component:	RFEs	Assignee:	Ramesh N <rnachimu>
Status:	CLOSED CURRENTRELEASE	QA Contact:	SATHEESARAN <sasundar>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	---	CC:	bugs, dfediuck, gklein, iheim, knarra, lsurette, mgoldboi, penguin.wrangler, rbalakri, rnachimu, sabose, sbonazzo, shtripat, yeylon, ykaul, ylavi
Target Milestone:	ovirt-3.6.2	Keywords:	FutureFeature, Improvement
Target Release:	3.6.2	Flags:	rule-engine: ovirt-3.6.z+ rule-engine: planning_ack+ rule-engine: devel_ack+ rule-engine: testing_ack+
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Enhancement
Doc Text:	Feature: To help in upgrades and maintenance of gluster nodes, glusterd and related gluster services are stopped when a host is moved to Maintenance mode in engine	Story Points:	---
Clone Of:		Environment:
Last Closed:	2016-03-11 07:21:29 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	Gluster	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1177771, 1205727, 1277939

Description Sahina Bose 2015-03-25 14:21:57 UTC

Description of problem:
If the gluster service is enabled on the host being put into maintenance, then glusterd and the brick processes (glusterfsd) on the host should be stopped.

This is required to stop clients from accessing the data, if the host is going to be upgraded.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
NA

Comment 1 Itamar Heim 2015-03-29 08:55:27 UTC

should this be optional?
should it be blocked if another host in the brick-set is still healing?

Comment 2 Sahina Bose 2015-04-06 12:16:50 UTC

I think in such cases, putting the host to maintenance mode should be blocked.

Comment 3 Shubhendu Tripathi 2015-07-14 09:04:44 UTC

BZs #1213291 and #1196433 would take care of preventing other nodes from being moved to the maintenance state.

Comment 4 Red Hat Bugzilla Rules Engine 2015-10-19 10:52:37 UTC

Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 5 Sahina Bose 2015-12-11 06:24:53 UTC

Retargeting as this is required for HC mode operations

Comment 6 Sandro Bonazzola 2015-12-22 13:08:36 UTC

Please don't leave a bug assigned to bugs when you take it.

Comment 7 Sahina Bose 2015-12-23 05:32:21 UTC

Moving the host checks that run before moving to maintenance to a separate BZ.

Comment 8 Sandro Bonazzola 2015-12-23 13:41:28 UTC

oVirt 3.6.2 RC1 has been released for testing, moving to ON_QA

Comment 9 Jillian Morgan 2016-01-31 16:28:34 UTC

> Itamar Heim 2015-03-29 04:55:27 EDT
>
>should this be optional?
>should it be blocked if another host in the brick-set is still healing?

> Shubhendu Tripathi 2015-07-14 05:04:44 EDT
>
>The BZs #1213291 and #1196433 would take care of not allowing other
>nodes to maintenance state.

I definitely agree that this should be optional, and definitely NOT the default behaviour, until the two above-noted BZs are implemented as well! As it is now, 3.6.2 released with this change will have a very high likelihood of breaking my oVirt cluster when moving a node to maintenance without performing very careful checks on the nodes first.

Even if the nodes are in a state that is "safe" to stop one of the glusterds, it will unnecessarily require a heal to be performed after re-activating the node.

Perhaps this isn't such a big deal in "big" clusters with dozens of nodes, but on a minimal 3-node cluster, taking one gluster node offline unnecessarily is a big risk.

There are many reasons to want to put the VDSM/hypervisor into maintenance, but NOT disrupt the gluster daemons on the same node!

As it works now (<= 3.6.1) I can safely move a node to maintenance, from the vdsm/hypervisor point-of-view without affecting the operational state of the gluster volumes.

I am now stuck on 3.6.1 until this change is reverted or augmented to be optional when placing a node into maintenance.

I recommend that this change be reverted, and this BZ should be set to depend on the two above-noted BZs.

Comment 11 Sahina Bose 2016-02-01 08:51:15 UTC

Will rework this to make it optional, agree with the user's concerns.

Raised bug 1303539 to track it.

Comment 12 SATHEESARAN 2016-02-25 10:29:24 UTC

Tested with RHEV 3.6.3.3 and RHGS 3.1.2 RC ( glusterfs-3.7.5.19.el7rhgs )

Gluster services are not stopped forcefully; instead, an option is provided to stop the gluster services while moving the host to maintenance.
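
For illustration only, the opt-in behaviour verified above could be sketched roughly as follows. This is not the engine or VDSM implementation: the function name `maintenance_steps` and its two flags are hypothetical, and stopping the daemons via `systemctl stop glusterd` and `pkill glusterfsd` is an assumption about how a node might be quiesced by hand.

```python
from typing import List


def maintenance_steps(gluster_enabled: bool, stop_gluster: bool) -> List[List[str]]:
    """Return the shell commands to run on a host entering maintenance.

    Hypothetical helper: gluster daemons are only touched when the host
    runs gluster AND the operator explicitly opted in to stopping them.
    """
    steps: List[List[str]] = []
    if gluster_enabled and stop_gluster:
        # Assumed commands: stop the management daemon first, then bricks.
        steps.append(["systemctl", "stop", "glusterd"])
        steps.append(["pkill", "glusterfsd"])
    return steps


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch is safe to try.
    for cmd in maintenance_steps(gluster_enabled=True, stop_gluster=True):
        print(" ".join(cmd))
```

With `stop_gluster=False` the list is empty, which matches the concern raised in comment 9: the hypervisor side can go into maintenance without disturbing the gluster volumes.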