Bug 1205724

Summary: [RFE][HC] Host in maintenance mode should stop glusterd and glusterfsd processes
Product: [oVirt] ovirt-engine
Reporter: Sahina Bose <sabose>
Component: RFEs
Assignee: Ramesh N <rnachimu>
Status: CLOSED CURRENTRELEASE
QA Contact: SATHEESARAN <sasundar>
Severity: medium
Docs Contact:
Priority: medium
Version: ---
CC: bugs, dfediuck, gklein, iheim, knarra, lsurette, mgoldboi, penguin.wrangler, rbalakri, rnachimu, sabose, sbonazzo, shtripat, yeylon, ykaul, ylavi
Target Milestone: ovirt-3.6.2
Keywords: FutureFeature, Improvement
Target Release: 3.6.2
Flags: rule-engine: ovirt-3.6.z+
rule-engine: planning_ack+
rule-engine: devel_ack+
rule-engine: testing_ack+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Feature: To help with upgrades and maintenance of gluster nodes, glusterd and related gluster services are stopped when a host is moved to Maintenance mode in the engine.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-03-11 07:21:29 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Gluster
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1177771, 1205727, 1277939    

Description Sahina Bose 2015-03-25 14:21:57 UTC
Description of problem:
If the gluster service is enabled on the host being put into maintenance, then the glusterd and brick (glusterfsd) processes on that host should be stopped.

This is required to stop clients from accessing the data if the host is going to be upgraded.
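
As a rough illustration of the request above, the following Python sketch lists the commands a host-side agent might run. It is only a sketch under assumptions: the function name gluster_stop_commands is hypothetical, the host is assumed to manage glusterd through systemd, and brick processes are assumed to be terminated by name with pkill. It does not reflect how the engine or VDSM actually implements the feature, and it only builds the command list without executing anything.

```python
import shlex
from typing import List


def gluster_stop_commands(gluster_enabled: bool) -> List[List[str]]:
    """Return the commands that would stop gluster on a host.

    Hypothetical helper: nothing is executed here, the commands are
    only returned so that a caller can review or log them.
    """
    if not gluster_enabled:
        # Hosts without the gluster service need no extra steps.
        return []
    return [
        # Assumption: glusterd runs as a systemd unit named "glusterd".
        ["systemctl", "stop", "glusterd"],
        # Assumption: brick processes are stopped by process name.
        ["pkill", "glusterfsd"],
    ]


if __name__ == "__main__":
    for cmd in gluster_stop_commands(gluster_enabled=True):
        print(shlex.join(cmd))
```

Returning the commands instead of running them keeps the sketch safe to try on any machine.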

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
NA

Comment 1 Itamar Heim 2015-03-29 08:55:27 UTC
should this be optional?
should it be blocked if another host in the brick-set is still healing?

Comment 2 Sahina Bose 2015-04-06 12:16:50 UTC
I think in such cases, putting the host to maintenance mode should be blocked.

Comment 3 Shubhendu Tripathi 2015-07-14 09:04:44 UTC
The BZs #1213291 and #1196433 would take care of not allowing other nodes to move to the maintenance state.

Comment 4 Red Hat Bugzilla Rules Engine 2015-10-19 10:52:37 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 5 Sahina Bose 2015-12-11 06:24:53 UTC
Retargeting as this is required for HC mode operations

Comment 6 Sandro Bonazzola 2015-12-22 13:08:36 UTC
Please don't leave bugs assigned to bugs when you take them.

Comment 7 Sahina Bose 2015-12-23 05:32:21 UTC
Moving the host checks performed before moving to maintenance to a separate BZ.

Comment 8 Sandro Bonazzola 2015-12-23 13:41:28 UTC
oVirt 3.6.2 RC1 has been released for testing, moving to ON_QA

Comment 9 Jillian Morgan 2016-01-31 16:28:34 UTC
> Itamar Heim 2015-03-29 04:55:27 EDT
>
>should this be optional?
>should it be blocked if another host in the brick-set is still healing?

> Shubhendu Tripathi 2015-07-14 05:04:44 EDT
>
>The BZs #1213291 and #1196433 would take care of not allowing other
>nodes to move to the maintenance state.

I definitely agree that this should be optional, and definitely NOT the default behaviour, until the two above-noted BZs are implemented as well! As it is now, 3.6.2, released with this change, will have a very high likelihood of breaking my oVirt cluster when moving a node to maintenance without performing very careful checks on the nodes first.

Even if the nodes are in a state that is "safe" to stop one of the glusterds, it will unnecessarily require a heal to be performed after re-activating the node.

Perhaps this isn't such a big deal in "big" clusters with dozens of nodes, but on a minimal 3-node cluster, taking one gluster node offline unnecessarily is a big risk.

There are many reasons to want to put the VDSM/hypervisor into maintenance, but NOT disrupt the gluster daemons on the same node!

As it works now (<= 3.6.1), I can safely move a node to maintenance from the vdsm/hypervisor point of view without affecting the operational state of the gluster volumes.

I am now stuck on 3.6.1 until this change is reverted or augmented to be optional when placing a node into maintenance.

I recommend that this change be reverted, and this BZ should be set to depend on the two above-noted BZs.

Comment 11 Sahina Bose 2016-02-01 08:51:15 UTC
Will rework this to make it optional; I agree with the user's concerns.

Raised bug 1303539 to track it.

Comment 12 SATHEESARAN 2016-02-25 10:29:24 UTC
Tested with RHEV 3.6.3.3 and RHGS 3.1.2 RC ( glusterfs-3.7.5.19.el7rhgs )

Gluster services are no longer stopped unconditionally; instead, an option is provided for stopping the gluster services while moving the host to maintenance.
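
The behaviour verified above, combined with the healing concern raised in comments 1 and 2, can be sketched as a small decision function. This is a hedged illustration only: MaintenanceRequest, plan_maintenance and all field names are hypothetical, and the heal check is an assumption drawn from the discussion (it was moved to a separate BZ), not a description of the engine's actual code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MaintenanceRequest:
    """Hypothetical inputs for moving one host to maintenance."""
    gluster_enabled: bool          # gluster service is enabled on the host
    stop_gluster_requested: bool   # the optional "stop gluster" choice
    peer_heal_in_progress: bool = False  # assumption: known to the caller


def plan_maintenance(req: MaintenanceRequest) -> List[str]:
    """Return the ordered steps for the request, or a single 'blocked' step."""
    wants_stop = req.gluster_enabled and req.stop_gluster_requested
    if wants_stop and req.peer_heal_in_progress:
        # Assumption from comments 1 and 2: stopping gluster while another
        # host in the brick set is still healing should be refused.
        return ["blocked: heal in progress on a peer"]
    steps = ["move host to maintenance"]
    if wants_stop:
        # Gluster is stopped only when explicitly requested.
        steps.append("stop gluster services")
    return steps


if __name__ == "__main__":
    print(plan_maintenance(MaintenanceRequest(True, False)))
    print(plan_maintenance(MaintenanceRequest(True, True)))
    print(plan_maintenance(MaintenanceRequest(True, True, peer_heal_in_progress=True)))
```

Keeping the stop step opt-in mirrors the concern from comment 9: a host can enter maintenance from the hypervisor point of view without touching the gluster volumes.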