Bug 1277642

Summary: The host is in local maintenance for ovirt-ha-agent but it's still up for the engine
Product: [oVirt] ovirt-engine
Reporter: Simone Tiraboschi <stirabos>
Component: BLL.HostedEngine
Assignee: Sandro Bonazzola <sbonazzo>
Status: CLOSED DUPLICATE
QA Contact: Ilanit Stein <istein>
Severity: unspecified
Priority: unspecified
Version: 3.5.5
CC: bugs, didi, lveyde, rgolan, rmartins, sbonazzo, stirabos
Target Milestone: ---
Flags: rule-engine: planning_ack?, rule-engine: devel_ack?, rule-engine: testing_ack?
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Story Points: ---
Last Closed: 2015-11-05 08:28:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---
Cloudforms Team: ---

Description Simone Tiraboschi 2015-11-03 17:43:42 UTC
Description of problem:
Our wiki ( http://www.ovirt.org/Features/Self_Hosted_Engine_Maintenance_Flows ) states that:

'The two types of HA maintenance will be configured in different ways.
Local maintenance, which affects only the host on which it is enabled, will be tied into the existing VDS maintenance operation.'

But now, after explicitly putting the host into local maintenance, it remains up for the engine.
In order to really put the host into maintenance it's necessary to do it from the engine.

Please see the attached screenshot:
1. Hosted Engine HA: Local Maintenance Enabled
2. Status: UP

On the host:
[root@c71het20151029 ~]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : c71het20151028.localdomain
Host ID                            : 1
Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
Score                              : 2400
Local maintenance                  : False
Host timestamp                     : 9501
Extra metadata (valid at timestamp):
	metadata_parse_version=1
	metadata_feature_version=1
	timestamp=9501 (Tue Nov  3 18:22:14 2015)
	host-id=1
	score=2400
	maintenance=False
	state=EngineUp


--== Host 2 status ==--

Status up-to-date                  : True
Hostname                           : c71het20151029.localdomain
Host ID                            : 2
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 0
Local maintenance                  : True
Host timestamp                     : 9499
Extra metadata (valid at timestamp):
	metadata_parse_version=1
	metadata_feature_version=1
	timestamp=9499 (Tue Nov  3 18:22:11 2015)
	host-id=2
	score=0
	maintenance=True
	state=LocalMaintenance

[root@c71het20151029 ~]# vdsClient -s 0 getVdsStats | grep -A 4 haStats
	haStats = {'active': True,
	           'configured': True,
	           'globalMaintenance': False,
	           'localMaintenance': True,
	           'score': 0}
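
The mismatch shown above can be checked mechanically. The following is only a minimal sketch in Python, not part of VDSM or ovirt-hosted-engine-ha: the function names are hypothetical, the parsing assumes the `hosted-engine --vm-status` text layout shown in this report, and the engine-side status has to be supplied by hand because the report only shows it in a screenshot.

```python
import re

# Trimmed excerpt of the `hosted-engine --vm-status` output from this report.
SAMPLE = """\
--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : c71het20151028.localdomain
Host ID                            : 1
Score                              : 2400
Local maintenance                  : False
Extra metadata (valid at timestamp):
    state=EngineUp

--== Host 2 status ==--

Status up-to-date                  : True
Hostname                           : c71het20151029.localdomain
Host ID                            : 2
Score                              : 0
Local maintenance                  : True
Extra metadata (valid at timestamp):
    state=LocalMaintenance
"""


def parse_vm_status(text):
    """Split the status text into one dict of top-level fields per host ID."""
    hosts = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"--== Host (\d+) status ==--", line.strip())
        if header:
            current = hosts.setdefault(int(header.group(1)), {})
            continue
        # Top-level fields look like "Key   : value"; indented lines are
        # the extra metadata block and are skipped here.
        if current is not None and " : " in line and not line.startswith((" ", "\t")):
            key, value = line.split(" : ", 1)
            current[key.strip()] = value.strip()
    return hosts


def hosts_in_local_maintenance(hosts):
    """Return the IDs of hosts that the HA agent reports as in local maintenance."""
    return sorted(h for h, f in hosts.items() if f.get("Local maintenance") == "True")


def is_inconsistent(local_maintenance, engine_status):
    """True when the HA agent says local maintenance but the engine says Up.

    engine_status is whatever the engine UI shows (assumption: the string "UP",
    as in the attached screenshot).
    """
    return local_maintenance and engine_status.strip().lower() == "up"


hosts = parse_vm_status(SAMPLE)
print(hosts_in_local_maintenance(hosts))
print(is_inconsistent(hosts[2]["Local maintenance"] == "True", "UP"))
```

For the output in this report, host 2 is the only one in local maintenance, and combining that with the engine status from the screenshot flags the inconsistency described here.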


setHaMaintenanceMode verb on VDSM always fails.


[root@c71het20151029 ~]# vdsClient -s 0 setHaMaintenanceMode type=local enabled=false
Failed to set Hosted Engine HA policy
[root@c71het20151029 ~]# vdsClient -s 0 setHaMaintenanceMode type=local enabled=true
Failed to set Hosted Engine HA policy
[root@c71het20151029 ~]# vdsClient -s 0 setHaMaintenanceMode type=none enabled=true
Failed to set Hosted Engine HA policy
[root@c71het20151029 ~]# vdsClient -s 0 setHaMaintenanceMode type=global enabled=true
Failed to set Hosted Engine HA policy
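
The four attempts above can be repeated in one go. This is a rough sketch, assuming a Python 3 interpreter and `vdsClient` are available on the host; the helper names are hypothetical, and the only thing it checks for is the failure message quoted in this report.

```python
import subprocess

# Message printed by every failing call in this report.
FAILURE_MARKER = "Failed to set Hosted Engine HA policy"

# The (type, enabled) combinations tried above, in the same order.
ATTEMPTS = [
    ("local", "false"),
    ("local", "true"),
    ("none", "true"),
    ("global", "true"),
]


def build_command(mode, enabled):
    """Build the vdsClient command line exactly as typed in the report."""
    return [
        "vdsClient", "-s", "0", "setHaMaintenanceMode",
        "type={}".format(mode), "enabled={}".format(enabled),
    ]


def call_failed(output):
    """True if the command output contains the failure message."""
    return FAILURE_MARKER in output


def run_attempts():
    """Run every combination and map (type, enabled) to whether it failed."""
    results = {}
    for mode, enabled in ATTEMPTS:
        proc = subprocess.run(
            build_command(mode, enabled),
            capture_output=True,
            text=True,
        )
        results[(mode, enabled)] = call_failed(proc.stdout + proc.stderr)
    return results
```

Calling `run_attempts()` on the affected host would be expected to report a failure for every combination, matching the transcript above.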



Version-Release number of selected component (if applicable):


How reproducible:
Fully reproducible

Steps to Reproduce:
1. Put a host into local maintenance and check its status on the engine

Actual results:
ovirt-ha-agent says that the host is in local maintenance and the engine says that it's up.

Expected results:
Putting a host into local maintenance also puts it into maintenance mode for the engine.

Additional info:
It doesn't seem to be a matter of time, because neither explicitly refreshing nor waiting seems to solve it.

Because the host is not really in maintenance, the engine keeps its other (non-HE) storageServer connections up and keeps it connected to the datacenter storagePool. As a result the hosted-engine upgrade process fails, because VDSM refuses to connect to more than one storage pool (in 3.5 the hosted-engine storage domain was still connected to its bootstrap storage pool, and the host needs to connect to that pool to upgrade correctly).

Comment 1 Roy Golan 2015-11-05 08:28:54 UTC

*** This bug has been marked as a duplicate of bug 1277646 ***