Description of problem:

I recently worked on a case in which one of the hosts kept the 2400 score even after everything was upgraded correctly. Everything is working fine, except that this host continues to report this score. Restarting the HA daemons does not make any difference. The agent extracts the OVFs just fine and does not report any errors, but it subtracts 1000 from the 3400 base score.

This is possibly what was happening in BZ #1337960, which was closed due to insufficient data.

After following the logs (debug enabled), I suspect it was caused by "if spuuid != constants.BLANK_UUID" below, but since it's not logged, it's hard to say.

So that we can clearly see why this happens and troubleshoot it further the next time, could you please improve logging in these two places:

ovirt_hosted_engine_ha/lib/upgrade.py

    def is_conf_file_uptodate(self):
        uptodate = False
        try:
            volume = self._config.get(config.ENGINE, config.CONF_VOLUME_UUID)
            self._log.debug('Conf volume: %s ' % volume)
            _image = self._config.get(config.ENGINE, config.CONF_IMAGE_UUID)
            self._log.debug('Conf image: %s ' % _image)
            spuuid = self._config.get(config.ENGINE, config.SP_UUID)
[1]         if spuuid == constants.BLANK_UUID:
                uptodate = True
        except (KeyError, ValueError):
            uptodate = False
        return uptodate

ovirt_hosted_engine_ha/agent/states.py

    score = score_cfg['base-score']

    upgrademgr = upgrade.Upgrade()
    if not upgrademgr.is_conf_file_uptodate():
        score -= score_cfg['not-uptodate-config-penalty']    [2]

These are the requests:
[1] Print a message when this condition is not met.
[2] Add a "Penalizing score ..." message, as we have for other cases.

Finally, please also consider backporting it to 4.0 and 3.6.

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-ha-1.3.8
master
It's actually: ovirt-hosted-engine-ha-1.3.5.8-1.el7ev.noarch
This is debug-level information rather than standard logging output, so when implementing it, please make sure it gets the appropriate log level.
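For illustration only, the two logging requests ([1] a message when the spuuid check fails, [2] a "Penalizing score ..." message) could look roughly like the sketch below, logged at debug level. This is not the actual patch: it uses a plain dict instead of the agent's config object, and the key names, the BLANK_UUID literal, the hard-coded score values and the message wording are all assumptions made for the example.

```python
import logging

log = logging.getLogger("ha_score_sketch")

# Assumed values for illustration: the score numbers mirror this report
# (3400 base, 1000 penalty); the blank UUID literal is an assumption.
BLANK_UUID = "00000000-0000-0000-0000-000000000000"
SCORE_CFG = {"base-score": 3400, "not-uptodate-config-penalty": 1000}


def is_conf_file_uptodate(conf):
    """Return True if the (hypothetical) config dict looks upgraded.

    Every path that returns False logs the reason at debug level.
    """
    try:
        log.debug("Conf volume: %s", conf["conf_volume_UUID"])
        log.debug("Conf image: %s", conf["conf_image_UUID"])
        spuuid = conf["spUUID"]
    except KeyError as err:
        log.debug("Conf file not up to date: missing key %s", err)
        return False
    if spuuid != BLANK_UUID:
        # Request [1]: state explicitly that the condition was not met.
        log.debug(
            "Conf file not up to date: spUUID is %s, expected %s",
            spuuid,
            BLANK_UUID,
        )
        return False
    return True


def compute_score(conf):
    """Apply the not-up-to-date penalty to the base score, with logging."""
    score = SCORE_CFG["base-score"]
    if not is_conf_file_uptodate(conf):
        penalty = SCORE_CFG["not-uptodate-config-penalty"]
        # Request [2]: same style as the other "Penalizing score" messages.
        log.debug(
            "Penalizing score by %d due to not up-to-date configuration",
            penalty,
        )
        score -= penalty
    return score
```

With messages like these, a host stuck at 2400 would show in the debug log whether the penalty came from a non-blank spUUID or from a missing key.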
The 3.5->3.6 upgrade flow should provide the reproduction; see https://bugzilla.redhat.com/show_bug.cgi?id=1337960. If the issue is hit, this fix should report ['not-uptodate-config-penalty'] together with the 2400 score (a penalty of 1000). If the fix is not backported to 3.6, it can only be verified after a 3.6->4.0 upgrade.
Further to the results received in bug#1432880, I got the same results with the latest 4.1.2 as well. Moving to verified.
Bronce, any idea why it's not closed-currentrelease?
(In reply to Yaniv Kaul from comment #10)
> Bronce, any idea why it's not closed-currentrelease?

It obviously missed an errata and I didn't mark it closed. I'll do that now.