+++ This bug is a downstream clone. The original bug is: +++
+++ bug 1406622 +++
======================================================================

Description of problem:

I recently worked on a case in which one of the hosts kept a score of 2400 even after everything had been upgraded correctly. All is working fine except for this host, which stays at this score. Restarting the HA daemons makes no difference. It extracts the OVFs just fine and reports no errors, yet 1000 is still subtracted from the 3400 base score.

This is possibly what was happening in BZ #1337960, which was closed due to insufficient data.

After following the logs (debug enabled), I suspect it was caused by the "if spuuid != constants.BLANK_UUID" condition below, but since nothing is logged there, it's hard to say. In order to clearly see why, and to troubleshoot this further the next time it happens, could you please improve logging in these two places?

ovirt_hosted_engine_ha/lib/upgrade.py:

    def is_conf_file_uptodate(self):
        uptodate = False
        try:
            volume = self._config.get(config.ENGINE, config.CONF_VOLUME_UUID)
            self._log.debug('Conf volume: %s ' % volume)
            _image = self._config.get(config.ENGINE, config.CONF_IMAGE_UUID)
            self._log.debug('Conf image: %s ' % _image)
            spuuid = self._config.get(config.ENGINE, config.SP_UUID)
    [1]     if spuuid == constants.BLANK_UUID:
                uptodate = True
        except (KeyError, ValueError):
            uptodate = False
        return uptodate

ovirt_hosted_engine_ha/agent/states.py:

        score = score_cfg['base-score']
        upgrademgr = upgrade.Upgrade()
        if not upgrademgr.is_conf_file_uptodate():
    [2]     score -= score_cfg['not-uptodate-config-penalty']

These are the requests:
[1] Print a message when this condition is not met.
[2] Add a "Penalizing score ..." message, as we have for other cases.

Finally, please also consider backporting this to 4.0 and 3.6.

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-ha-1.3.8 master

(Originally by Germano Veit Michel)
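For illustration, the two requested changes could look roughly like the following. This is a minimal, self-contained sketch: `BLANK_UUID`, the plain-dict `conf` argument, and the `compute_score` helper are stand-ins for the real `ovirt_hosted_engine_ha` config objects and state machinery, not the actual patch.

```python
import logging

# Stand-in for constants.BLANK_UUID in the real code.
BLANK_UUID = '00000000-0000-0000-0000-000000000000'

logging.basicConfig(level=logging.DEBUG)
_log = logging.getLogger('upgrade-sketch')


def is_conf_file_uptodate(conf):
    """Sketch of request [1]: log *why* the check fails instead of
    returning False silently. `conf` is a plain dict standing in for
    the real heconf config object."""
    try:
        spuuid = conf['sp_uuid']
        if spuuid == BLANK_UUID:
            return True
        # [1] New message: explain which value triggered the penalty.
        _log.debug('Configuration is not up-to-date: storage pool '
                   'UUID is %s instead of the blank UUID', spuuid)
    except (KeyError, ValueError):
        _log.debug('Configuration is not up-to-date: '
                   'expected key missing from the config file')
    return False


def compute_score(conf, score_cfg):
    """Sketch of request [2]: emit a "Penalizing score ..." message,
    mirroring the other penalty branches in states.py."""
    score = score_cfg['base-score']
    if not is_conf_file_uptodate(conf):
        _log.info('Penalizing score by %d due to not up-to-date '
                  'VM configuration',
                  score_cfg['not-uptodate-config-penalty'])
        score -= score_cfg['not-uptodate-config-penalty']
    return score
```

With `score_cfg = {'base-score': 3400, 'not-uptodate-config-penalty': 1000}`, a stale config yields 2400, which matches the mysterious score seen on the affected host.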
It's actually: ovirt-hosted-engine-ha-1.3.5.8-1.el7ev.noarch (Originally by Germano Veit Michel)
This is more of a debugging-level message than standard logging output, so when implementing, please make sure it is emitted at the appropriate log level. (Originally by Doron Fediuck)
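To illustrate the level distinction being asked for (this is generic Python `logging` behavior, not the HA daemon's actual logger setup): a message emitted at DEBUG is suppressed unless the logger's level is lowered, so the new diagnostic stays quiet in normal runs while the penalty message at INFO remains visible.

```python
import logging
from io import StringIO


def make_logger(name, level):
    # Hypothetical logger setup, just to demonstrate level filtering.
    buf = StringIO()
    log = logging.getLogger(name)
    log.setLevel(level)
    log.propagate = False
    log.addHandler(logging.StreamHandler(buf))
    return log, buf


log, buf = make_logger('ha-level-sketch', logging.INFO)
log.debug('Conf volume: %s', 'abc')        # filtered out at INFO level
log.info('Penalizing score by %d', 1000)   # still reaches the log
print(buf.getvalue())
```

Running this prints only the "Penalizing score by 1000" line; the debug message appears only if the logger is set to `logging.DEBUG`.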
The 3.5->3.6 upgrade flow should provide the reproduction; see https://bugzilla.redhat.com/show_bug.cgi?id=1337960. If the issue is hit, this fix should report the ['not-uptodate-config-penalty'] message along with a score of 2400 (a penalty of 1000). If the fix is not backported to 3.6, then it can only be verified after a 3.6->4.0 upgrade. (Originally by Nikolai Sednev)
Logging was improved for latest 3.6.11:

Engine:
rhevm-setup-plugin-vmconsole-proxy-helper-3.6.11.1-0.1.el6.noarch
rhevm-sdk-python-3.6.9.1-1.el6ev.noarch
ovirt-host-deploy-java-1.4.1-1.el6ev.noarch
rhevm-spice-client-x64-msi-3.6-7.el6.noarch
rhevm-setup-base-3.6.11.1-0.1.el6.noarch
rhevm-setup-plugin-websocket-proxy-3.6.11.1-0.1.el6.noarch
rhevm-doc-3.6.10-1.el6ev.noarch
rhevm-spice-client-x86-msi-3.6-7.el6.noarch
rhevm-backend-3.6.11.1-0.1.el6.noarch
ovirt-setup-lib-1.0.1-1.el6ev.noarch
rhevm-setup-3.6.11.1-0.1.el6.noarch
rhevm-cli-3.6.9.0-1.el6ev.noarch
ovirt-vmconsole-proxy-1.0.4-1.el6ev.noarch
rhevm-restapi-3.6.11.1-0.1.el6.noarch
rhevm-dwh-3.6.8-1.el6ev.noarch
ovirt-engine-extension-aaa-jdbc-1.0.7-2.el6ev.noarch
rhevm-setup-plugin-ovirt-engine-3.6.11.1-0.1.el6.noarch
rhevm-reports-setup-3.6.5.1-1.el6ev.noarch
rhevm-iso-uploader-3.6.0-1.el6ev.noarch
ovirt-host-deploy-1.4.1-1.el6ev.noarch
rhevm-vmconsole-proxy-helper-3.6.11.1-0.1.el6.noarch
rhevm-spice-client-x64-cab-3.6-7.el6.noarch
rhevm-tools-3.6.11.1-0.1.el6.noarch
rhevm-3.6.11.1-0.1.el6.noarch
rhevm-lib-3.6.11.1-0.1.el6.noarch
rhevm-dwh-setup-3.6.8-1.el6ev.noarch
rhevm-log-collector-3.6.1-1.el6ev.noarch
rhevm-websocket-proxy-3.6.11.1-0.1.el6.noarch
rhevm-extensions-api-impl-3.6.11.1-0.1.el6.noarch
rhevm-webadmin-portal-3.6.11.1-0.1.el6.noarch
rhevm-guest-agent-common-1.0.11-6.el6ev.noarch
rhevm-setup-plugins-3.6.5-1.el6ev.noarch
rhevm-image-uploader-3.6.1-2.el6ev.noarch
rhevm-branding-rhev-3.6.0-10.el6ev.noarch
ovirt-vmconsole-1.0.4-1.el6ev.noarch
rhevm-userportal-3.6.11.1-0.1.el6.noarch
rhevm-reports-3.6.5.1-1.el6ev.noarch
rhev-guest-tools-iso-3.6-6.el6ev.noarch
rhevm-setup-plugin-ovirt-engine-common-3.6.11.1-0.1.el6.noarch
rhevm-dependencies-3.6.1-1.el6ev.noarch
rhevm-tools-backup-3.6.11.1-0.1.el6.noarch
rhevm-spice-client-x86-cab-3.6-7.el6.noarch
rhevm-dbscripts-3.6.11.1-0.1.el6.noarch

Linux version 2.6.32-642.13.2.el6.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-17) (GCC) ) #1 SMP Sat Feb 18 16:54:58 EST 2017
Linux 2.6.32-642.13.2.el6.x86_64 #1 SMP Sat Feb 18 16:54:58 EST 2017 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 6.8 (Santiago)

Hosts:
ovirt-vmconsole-1.0.4-1.el7ev.noarch
ovirt-hosted-engine-ha-1.3.5.10-1.el7ev.noarch
ovirt-setup-lib-1.0.1-1.el7ev.noarch
sanlock-3.4.0-1.el7.x86_64
mom-0.5.6-1.el7ev.noarch
ovirt-hosted-engine-setup-1.3.7.4-1.el7ev.noarch
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
vdsm-4.17.39-1.el7ev.noarch
ovirt-host-deploy-1.4.1-1.el7ev.noarch
libvirt-client-2.0.0-10.el7_3.5.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.9.x86_64
rhevm-sdk-python-3.6.9.1-1.el7ev.noarch

Linux version 3.10.0-327.53.1.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Tue Mar 14 10:49:09 EDT 2017
Linux 3.10.0-327.53.1.el7.x86_64 #1 SMP Tue Mar 14 10:49:09 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.3 (Maipo)

/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/states.py:

        upgrademgr = upgrade.Upgrade()
        if not upgrademgr.is_conf_file_uptodate():
            logger.info("Penalizing score by %d due to "
                        "not up-to-date VM configuration",
                        score_cfg['not-uptodate-config-penalty'])
            score -= score_cfg['not-uptodate-config-penalty']

/usr/lib/python2.7/site-packages/vdsm/tool/upgrade.py:

    def _upgrade_seal(upgrade):
        seal_file = _upgrade_seal_path(upgrade)
        try:
            utils.touchFile(seal_file)
        except (OSError, IOError):
            _get_upgrade_log().exception("Failed to seal upgrade %s",
                                         upgrade.name)
        else:
            utils.persist(seal_file)
            _get_upgrade_log().debug("Upgrade %s successfully performed",
                                     upgrade.name)

/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/upgrade.py:

    def upgrade_35_36(self):
        uptodate = self.is_conf_file_uptodate()
        if uptodate:
            self._log.info('Host configuration is already up-to-date')
            return False
        else:
            self._log.info('Upgrading to current version')
        if not self._is_in_engine_maintenance():
            self._log.error(
                'Unable to upgrade while not in maintenance mode: '
                'please put this host into maintenance mode '
                'from the engine, and manually restart this service '
                'when ready'
            )
            return False
        self._startMonitoringDomain()
        content = self._get_conffile_content(
            constants.HEConfFiles.HECONFD_HECONF,
            self._update_conf_file_35_36,
        )
        self._wrote_updated_conf_file(content)
        self._log.info('Successfully upgraded')
        return True

Moving to verified, as I've managed to upgrade from 3.5 to 3.6 without any score problems (got a 3400 score) on the 3.5->3.6 upgraded hosts. The logs show normal logging, and hosted-storage was successfully upgraded:

MainThread::INFO::2017-04-24 18:21:11,331::upgrade::823::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_remove_storage_pool) Successfully removed the shared pool
MainThread::INFO::2017-04-24 18:21:42,222::upgrade::1046::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(upgrade_35_36) Successfully upgraded
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1213