Bug 1406622 - [HE - Logging] Improve logging of uptodate score
Summary: [HE - Logging] Improve logging of uptodate score
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-hosted-engine-ha
Version: 3.6.9
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ovirt-4.1.2
Target Release: ---
Assignee: Denis Chaplygin
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On:
Blocks: 1432880
Reported: 2016-12-21 04:51 UTC by Germano Veit Michel
Modified: 2019-04-28 14:05 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1432880
Environment:
Last Closed: 2017-06-07 13:32:47 UTC
oVirt Team: SLA
Target Upstream Version:
Embargoed:
gklein: testing_plan_complete-



Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 72435 0 master MERGED he: Added additional logging to the score calculation 2017-03-09 15:39:11 UTC
oVirt gerrit 72438 0 v2.1.z MERGED he: Added additional logging to the score calculation 2017-03-09 16:49:54 UTC
oVirt gerrit 74154 0 ovirt-hosted-engine-ha-1.3 MERGED he: Added additional logging to the score calculation 2017-03-20 11:39:06 UTC

Description Germano Veit Michel 2016-12-21 04:51:02 UTC
Description of problem:

I recently worked on a case in which one of the hosts kept a score of 2400 even after everything was upgraded correctly. Everything is working fine except this host, which stays at this score. Restarting the HA daemons does not make any difference. The host extracts the OVFs just fine and does not report any errors, but it subtracts 1000 from the 3400 base score. This is possibly what was happening in BZ #1337960, which was closed due to insufficient data.

After following the logs (debug enabled), I suspect it was caused by the "if spuuid == constants.BLANK_UUID" check below not being met (that is, spuuid != constants.BLANK_UUID), but since it's not logged, it's hard to say.

In order for us to clearly see why and troubleshoot this further the next time it happens, could you please improve logging in these two places:

ovirt_hosted_engine_ha/lib/upgrade.py

    def is_conf_file_uptodate(self):
        uptodate = False
        try:
            volume = self._config.get(config.ENGINE, config.CONF_VOLUME_UUID)
            self._log.debug('Conf volume: %s ' % volume)
            _image = self._config.get(config.ENGINE, config.CONF_IMAGE_UUID)
            self._log.debug('Conf image: %s ' % _image)
            spuuid = self._config.get(config.ENGINE, config.SP_UUID)
            if spuuid == constants.BLANK_UUID:  # [1]
                uptodate = True
        except (KeyError, ValueError):
            uptodate = False
        return uptodate

ovirt_hosted_engine_ha/agent/states.py

        score = score_cfg['base-score']

        upgrademgr = upgrade.Upgrade()
        if not upgrademgr.is_conf_file_uptodate():
            score -= score_cfg['not-uptodate-config-penalty']
            # [2]

These are the requests:
[1] Print a message when this condition is not met.
[2] Add a "Penalizing score ..." message as we have for other cases
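
A minimal sketch of what the two requests could look like, written as standalone Python so it can be run outside the agent. Everything here is an assumption made for illustration: the `BLANK_UUID` value, the `spUUID` key name, the plain dict standing in for the config object, the helper `apply_uptodate_penalty`, the message wording and the log levels are all hypothetical and are not taken from the merged patch.

```python
import logging

# Stand-ins for constants.BLANK_UUID and config.SP_UUID (assumed values,
# not copied from ovirt_hosted_engine_ha).
BLANK_UUID = '00000000-0000-0000-0000-000000000000'
SP_UUID = 'spUUID'

logger = logging.getLogger('he_uptodate_sketch')


def is_conf_file_uptodate(conf):
    """Return True only when the storage pool UUID in `conf` is blank.

    `conf` is a plain dict here; the real code reads from self._config.
    """
    try:
        spuuid = conf[SP_UUID]
    except (KeyError, ValueError) as err:
        logger.debug('Conf file is not up to date: cannot read %s (%r)',
                     SP_UUID, err)
        return False
    if spuuid == BLANK_UUID:
        return True
    # Request [1]: say why the check failed instead of failing silently.
    logger.debug('Conf file is not up to date: spUUID is %s, expected %s',
                 spuuid, BLANK_UUID)
    return False


def apply_uptodate_penalty(score, score_cfg, conf):
    """Hypothetical helper: subtract the penalty and log that it happened."""
    if not is_conf_file_uptodate(conf):
        penalty = score_cfg['not-uptodate-config-penalty']
        # Request [2]: a "Penalizing score ..." message for this case too.
        logger.info('Penalizing score by %d due to not up-to-date config',
                    penalty)
        score -= penalty
    return score
```

With a base score of 3400 and a penalty of 1000, a non-blank spUUID yields 2400, which matches the score seen on the affected host.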

Finally, please also consider backporting it to 4.0 and 3.6.

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-ha-1.3.8
master

Comment 1 Germano Veit Michel 2016-12-23 00:32:53 UTC
It's actually:

ovirt-hosted-engine-ha-1.3.5.8-1.el7ev.noarch

Comment 2 Doron Fediuck 2017-01-09 12:19:51 UTC
This is more debug-level information than standard logging output, so when implementing it please make sure it gets the appropriate log level.
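
To illustrate the log-level point, here is a small sketch using only the standard `logging` module. The logger name, the `capture` helper and the message texts are made up for this example; it only shows that a debug-level message stays out of the output unless the logger is set to DEBUG.

```python
import io
import logging


def capture(level):
    """Emit one debug and one info message on a logger set to `level`
    and return the lines that were actually written."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter('%(levelname)s %(message)s'))

    logger = logging.getLogger('he_level_sketch_%s' % level)
    logger.setLevel(level)
    logger.propagate = False  # keep the output confined to our handler
    logger.addHandler(handler)

    logger.debug('spUUID is not blank, conf file is not up to date')
    logger.info('Penalizing score by %d due to not up-to-date config', 1000)

    logger.removeHandler(handler)
    return stream.getvalue().splitlines()
```

Calling `capture(logging.INFO)` returns only the info line, while `capture(logging.DEBUG)` returns both, so the extra detail is visible only when debug logging is enabled.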

Comment 6 Nikolai Sednev 2017-03-15 11:11:06 UTC
The 3.5->3.6 upgrade flow should provide the reproduction, see https://bugzilla.redhat.com/show_bug.cgi?id=1337960. If the issue is hit, this fix should report ['not-uptodate-config-penalty'] with a score of 2400 (penalty of 1000). If the fix is not backported to 3.6, it can only be verified after a 3.6->4.0 upgrade.

Comment 9 Nikolai Sednev 2017-04-27 12:12:34 UTC
Further to the results received in bug #1432880, I got the same results for the latest 4.1.2 as well.
Moving to verified.

Comment 10 Yaniv Kaul 2017-06-06 21:34:56 UTC
Bronce, any idea why it's not closed-currentrelease?

Comment 11 Bronce McClain 2017-06-07 13:32:47 UTC
(In reply to Yaniv Kaul from comment #10)
> Bronce, any idea why it's not closed-currentrelease?

It obviously missed an errata and I didn't mark it closed. I'll do that now.

