Bug 1857793 - Global Maintenance by cli causes penalizing score of the host with HE VM on it.
Summary: Global Maintenance by cli causes penalizing score of the host with HE VM on it.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-hosted-engine-ha
Classification: oVirt
Component: Agent
Version: ---
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ovirt-4.4.3
Target Release: ---
Assignee: Asaf Rachmani
QA Contact: Polina
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-07-16 15:10 UTC by Polina
Modified: 2020-11-11 06:39 UTC (History)

Fixed In Version: ovirt-hosted-engine-ha-2.4.5
Clone Of:
Environment:
Last Closed: 2020-11-11 06:39:35 UTC
oVirt Team: Integration
Embargoed:
pm-rhel: ovirt-4.4+


Attachments
logs (3.18 MB, application/gzip)
2020-07-16 15:10 UTC, Polina


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 111418 0 master MERGED agent: Do not check memory in Global Maintenance 2020-10-20 09:47:07 UTC

Description Polina 2020-07-16 15:10:16 UTC
Created attachment 1701412
logs

Description of problem:
Setting global maintenance from the CLI with 'hosted-engine --set-maintenance --mode=global' penalizes the score of the host that runs the HE VM. The host must keep its score of 3400 while in global maintenance.

Version-Release number of selected component (if applicable):
rhv-4.4.1-11

How reproducible: Happened several times during Automation tier2 runs. I did not reproduce it manually. It affected the test results significantly.

Steps to Reproduce:
1. Start with three hosts in a healthy state, each with the maximum score of 3400. The HE VM is running on host1.
2. Set global maintenance from the CLI with 'hosted-engine --set-maintenance --mode=global'. The score of host1 is penalized:

MainThread::INFO::2020-07-13 20:52:28,662::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineUp (score: 3400)
MainThread::INFO::2020-07-13 20:52:38,691::state_decorators::51::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check) Global maintenance detected
MainThread::INFO::2020-07-13 20:52:38,747::brokerlink::73::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineUp-GlobalMaintenance) sent? ignored
MainThread::INFO::2020-07-13 20:52:38,908::states::72::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_penalize_memory) Penalizing score by 400 due to free memory 16069 being lower than required 16384
MainThread::INFO::2020-07-13 20:52:38,908::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state GlobalMaintenance (score: 3000)

As soon as 'hosted-engine --set-maintenance --mode=none' is sent, the penalty is removed.
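
The score drop in the log excerpt above can be detected mechanically. The following is a minimal sketch, assuming agent log lines keep the format quoted in this report. The function name and the regular expressions are illustrative and are not part of ovirt-hosted-engine-ha.

```python
import re

# Patterns derived only from the log lines quoted in this bug report.
STATE_RE = re.compile(r"Current state (\w+) \(score: (\d+)\)")
PENALTY_RE = re.compile(
    r"Penalizing score by (\d+) due to free memory (\d+) "
    r"being lower than required (\d+)"
)


def summarize_agent_log(lines):
    """Return (states, penalties) found in the given agent log lines.

    states:    list of (state_name, score)
    penalties: list of (penalty, free_mb, required_mb)
    """
    states = []
    penalties = []
    for line in lines:
        match = STATE_RE.search(line)
        if match:
            states.append((match.group(1), int(match.group(2))))
            continue
        match = PENALTY_RE.search(line)
        if match:
            penalties.append(tuple(int(g) for g in match.groups()))
    return states, penalties


# Sample input: three lines copied from the excerpt above.
SAMPLE = [
    "MainThread::INFO::2020-07-13 20:52:28,662::hosted_engine::517::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::"
    "(_monitoring_loop) Current state EngineUp (score: 3400)",
    "MainThread::INFO::2020-07-13 20:52:38,908::states::72::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::"
    "(_penalize_memory) Penalizing score by 400 due to free memory 16069 "
    "being lower than required 16384",
    "MainThread::INFO::2020-07-13 20:52:38,908::hosted_engine::517::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::"
    "(_monitoring_loop) Current state GlobalMaintenance (score: 3000)",
]

states, penalties = summarize_agent_log(SAMPLE)
```

For the sample lines, `states` holds the EngineUp entry at 3400 followed by the GlobalMaintenance entry at 3000, and `penalties` holds the single 400-point memory penalty.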


Actual results:

The score of host1 drops from 3400 to 3000. The HE VM is already running on the host, so having 16069 MB of free memory left is not a problem and is not a valid reason to penalize the score.


Expected results:
After global maintenance is set, the host score must remain 3400.
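
The expected behavior can be modeled roughly as follows. This is an illustrative sketch, not the agent's code: the values 3400 and 400 come from the log excerpt in this report, while the function and parameter names are assumptions. It models only the rule named in the linked Gerrit change title ("Do not check memory in Global Maintenance") and ignores every other factor the real agent considers, including whether the HE VM already runs on the host.

```python
BASE_SCORE = 3400      # maximum host score mentioned in this report
MEMORY_PENALTY = 400   # penalty value seen in the log excerpt


def host_score(free_mem_mb, required_mem_mb, global_maintenance):
    """Hypothetical score model: skip the memory check in global maintenance."""
    score = BASE_SCORE
    # Assumption: the memory penalty applies only outside global maintenance.
    if not global_maintenance and free_mem_mb < required_mem_mb:
        score -= MEMORY_PENALTY
    return score


# Values from the log: 16069 MB free, 16384 MB required.
in_maintenance = host_score(16069, 16384, global_maintenance=True)
outside_maintenance = host_score(16069, 16384, global_maintenance=False)
```

With the values from the log, the model keeps the score at 3400 in global maintenance and applies the 400-point penalty only outside it.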

Additional info:

Comment 1 Michal Skrivanek 2020-07-17 04:31:44 UTC
> HE VM is already running on the host and it must not be a problem that the host has 16069 MB free memory after this. This is not the reason for penalizing score.

Are you saying that reason is bogus and there is more memory available, or that it considers it in addition to the currently used memory by HE VM?
Anyway, it doesn’t have any ill effect, does it?

Comment 2 Polina 2020-07-19 08:01:03 UTC
(In reply to Michal Skrivanek from comment #1)
> > HE VM is already running on the host and it must not be a problem that the host has 16069 MB free memory after this. This is not the reason for penalizing score.
> 
> Are you saying that reason is bogus and there is more memory available, or
> that it considers it in addition to the currently used memory by HE VM?

The second. The HE VM is already running on this host, so it is completely fine that the host has 16069 MB of memory left, and this must not be a reason to penalize the score.

> Anyway, it doesn’t have any ill effect, does it?

It looks like a change in behavior (we have a check that the score of all hosts is 3400 after global maintenance is set). As often happens, the ill effect is on automation runs (I don't know whether users could be affected by this change): in many cases we check the hosts' health during setup, before the test starts, and in such cases the test fails at the setup stage.
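
A setup-stage health check of the kind described above might look like the sketch below. It is hypothetical: the host names and the input mapping are made up for illustration, and how the scores are collected from the hosts is deliberately left out.

```python
EXPECTED_SCORE = 3400  # healthy host score used in this report


def unhealthy_hosts(scores, expected=EXPECTED_SCORE):
    """Return the sorted names of hosts whose score differs from `expected`.

    `scores` maps a host name to its current score (hypothetical input).
    """
    return sorted(host for host, score in scores.items() if score != expected)


# Illustrative data matching the situation in this bug: host1 was penalized.
failing = unhealthy_hosts({"host1": 3000, "host2": 3400, "host3": 3400})
```

A test fixture could fail the setup stage whenever the returned list is non-empty, which is how the penalized host1 in this bug would surface.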

Comment 4 Polina 2020-10-21 20:32:29 UTC
Verified on ovirt-engine-4.4.3.7-0.22.el8ev.noarch by running automation tests.

Comment 5 Sandro Bonazzola 2020-11-11 06:39:35 UTC
This bugzilla is included in oVirt 4.4.3 release, published on November 10th 2020.

Since the problem described in this bug report should be resolved in oVirt 4.4.3 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

