Bug 1666786 - RHV-M reports "Balancing VM ${VM}" forever as successful in the tasks list
Summary: RHV-M reports "Balancing VM ${VM}" forever as successful in the tasks list
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.2.7
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ovirt-4.4.5
Target Release: ---
Assignee: Lucia Jelinkova
QA Contact: Qin Yuan
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-01-16 14:46 UTC by Roman Hodain
Modified: 2021-04-14 11:41 UTC
CC: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-04-14 11:39:53 UTC
oVirt Team: Virt
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2021:1169 0 None None None 2021-04-14 11:41:00 UTC
oVirt gerrit 99706 0 None MERGED core: Affinity enforcer checks if migration would improve situation 2021-04-08 15:19:10 UTC

Description Roman Hodain 2019-01-16 14:46:05 UTC
Description of problem:
When the engine is not able to balance VMs according to a positive soft VM affinity rule, because the VMs are spread across multiple hosts, the following message is logged:

    Invalid affinity situation was detected while scheduling VM 'VM02' (2eefa135-bf8c-4ad0-a09e-d93cf6dd4f60). VMs belonging to the same affinity groups are running on more than one host.

The engine repeats the balancing operation until it succeeds, which may never happen because other scheduling policies prevent the migration.
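
For illustration only, here is a minimal hypothetical sketch in plain Java (not the actual ovirt-engine code; all class and method names are made up) of the behaviour described above: the periodic enforcer surfaces a "Balancing VM" task before scheduling determines whether any host can actually accept the VM, so the task keeps reappearing as successful while nothing is migrated.

    // Hypothetical sketch only; names do not correspond to real ovirt-engine classes.
    import java.util.List;

    public class RepeatedBalancingTaskSketch {

        // Stands in for the scheduling filters (Memory, affinity, ...) rejecting all hosts.
        static boolean anyHostCanAccept(String vm, List<String> candidateHosts) {
            return false; // assume every candidate host is filtered out, as in this bug
        }

        static void periodicEnforcementRun(String vm, List<String> candidateHosts) {
            // Pre-fix order of operations: the task is surfaced first ...
            System.out.println("Task: Balancing VM " + vm + "  [shown as successful]");
            // ... and only then does scheduling discover that no migration is possible.
            if (!anyHostCanAccept(vm, candidateHosts)) {
                System.out.println("No suitable host; nothing migrated, task stays green.");
            }
        }

        public static void main(String[] args) {
            // Each periodic run repeats the same outcome, which is what the Tasks list shows.
            for (int i = 0; i < 3; i++) {
                periodicEnforcementRun("test01", List.of("host_mixed_3"));
            }
        }
    }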

Version-Release number of selected component (if applicable):
4.2.7

How reproducible:
100%

Steps to Reproduce:
1. Create 3 VMs and run them on three different hosts.
2. Load one of the hosts with VMs so that it does not have further memory available for VM scheduling.
3. Create a positive affinity group for these three VMs and a positive soft host affinity for the host with no available memory.

Actual results:
The Tasks list repeatedly contains the message

    Balancing VM test01

It is marked as green although nothing happened.

Expected results:
Either the message is not displayed, or it is displayed as failed with a proper explanation of why the balancing failed and how to fix it.

Comment 1 Ryan Barry 2019-01-17 01:07:22 UTC
This is actually tricky

The balancing _did_ run successfully, but it failed to migrate due to constraints. We don't always expect customers to look at logs for information, though. From my point of view, I'd suggest that keeping the message is a good idea, even if it may seem misleading, otherwise customers will wonder why balancing is not attempted.

The best solution is probably breaking the balancing into a tree view in the event log in RHVM, and showing a failure when the application of affinity is aborted due to constraints

Comment 3 Roman Hodain 2019-01-17 06:15:08 UTC
(In reply to Ryan Barry from comment #1)
> This is actually tricky
> 
> The balancing _did_ run successfully, but it failed to migrate due to
> constraints. We don't always expect customers to look at logs for
> information, though. From my point of view, I'd suggest that keeping the
> message is a good idea, even if it may seem misleading, otherwise customers
> will wonder why balancing is not attempted.

The problem is that the behaviour is inconsistent. For example, if you create a soft host affinity group and the VM balancing is not successful (memory constraints), there will be no task in the task list.

> 
> The best solution is probably breaking the balancing into a tree view in the
> event log in RHVM, and showing a failure when the application of affinity is
> aborted due to constraints

That sounds good.

Comment 4 Daniel Gur 2019-08-28 13:12:55 UTC
sync2jira

Comment 5 Daniel Gur 2019-08-28 13:17:07 UTC
sync2jira

Comment 8 Michal Skrivanek 2020-03-17 12:16:49 UTC
deprecating SLA team usage, moving to Virt

Comment 9 Arik 2021-02-08 21:35:43 UTC
I believe this can't happen anymore now that we migrate only after checking that the migration would actually improve the VM state in terms of the affinity constraints [1].
Meital can we test it on 4.4.5 please?

[1] https://gerrit.ovirt.org/#/c/ovirt-engine/+/99706/
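
For illustration, here is a minimal sketch in plain Java (not the actual ovirt-engine code; the names and the pairwise violation counting are assumptions based on the patch title above) of the check described in this comment: count the broken positive-affinity pairs for the current placement and for the proposed placement, and start the migration (and hence the "Balancing VM" task) only when the move strictly improves the situation.

    // Hypothetical sketch only; names do not correspond to real ovirt-engine classes.
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Objects;

    public class AffinityImprovementCheckSketch {

        // Counts VM pairs in a positive affinity group that run on different hosts.
        static long countViolations(List<String> groupVms, Map<String, String> vmToHost) {
            long violations = 0;
            for (int i = 0; i < groupVms.size(); i++) {
                for (int j = i + 1; j < groupVms.size(); j++) {
                    if (!Objects.equals(vmToHost.get(groupVms.get(i)),
                                        vmToHost.get(groupVms.get(j)))) {
                        violations++;
                    }
                }
            }
            return violations;
        }

        // Migrate only if the candidate placement strictly reduces the violations.
        static boolean shouldMigrate(List<String> groupVms, Map<String, String> current,
                                     String vm, String candidateHost) {
            Map<String, String> proposed = new HashMap<>(current);
            proposed.put(vm, candidateHost);
            return countViolations(groupVms, proposed) < countViolations(groupVms, current);
        }

        public static void main(String[] args) {
            List<String> group = List.of("vm1", "vm2", "vm3");
            Map<String, String> placement =
                    Map.of("vm1", "host1", "vm2", "host2", "vm3", "host3");
            // Moving vm1 next to vm2 improves the affinity situation -> migrate.
            System.out.println(shouldMigrate(group, placement, "vm1", "host2")); // true
            // Moving vm1 to yet another host improves nothing -> no migration, no task.
            System.out.println(shouldMigrate(group, placement, "vm1", "host4")); // false
        }
    }

This matches the verification below: one migration that improves the placement is performed (vm1 to host_mixed_2), after which no further balancing tasks appear.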

Comment 13 Qin Yuan 2021-02-16 09:48:26 UTC
Verified with:
ovirt-engine-4.4.5.4-0.6.el8ev.noarch
vdsm-4.40.50.4-1.el8ev.x86_64

Steps:
1. Create 3 VMs (vm1, vm2, vm3) and run them on 3 hosts (host_mixed_1, host_mixed_2, host_mixed_3) separately.
2. Create another VM (vm4) with a large amount of memory and run it on host_mixed_3, so that host_mixed_3 does not have further memory available for VM scheduling.
3. Create an affinity group:
   VM Affinity Rule: Positive     Enforcing: unchecked
   Host Affinity Rule: Positive   Enforcing: unchecked
   Virtual Machines: vm1, vm2, vm3
   Hosts: host_mixed_3

Results:
1. The affinity group is added successfully.
2. vm1 is migrated from host_mixed_1 to host_mixed_2.
3. The engine log periodically reports "Candidate host 'host_mixed_3' was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'Memory'" and "Positive affinity group violation detected".
4. There are no repeated VM balancing tasks.

engine log:
2021-02-16 11:09:59,584+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-77) [33ee6901-7c47-44a4-a631-0ef0e56f7d22] EVENT_ID: USER_ADDED_AFFINITY_GROUP(10,350), Affinity Group test_group was added. (User: admin@internal-authz)
2021-02-16 11:10:58,469+02 INFO  [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-88) [] Candidate host 'host_mixed_3' ('636ca5b1-194c-40aa-9b03-9b6f468ec1d6') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'Memory' (correlation id: null)
2021-02-16 11:10:58,563+02 INFO  [org.ovirt.engine.core.bll.BalanceVmCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-88) [6f05c060] Lock Acquired to object 'EngineLock:{exclusiveLocks='[26370054-5f08-44b9-adf6-3c3f5716dfb5=VM]', sharedLocks=''}'
2021-02-16 11:10:58,588+02 INFO  [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-88) [6f05c060] Candidate host 'host_mixed_3' ('636ca5b1-194c-40aa-9b03-9b6f468ec1d6') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'Memory' (correlation id: null)
2021-02-16 11:10:58,597+02 INFO  [org.ovirt.engine.core.bll.BalanceVmCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-88) [6f05c060] Running command: BalanceVmCommand internal: true. Entities affected :  ID: 26370054-5f08-44b9-adf6-3c3f5716dfb5 Type: VMAction group MIGRATE_VM with role type USER
2021-02-16 11:10:58,600+02 INFO  [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-88) [6f05c060] Candidate host 'host_mixed_3' ('636ca5b1-194c-40aa-9b03-9b6f468ec1d6') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'Memory' (correlation id: 6f05c060)
...
2021-02-16 11:10:58,639+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-88) [6f05c060] EVENT_ID: VM_MIGRATION_START_SYSTEM_INITIATED(67), Migration initiated by system (VM: vm1, Source: host_mixed_1, Destination: host_mixed_2, Reason: Affinity rules enforcement).
2021-02-16 11:11:03,925+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-17) [3fefdcaa] EVENT_ID: VM_MIGRATION_DONE(63), Migration completed (VM: vm1, Source: host_mixed_1, Destination: host_mixed_2, Duration: 5 seconds, Total: 5 seconds, Actual downtime: (N/A))
2021-02-16 11:11:03,928+02 INFO  [org.ovirt.engine.core.bll.BalanceVmCommand] (ForkJoinPool-1-worker-17) [3fefdcaa] Lock freed to object 'EngineLock:{exclusiveLocks='[26370054-5f08-44b9-adf6-3c3f5716dfb5=VM]', sharedLocks=''}'
2021-02-16 11:11:58,735+02 INFO  [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-68) [] Candidate host 'host_mixed_3' ('636ca5b1-194c-40aa-9b03-9b6f468ec1d6') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'Memory' (correlation id: null)
2021-02-16 11:11:58,742+02 INFO  [org.ovirt.engine.core.bll.scheduling.arem.AffinityRulesEnforcer] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-68) [] Positive affinity group violation detected
2021-02-16 11:12:58,804+02 INFO  [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-45) [] Candidate host 'host_mixed_3' ('636ca5b1-194c-40aa-9b03-9b6f468ec1d6') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'Memory' (correlation id: null)
2021-02-16 11:12:58,865+02 INFO  [org.ovirt.engine.core.bll.scheduling.arem.AffinityRulesEnforcer] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-45) [] Positive affinity group violation detected
2021-02-16 11:13:58,906+02 INFO  [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-47) [] Candidate host 'host_mixed_3' ('636ca5b1-194c-40aa-9b03-9b6f468ec1d6') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'Memory' (correlation id: null)
2021-02-16 11:13:58,980+02 INFO  [org.ovirt.engine.core.bll.scheduling.arem.AffinityRulesEnforcer] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-47) [] Positive affinity group violation detected

Comment 21 errata-xmlrpc 2021-04-14 11:39:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: RHV Manager (ovirt-engine) 4.4.z [ovirt-4.4.5] security, bug fix, enhancement), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:1169

