Bug 1537343 - engine tries to balance vms that are down.
Summary: engine tries to balance vms that are down.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.1.6
Hardware: x86_64
OS: Linux
Priority: high
Severity: medium
Target Milestone: ovirt-4.2.2
Target Release: ---
Assignee: Andrej Krejcir
QA Contact: Polina
URL:
Whiteboard:
Depends On:
Blocks: 1551582
Reported: 2018-01-23 01:04 UTC by Germano Veit Michel
Modified: 2021-06-10 14:32 UTC (History)

Fixed In Version: ovirt-engine-4.2.2
Doc Type: Bug Fix
Doc Text:
Previously, if a virtual machine with a strong positive affinity to a host was down, the affinity rules enforcer tried to migrate it, because it was not running on the specified host. When migration failed, the affinity rules enforcer tried repeatedly to migrate the same virtual machine, ignoring other virtual machines that violated affinity. In the current release, the affinity rules enforcer ignores virtual machines that are down.
Clone Of:
: 1551582 (view as bug list)
Environment:
Last Closed: 2018-05-15 17:47:24 UTC
oVirt Team: SLA
Target Upstream Version:
lsvaty: testing_plan_complete-



Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 3329091 0 None None None 2018-01-23 01:34:50 UTC
Red Hat Product Errata RHEA-2018:1488 0 None None None 2018-05-15 17:48:17 UTC
oVirt gerrit 87259 0 master MERGED core: Check affinity rules only for running VMs 2021-02-15 18:01:34 UTC
oVirt gerrit 87320 0 ovirt-engine-4.2 MERGED core: Check affinity rules only for running VMs 2021-02-15 18:01:34 UTC

Description Germano Veit Michel 2018-01-23 01:04:36 UTC
Description of problem:

Logs spammed with:

2018-01-23 00:48:05,541Z WARN  [org.ovirt.engine.core.bll.BalanceVmCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-45) [ec22cb2] Validation of action 'BalanceVm' failed for user SYSTEM. Reasons: VAR__ACTION__MIGRATE,VAR__TYPE__VM,ACTION_TYPE_FAILED_VM_IS_NOT_RUNNING  

And tasks are full of failed BalanceVm.

Version-Release number of selected component (if applicable):
rhevm-4.1.6.2-0.1.el7.noarch
rhvm-4.2.0-0.6.el7.noarch (reproduced)

How reproducible:
100%

Steps to Reproduce:
1. Configure positive host affinity for 1 VM that is down
2. Check engine.log and tasks tab.

Actual results:
Flooded with errors

Expected results:
Don't balance (migrate) VMs that are down.

Additional info:

Inspecting the code, the problem appears to be in the function getVmToHostsAffinityGroupCandidates, specifically here:

         } else if (!affHosts.contains(vm.getRunOnVds()) && g.isVdsPositive()) {
             // Positive affinity violated
             vmToHostsAffinityMap.put(vm_id,
                     1 + vmToHostsAffinityMap.getOrDefault(vm_id, 0));
         }

affHosts is a Set, and when the VM is down getRunOnVds() returns null.
affHosts does not contain a null element, so the positive affinity is falsely counted as violated?
Maybe down VMs need to be filtered out to start with?
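
A minimal, self-contained sketch of the suspected mechanism follows. It is not the engine code and not the merged patch: the class DownVmAffinitySketch, its Vm stand-in, the skipDownVms flag, and the choice to treat a null runOnVds as "down" are all assumptions made for illustration. It only shows that a HashSet lookup with null returns false, so a down VM is counted as violating positive host affinity unless it is filtered out first.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

public class DownVmAffinitySketch {

    // Hypothetical stand-in for the engine's VM object; only the fields needed here.
    public static final class Vm {
        public final UUID id;
        public final UUID runOnVds; // null when the VM is down (assumption for this sketch)

        public Vm(UUID id, UUID runOnVds) {
            this.id = id;
            this.runOnVds = runOnVds;
        }

        public boolean isRunning() {
            return runOnVds != null;
        }
    }

    // Counts positive host-affinity violations per VM.
    // With skipDownVms == false this mirrors the check quoted above;
    // with skipDownVms == true, down VMs are filtered out before the check.
    public static Map<UUID, Integer> countViolations(List<Vm> vms, Set<UUID> affHosts,
                                                     boolean skipDownVms) {
        Map<UUID, Integer> violations = new HashMap<>();
        for (Vm vm : vms) {
            if (skipDownVms && !vm.isRunning()) {
                continue; // a VM that is down cannot be migrated, so do not flag it
            }
            // HashSet.contains(null) returns false, so a down VM looks like a violation here.
            if (!affHosts.contains(vm.runOnVds)) {
                violations.merge(vm.id, 1, Integer::sum);
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        UUID host = UUID.randomUUID();
        // HashSet is used deliberately: immutable sets from Set.of() throw on contains(null).
        Set<UUID> affHosts = new HashSet<>(List.of(host));

        Vm down = new Vm(UUID.randomUUID(), null);
        Vm onHost = new Vm(UUID.randomUUID(), host);
        List<Vm> vms = List.of(down, onHost);

        System.out.println("unfiltered: " + countViolations(vms, affHosts, false).size());
        System.out.println("filtered:   " + countViolations(vms, affHosts, true).size());
    }
}
```

In this sketch the down VM is the only entry in the unfiltered result, and the filtered result is empty.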

Comment 4 Polina 2018-02-19 10:07:59 UTC
verified in rhvm-4.2.2-0.1.el7.noarch

created 3 affinity groups for 3 VMs (see xml below).

Tried scenarios:
1. Create an affinity group for a powered-off VM. Check engine.log to confirm there are no attempts to migrate it.
2. Create an affinity group for a running VM. Then power it off and check engine.log to confirm there are no errors.
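
The engine.log check in both scenarios can be scripted. The sketch below is only an illustration: the class name BalanceVmLogScan and the regular expression are assumptions derived from the single warning line quoted in the description, and the log path must be supplied by the caller. It counts the lines in which BalanceVmCommand logs ACTION_TYPE_FAILED_VM_IS_NOT_RUNNING. A count of zero after the affinity rules enforcer has run is consistent with the fix.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Pattern;

public class BalanceVmLogScan {

    // Pattern based on the warning quoted in the bug description (an assumption:
    // other engine versions may format the line differently).
    private static final Pattern FAILED_BALANCE = Pattern.compile(
            "WARN\\s+\\[org\\.ovirt\\.engine\\.core\\.bll\\.BalanceVmCommand\\]"
                    + ".*ACTION_TYPE_FAILED_VM_IS_NOT_RUNNING");

    // Returns how many log lines report a BalanceVm validation failure for a VM that is not running.
    public static long countFailedBalance(List<String> lines) {
        return lines.stream()
                .filter(line -> FAILED_BALANCE.matcher(line).find())
                .count();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java BalanceVmLogScan <path-to-engine.log>");
            return;
        }
        Path log = Paths.get(args[0]);
        System.out.println(countFailedBalance(Files.readAllLines(log)));
    }
}
```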

<affinity_group>
    <name>af_group3</name>
    <hosts_rule>
        <enabled>true</enabled>
        <enforcing>true</enforcing>
        <positive>true</positive>
    </hosts_rule>
    <vms_rule>
        <enabled>false</enabled>
        <enforcing>true</enforcing>
        <positive>false</positive>
    </vms_rule>
    <hosts>
        <host href="/ovirt-engine/api/hosts/b97f27e5-1307-4dfa-a285-6fad766ebe82" id="b97f27e5-1307-4dfa-a285-6fad766ebe82"/>
    </hosts>
    <vms>
        <vm href="/ovirt-engine/api/vms/8b60e9a2-c834-4f47-a30b-dbb2e6d8f07b" id="8b60e9a2-c834-4f47-a30b-dbb2e6d8f07b"/>
    </vms>
</affinity_group>

<affinity_group>
    <name>af_group1</name>
    <hosts_rule>
        <enabled>true</enabled>
        <enforcing>false</enforcing>
        <positive>true</positive>
    </hosts_rule>
    <positive>true</positive>
    <vms_rule>
        <enabled>true</enabled>
        <enforcing>false</enforcing>
        <positive>true</positive>
    </vms_rule>
    <hosts>
        <host href="/ovirt-engine/api/hosts/9a50c448-61a1-4085-bfd1-62a6ee0b5525" id="9a50c448-61a1-4085-bfd1-62a6ee0b5525"/>
    </hosts>
    <vms>
        <vm href="/ovirt-engine/api/vms/0cded25e-63ef-43ed-996c-9cfc1934d37a" id="0cded25e-63ef-43ed-996c-9cfc1934d37a"/>
    </vms>
</affinity_group>

<affinity_group>
    <name>af_group2</name>
    <hosts_rule>
        <enabled>true</enabled>
        <enforcing>false</enforcing>
        <positive>true</positive>
    </hosts_rule>
    <positive>true</positive>
    <vms_rule>
        <enabled>true</enabled>
        <enforcing>false</enforcing>
        <positive>true</positive>
    </vms_rule>
    <hosts>
        <host href="/ovirt-engine/api/hosts/9a50c448-61a1-4085-bfd1-62a6ee0b5525" id="9a50c448-61a1-4085-bfd1-62a6ee0b5525"/>
    </hosts>
    <vms>
        <vm href="/ovirt-engine/api/vms/206a53a3-d04a-4e0c-83c5-f76b9757af40" id="206a53a3-d04a-4e0c-83c5-f76b9757af40"/>
    </vms>
</affinity_group>

Comment 7 RHV bug bot 2018-03-16 15:02:58 UTC
INFO: Bug status (VERIFIED) wasn't changed but the following should be fixed:

[Tag 'ovirt-engine-4.2.2.4' doesn't contain patch 'https://gerrit.ovirt.org/87320']
gitweb: https://gerrit.ovirt.org/gitweb?p=ovirt-engine.git;a=shortlog;h=refs/tags/ovirt-engine-4.2.2.4

For more info please contact: rhv-devops@redhat.com

Comment 11 errata-xmlrpc 2018-05-15 17:47:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:1488

Comment 12 Franta Kust 2019-05-16 13:06:19 UTC
BZ<2>Jira Resync

Comment 13 Daniel Gur 2019-08-28 13:13:21 UTC
sync2jira

Comment 14 Daniel Gur 2019-08-28 13:17:34 UTC
sync2jira

