Bug 1537343 - engine tries to balance vms that are down.
Status: CLOSED ERRATA
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.1.6
Hardware: x86_64 Linux
Priority: high
Severity: medium
Target Milestone: ovirt-4.2.2
Target Release: ---
Assigned To: Andrej Krejcir
QA Contact: Polina
Keywords: ZStream
Depends On:
Blocks: 1551582
Reported: 2018-01-22 20:04 EST by Germano Veit Michel
Modified: 2018-05-15 13:48 EDT
CC List: 12 users

See Also:
Fixed In Version: ovirt-engine-4.2.2
Doc Type: Bug Fix
Doc Text:
Previously, if a virtual machine with a strong positive affinity to a host was down, the affinity rules enforcer tried to migrate it, because it was not running on the specified host. When migration failed, the affinity rules enforcer tried repeatedly to migrate the same virtual machine, ignoring other virtual machines that violated affinity. In the current release, the affinity rules enforcer ignores virtual machines that are down.
Story Points: ---
Clone Of:
: 1551582 (view as bug list)
Environment:
Last Closed: 2018-05-15 13:47:24 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: SLA
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 3329091 | None | None | None | 2018-01-22 20:34 EST
oVirt gerrit | 87259 | master | MERGED | core: Check affinity rules only for running VMs | 2018-02-08 05:24 EST
oVirt gerrit | 87320 | ovirt-engine-4.2 | MERGED | core: Check affinity rules only for running VMs | 2018-02-08 09:11 EST
Red Hat Product Errata | RHEA-2018:1488 | None | None | None | 2018-05-15 13:48 EDT

Description Germano Veit Michel 2018-01-22 20:04:36 EST
Description of problem:

Logs spammed with:

2018-01-23 00:48:05,541Z WARN  [org.ovirt.engine.core.bll.BalanceVmCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-45) [ec22cb2] Validation of action 'BalanceVm' failed for user SYSTEM. Reasons: VAR__ACTION__MIGRATE,VAR__TYPE__VM,ACTION_TYPE_FAILED_VM_IS_NOT_RUNNING  

And tasks are full of failed BalanceVm.

Version-Release number of selected component (if applicable):
rhevm-4.1.6.2-0.1.el7.noarch
rhvm-4.2.0-0.6.el7.noarch (reproduced)

How reproducible:
100%

Steps to Reproduce:
1. Configure positive host affinity for 1 VM that is down
2. Check engine.log and tasks tab.

Actual results:
Flooded with errors

Expected results:
Don't balance (migrate) VMs that are down.

Additional info:

Inspecting the code, the problem looks to be in the function getVmToHostsAffinityGroupCandidates, specifically here:

         } else if (!affHosts.contains(vm.getRunOnVds()) && g.isVdsPositive()) {
             // Positive affinity violated
             vmToHostsAffinityMap.put(vm_id,
                     1 + vmToHostsAffinityMap.getOrDefault(vm_id, 0));
         }

affHosts is a Set, and when the VM is down getRunOnVds() returns null.
affHosts does not contain a null element, so the positive affinity is falsely reported as violated?
Maybe down VMs need to be filtered out to start with?
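
To illustrate the suspicion above, here is a minimal, self-contained sketch. It is not the engine code: the Vm class, the method names and the class name AffinityCheckSketch are hypothetical stand-ins, and the group is assumed to be a positive VM-to-host group. It only shows that a HashSet without a null element answers false for contains(null), so a down VM is counted as a violation unless it is skipped first.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

class AffinityCheckSketch {

    /** Hypothetical stand-in for the engine's VM object; only what the sketch needs. */
    static final class Vm {
        final UUID id;
        final UUID runOnVds; // null models a VM that is down

        Vm(UUID id, UUID runOnVds) {
            this.id = id;
            this.runOnVds = runOnVds;
        }
    }

    /** Same shape as the quoted condition: a down VM (null host) is counted as violating. */
    static Map<UUID, Integer> violationsUnfiltered(List<Vm> vms, Set<UUID> affHosts) {
        Map<UUID, Integer> violations = new HashMap<>();
        for (Vm vm : vms) {
            // contains(null) is false for a set with no null element
            if (!affHosts.contains(vm.runOnVds)) {
                violations.put(vm.id, 1 + violations.getOrDefault(vm.id, 0));
            }
        }
        return violations;
    }

    /** Assumed remedy: skip VMs that are not running before checking the affinity. */
    static Map<UUID, Integer> violationsRunningOnly(List<Vm> vms, Set<UUID> affHosts) {
        Map<UUID, Integer> violations = new HashMap<>();
        for (Vm vm : vms) {
            if (vm.runOnVds == null) {
                continue; // down VM: nothing to migrate
            }
            if (!affHosts.contains(vm.runOnVds)) {
                violations.put(vm.id, 1 + violations.getOrDefault(vm.id, 0));
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        UUID affinityHost = UUID.randomUUID();
        Set<UUID> affHosts = new HashSet<>();
        affHosts.add(affinityHost);

        Vm downVm = new Vm(UUID.randomUUID(), null);
        Vm compliantVm = new Vm(UUID.randomUUID(), affinityHost);
        List<Vm> vms = Arrays.asList(downVm, compliantVm);

        System.out.println("unfiltered flags down VM: "
                + violationsUnfiltered(vms, affHosts).containsKey(downVm.id));
        System.out.println("running-only flags down VM: "
                + violationsRunningOnly(vms, affHosts).containsKey(downVm.id));
    }
}
```

Whether the filtering belongs in this function or earlier in the candidate selection is left open here.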
Comment 4 Polina 2018-02-19 05:07:59 EST
Verified in rhvm-4.2.2-0.1.el7.noarch.

Created 3 affinity groups for 3 VMs (see XML below).

Scenarios tried:
1. Create an affinity group for a powered-off VM. Check in engine.log that there are no attempts to migrate it.
2. Create an affinity group for a running VM, then power it off and check engine.log for no errors.
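
The engine.log check in both scenarios can be scripted. The following is only a sketch under stated assumptions: the class and method names are hypothetical, the log path is taken from the command line rather than assumed, and the two marker strings are copied from the WARN line in the description. It counts lines that report a BalanceVm validation failure because the VM is not running.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

class EngineLogCheckSketch {

    // Marker strings copied from the WARN line quoted in the description.
    static final String COMMAND = "org.ovirt.engine.core.bll.BalanceVmCommand";
    static final String REASON = "ACTION_TYPE_FAILED_VM_IS_NOT_RUNNING";

    /** Counts log lines where BalanceVm failed validation because the VM was down. */
    static long countFailedBalanceOfDownVms(List<String> logLines) {
        return logLines.stream()
                .filter(line -> line.contains(COMMAND) && line.contains(REASON))
                .count();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java EngineLogCheckSketch.java <path to engine.log>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println("failed BalanceVm attempts on down VMs: "
                + countFailedBalanceOfDownVms(lines));
    }
}
```

A count of zero over the period after the affinity group was created, or after the VM was powered off, would match the expected result of both scenarios.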

    <affinity_group>
        <name>af_group3</name>
        <hosts_rule>
            <enabled>true</enabled>
            <enforcing>true</enforcing>
            <positive>true</positive>
        </hosts_rule>
        <vms_rule>
            <enabled>false</enabled>
            <enforcing>true</enforcing>
            <positive>false</positive>
        </vms_rule>
        <hosts>
            <host href="/ovirt-engine/api/hosts/b97f27e5-1307-4dfa-a285-6fad766ebe82" id="b97f27e5-1307-4dfa-a285-6fad766ebe82"/>
        </hosts>
        <vms>
            <vm href="/ovirt-engine/api/vms/8b60e9a2-c834-4f47-a30b-dbb2e6d8f07b" id="8b60e9a2-c834-4f47-a30b-dbb2e6d8f07b"/>
        </vms>
    </affinity_group>
        
    <affinity_group>
        <name>af_group1</name>
        <hosts_rule>
            <enabled>true</enabled>
            <enforcing>false</enforcing>
            <positive>true</positive>
        </hosts_rule>
        <positive>true</positive>
        <vms_rule>
            <enabled>true</enabled>
            <enforcing>false</enforcing>
            <positive>true</positive>
        </vms_rule>
        <hosts>
            <host href="/ovirt-engine/api/hosts/9a50c448-61a1-4085-bfd1-62a6ee0b5525" id="9a50c448-61a1-4085-bfd1-62a6ee0b5525"/>
        </hosts>
        <vms>
            <vm href="/ovirt-engine/api/vms/0cded25e-63ef-43ed-996c-9cfc1934d37a" id="0cded25e-63ef-43ed-996c-9cfc1934d37a"/>
        </vms>
    </affinity_group>
        
    <affinity_group>
        <name>af_group2</name>
        <hosts_rule>
            <enabled>true</enabled>
            <enforcing>false</enforcing>
            <positive>true</positive>
        </hosts_rule>
        <positive>true</positive>
        <vms_rule>
            <enabled>true</enabled>
            <enforcing>false</enforcing>
            <positive>true</positive>
        </vms_rule>
        <hosts>
            <host href="/ovirt-engine/api/hosts/9a50c448-61a1-4085-bfd1-62a6ee0b5525" id="9a50c448-61a1-4085-bfd1-62a6ee0b5525"/>
        </hosts>
        <vms>
            <vm href="/ovirt-engine/api/vms/206a53a3-d04a-4e0c-83c5-f76b9757af40" id="206a53a3-d04a-4e0c-83c5-f76b9757af40"/>
        </vms>
    </affinity_group>
Comment 7 RHV Bugzilla Automation and Verification Bot 2018-03-16 11:02:58 EDT
INFO: Bug status (VERIFIED) wasn't changed but the following should be fixed:

[Tag 'ovirt-engine-4.2.2.4' doesn't contain patch 'https://gerrit.ovirt.org/87320']
gitweb: https://gerrit.ovirt.org/gitweb?p=ovirt-engine.git;a=shortlog;h=refs/tags/ovirt-engine-4.2.2.4

For more info please contact: rhv-devops@redhat.com
Comment 11 errata-xmlrpc 2018-05-15 13:47:24 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:1488
