+++ This bug is a downstream clone. The original bug is: +++
+++ bug 1537343 +++
======================================================================

Description of problem:

Logs are spammed with:

2018-01-23 00:48:05,541Z WARN [org.ovirt.engine.core.bll.BalanceVmCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-45) [ec22cb2] Validation of action 'BalanceVm' failed for user SYSTEM. Reasons: VAR__ACTION__MIGRATE,VAR__TYPE__VM,ACTION_TYPE_FAILED_VM_IS_NOT_RUNNING

And the tasks tab is full of failed BalanceVm entries.

Version-Release number of selected component (if applicable):
rhevm-4.1.6.2-0.1.el7.noarch
rhvm-4.2.0-0.6.el7.noarch (reproduced)

How reproducible:
100%

Steps to Reproduce:
1. Configure positive host affinity for one VM that is down.
2. Check engine.log and the tasks tab.

Actual results:
Flooded with errors.

Expected results:
Don't balance (migrate) VMs that are down.

Additional info:
Inspecting the code, the problem appears to be in getVmToHostsAffinityGroupCandidates, specifically here:

} else if (!affHosts.contains(vm.getRunOnVds()) && g.isVdsPositive()) {
    // Positive affinity violated
    vmToHostsAffinityMap.put(vm_id, 1 + vmToHostsAffinityMap.getOrDefault(vm_id, 0));
}

affHosts is a Set, and when the VM is down getRunOnVds() returns null. affHosts never contains a null element, so the positive affinity is falsely reported as violated. Maybe down VMs need to be filtered out to start with?

(Originally by Germano Veit Michel)
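For illustration only, below is a minimal, self-contained sketch of the suspected logic and the kind of guard suggested above (skip VMs whose getRunOnVds() is null). The Vm class, method name, and surrounding structure are simplified stand-ins, not the actual oVirt engine code or the final fix:

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Simplified stand-ins for the engine types, for demonstration only.
public class AffinityViolationSketch {

    static class Vm {
        private final UUID id;
        private final UUID runOnVds; // null when the VM is down

        Vm(UUID id, UUID runOnVds) {
            this.id = id;
            this.runOnVds = runOnVds;
        }

        UUID getId() { return id; }
        UUID getRunOnVds() { return runOnVds; }
    }

    /**
     * Counts positive host-affinity violations per VM. VMs that are not running
     * (getRunOnVds() == null) are skipped, so a down VM is no longer falsely
     * reported as violating the group and queued for balancing/migration.
     */
    static Map<UUID, Integer> countPositiveHostAffinityViolations(
            Iterable<Vm> vms, Set<UUID> affHosts, boolean vdsPositive) {
        Map<UUID, Integer> vmToHostsAffinityMap = new HashMap<>();
        for (Vm vm : vms) {
            if (vm.getRunOnVds() == null) {
                continue; // down VM: nothing to migrate, no violation
            }
            if (!affHosts.contains(vm.getRunOnVds()) && vdsPositive) {
                // Positive affinity violated: VM runs on a host outside the group
                vmToHostsAffinityMap.put(vm.getId(),
                        1 + vmToHostsAffinityMap.getOrDefault(vm.getId(), 0));
            }
        }
        return vmToHostsAffinityMap;
    }

    public static void main(String[] args) {
        Set<UUID> affHosts = new HashSet<>();
        affHosts.add(UUID.randomUUID());

        Vm downVm = new Vm(UUID.randomUUID(), null);                   // down: must be ignored
        Vm misplacedVm = new Vm(UUID.randomUUID(), UUID.randomUUID()); // genuine violation

        Map<UUID, Integer> violations = countPositiveHostAffinityViolations(
                java.util.List.of(downVm, misplacedVm), affHosts, true);

        // Only the misplaced (running) VM is reported.
        System.out.println(violations);
    }
}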
Verified in rhvm-4.2.2-0.1.el7.noarch.

Created 3 affinity groups for 3 VMs (see XML below). Tried the following scenarios:
1. Create an affinity group for a powered-off VM. Check in engine.log that there are no attempts to migrate it.
2. Create an affinity group for a running VM, then power the VM off and check engine.log for no errors.

<affinity_group>
  <name>af_group3</name>
  <hosts_rule>
    <enabled>true</enabled>
    <enforcing>true</enforcing>
    <positive>true</positive>
  </hosts_rule>
  <vms_rule>
    <enabled>false</enabled>
    <enforcing>true</enforcing>
    <positive>false</positive>
  </vms_rule>
  <hosts>
    <host href="/ovirt-engine/api/hosts/b97f27e5-1307-4dfa-a285-6fad766ebe82" id="b97f27e5-1307-4dfa-a285-6fad766ebe82"/>
  </hosts>
  <vms>
    <vm href="/ovirt-engine/api/vms/8b60e9a2-c834-4f47-a30b-dbb2e6d8f07b" id="8b60e9a2-c834-4f47-a30b-dbb2e6d8f07b"/>
  </vms>
</affinity_group>

<affinity_group>
  <name>af_group1</name>
  <hosts_rule>
    <enabled>true</enabled>
    <enforcing>false</enforcing>
    <positive>true</positive>
  </hosts_rule>
  <positive>true</positive>
  <vms_rule>
    <enabled>true</enabled>
    <enforcing>false</enforcing>
    <positive>true</positive>
  </vms_rule>
  <hosts>
    <host href="/ovirt-engine/api/hosts/9a50c448-61a1-4085-bfd1-62a6ee0b5525" id="9a50c448-61a1-4085-bfd1-62a6ee0b5525"/>
  </hosts>
  <vms>
    <vm href="/ovirt-engine/api/vms/0cded25e-63ef-43ed-996c-9cfc1934d37a" id="0cded25e-63ef-43ed-996c-9cfc1934d37a"/>
  </vms>
</affinity_group>

<affinity_group>
  <name>af_group2</name>
  <hosts_rule>
    <enabled>true</enabled>
    <enforcing>false</enforcing>
    <positive>true</positive>
  </hosts_rule>
  <positive>true</positive>
  <vms_rule>
    <enabled>true</enabled>
    <enforcing>false</enforcing>
    <positive>true</positive>
  </vms_rule>
  <hosts>
    <host href="/ovirt-engine/api/hosts/9a50c448-61a1-4085-bfd1-62a6ee0b5525" id="9a50c448-61a1-4085-bfd1-62a6ee0b5525"/>
  </hosts>
  <vms>
    <vm href="/ovirt-engine/api/vms/206a53a3-d04a-4e0c-83c5-f76b9757af40" id="206a53a3-d04a-4e0c-83c5-f76b9757af40"/>
  </vms>
</affinity_group>

(Originally by Polina Agranat)
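For reference, a group like the ones above can be created by POSTing the XML to the cluster's affinity groups collection of the REST API. The engine host name, cluster ID, credentials, and XML file name below are placeholders, not values from this verification:

# Placeholders only: adjust the engine FQDN, cluster UUID, credentials, and XML file.
curl -k -u 'admin@internal:password' \
     -H 'Content-Type: application/xml' \
     -X POST \
     -d @af_group3.xml \
     'https://engine.example.com/ovirt-engine/api/clusters/<cluster-id>/affinitygroups'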
Verified on rhv-release-4.1.10-5-001.noarch. Ran the following scenarios; no errors in engine.log.

Created affinity groups with:
- two VMs, one Up and one Down, then stopped the running VM.
- two VMs Down.

Using the following group configurations:
- VM affinity positive, Host affinity positive, Enforcing true/false.
- VM affinity negative, Host affinity negative, Enforcing true/false.
- VM affinity positive, Host affinity negative, Enforcing true/false.
- VM affinity negative, Host affinity positive, Enforcing true/false.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0562