Description of problem:
When migrating a VM, the scheduler subtracts the VM's CPU load from the load of the host where it is currently running, so that the host appears as if the VM were not running there. This calculation is incorrect when the cluster does not have the 'count threads as cores' option set.

Version-Release number of selected component (if applicable):
4.3 and 4.2

Steps to Reproduce:
1. Unset the 'count threads as cores' option for the cluster.
2. Have 2 hosts (host1, host2) with 2 threads per CPU core.
3. Run 2 VMs (VM1, VM2) on host1, each with the same number of cores as the host, but 1 thread per core.
4. Add VM1 and host1 to an affinity group (positive, enforcing).
5. Create 100% CPU load on VM2 (for example by running: python -c 'while True: pass').
6. Enable EvenDistribution balancing on the cluster. Set HighUtilization to 50.
7. Wait to see whether VM2 is migrated to host2 by the balancing.

Actual results:
VM2 is not migrated. The scheduler incorrectly computes the CPU load of host1, so host1 is considered the best candidate for the migration and the VM does not migrate.

Expected results:
VM2 migrates to host2.
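To illustrate why the CPU count used in the subtraction matters, here is a minimal sketch in Python. It is not taken from the ovirt-engine source. The function name, the example numbers, and the assumption that host load is reported as a percentage across all hardware threads are hypothetical and only serve to show how the result depends on the divisor. The report does not say which divisor the engine actually used.

```python
def host_load_without_vm(host_load_pct, vm_load_pct, vm_cpus, host_cpus):
    """Estimate host CPU load (percent) as if the VM were not running there.

    Hypothetical helper: the VM's load is scaled by the ratio of its vCPUs
    to the host CPU count before it is subtracted from the host load.
    """
    vm_share_pct = vm_load_pct * vm_cpus / host_cpus
    # A load estimate below zero makes no sense, so clamp it.
    return max(0.0, host_load_pct - vm_share_pct)


# Assumed example values, not measurements from the report.
HOST_CORES = 4
THREADS_PER_CORE = 2
HOST_THREADS = HOST_CORES * THREADS_PER_CORE
VM2_CPUS = 4           # same number of cores as the host, 1 thread per core
HOST_LOAD_PCT = 60.0   # assumed to be measured across all 8 threads
VM2_LOAD_PCT = 100.0

# Divisor matches the basis of the host load (threads).
with_threads = host_load_without_vm(HOST_LOAD_PCT, VM2_LOAD_PCT, VM2_CPUS, HOST_THREADS)

# Divisor is the core count: the VM's share is doubled and the host looks idle.
with_cores = host_load_without_vm(HOST_LOAD_PCT, VM2_LOAD_PCT, VM2_CPUS, HOST_CORES)

print(with_threads, with_cores)
```

Under these assumptions, a divisor that does not match the basis on which the host load is measured makes the source host look emptier than it really is, which would be consistent with host1 being picked as the best target for VM2.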
Verification failed on: ovirt-engine-4.3.0-0.4.master.20190103151009.git5251adc.el7.noarch

Steps:
1. Unset the 'count threads as cores' option for the cluster.
2. Have 2 hosts (host1, host2) with 2 threads per CPU core.
3. Run 2 VMs (VM1, VM2) on host1, each with the same number of cores as the host, but 1 thread per core.
4. Add VM1 and host1 to an affinity group (positive, enforcing).
5. Create 100% CPU load on VM2 (for example by running: python -c 'while True: pass').
6. Enable EvenDistribution balancing on the cluster. Set HighUtilization to 50.
7. Wait to see whether VM2 is migrated to host2 by the balancing.

Actual results:
VM2 is not migrated.

Expected results:
VM2 migrates to host2.

Additional info:
I used a hosted engine (HE) environment. At the start, all 3 VMs were on host1. The HE VM has 4 vCPUs, configured as cores. Each host has 24 cores and 2 threads per core. The scheduler migrated the HE VM first. The load on the host stayed at 50%. Nothing further happened afterwards (just a loop). In this state, setting the 'count threads as cores' option immediately caused VM2 to migrate.
Re-targeting to 4.3.1 since it is missing a patch, an acked blocker flag, or both
Verification failed on: ovirt-engine-4.3.2-0.1.el7.noarch

Steps:
1. Unset the 'count threads as cores' option for the cluster.
2. Have 2 hosts (host1, host2) with 2 threads per CPU core.
3. Run 2 VMs (VM1, VM2) on host1, each with the same number of cores as the host, but 1 thread per core.
4. Add VM1 and host1 to an affinity group (positive, enforcing).
5. Create 100% CPU load on VM2 (for example by running: python -c 'while True: pass').
6. Enable EvenDistribution balancing on the cluster. Set HighUtilization to 50.
7. Wait to see whether VM2 is migrated to host2 by the balancing.

Actual results:
VM2 is not migrated.

Expected results:
VM2 migrates to host2.

Additional info:
I used a hosted engine (HE) environment. At the start, all 3 VMs were on host1. The HE VM has 4 vCPUs, configured as cores. Each host has 24 cores and 2 threads per core. The scheduler migrated the HE VM first. The load on the host stayed at 50%. Nothing further happened afterwards (just a loop).
Afterwards, I changed the load on VM2 (which has no affinity to the host) to 80% and loaded VM1 to 25% (total load on the host > 50%). The result was the same as above: no migration happened.
Then I set the 'count threads as cores' option and VM2 immediately migrated away.
The version ovirt-engine-4.3.2-0.1.el7.noarch does not yet contain the patch that fixes this bug. It was released before the patch was merged to master. Please verify this on a newer version or on the latest master snapshot.
Verified on: ovirt-engine-4.3.2.1-0.0.master.20190305140204.git3649df7.el7.noarch

Steps:
1. Unset the 'count threads as cores' option for the cluster.
2. Have 2 hosts (host1, host2) with 2 threads per CPU core.
3. Run 2 VMs (VM1, VM2) on host1, each with the same number of cores as the host, but 1 thread per core.
4. Add VM1 and host1 to an affinity group (positive, enforcing).
5. Create 100% CPU load on VM2 (for example by running: python -c 'while True: pass').
6. Enable EvenDistribution balancing on the cluster. Set HighUtilization to 50.
7. Wait to see whether VM2 is migrated to host2 by the balancing.

Actual results:
The HE VM migrated first from host1 to host2 (it was the least loaded VM); afterwards, VM2 migrated to host2.
This bugzilla is included in the oVirt 4.3.2 release, published on March 19th 2019. Since the problem described in this bug report should be resolved in the oVirt 4.3.2 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.