Created attachment 1536308 [details]
soft affinity rule
Description of problem:
Customer reported issue as below:
I have 3 servers in a soft anti-affinity group. This still leads to two of those VMs being started on the same server.
Expectation: as long as there are hardware hypervisors with plenty of free resources, the anti-affinity should work.
I tried to reproduce it at my end, and soft affinity indeed does not work as expected.
See the attached screenshots for the settings and the observations in my test environment.
I have three hosts in my environment, and two VMs.
I configured soft affinity where:
VM affinity rule > Negative
Host affinity rule > Negative
Virtual Machines : Hosts
test2 RHV420H2 <<<< test2 should start on RHV420H2
testz root <<< testz should start on root
But the VMs do not start as expected: sometimes they start as above and sometimes they do not.
Created attachment 1536321 [details]
Created attachment 1536322 [details]
Customer's environment is: rhvm-220.127.116.11-0.1.el7ev.noarch
My test environment is: rhvm-18.104.22.168-0.1.el7ev.noarch
The description here is unclear, and it's almost certainly a duplicate of either https://bugzilla.redhat.com/show_bug.cgi?id=1651747 or https://bugzilla.redhat.com/show_bug.cgi?id=1594810
Soft affinity is best effort, and a hard requirement should use hard affinity. In any case, there will be scheduler improvements in 4.3 which may mitigate this, but it's always going to be "best effort" support.
Let me know what details you need here.
There are really two questions:
1) why not hard affinity?
2) how does this differ from https://bugzilla.redhat.com/show_bug.cgi?id=1651747 ?
In general, bugs like "Foo does not work" or "improve Foo" don't give us a clear idea of what to work on. Your initial comment contains the expected result, but it's quite broad, and still appears to be a duplicate. Why not attach the case to the other bug?
It looks like there are two issues here:
1. In the screenshot of the affinity group, the host affinity is set to 'Negative', which means that the VMs should NOT run on any of the selected hosts.
2. The default importance of VM soft affinity is low. When starting a VM, the soft affinity may be overridden by other weight modules, for example CPU load or free memory.
The importance can be increased by creating a custom scheduling policy and increasing the factor for the "VmAffinityGroups" weight module.
This does not look like a bug. Increasing the factor of the weight module should make the scheduler prefer soft affinity over CPU load or memory load.
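To illustrate the point about factors, here is a toy model of factor-weighted host selection. It is only a sketch and not the engine's actual scheduler code: the host names, the "CpuLoad" module name, the scores, and the lower-total-wins convention are all assumptions made for the example. Only "VmAffinityGroups" is the module name mentioned above.

```python
# Toy model of factor-weighted host selection (illustration only).
# Assumptions: each weight module gives every host a penalty score,
# the scheduler multiplies each score by the module's factor, and the
# host with the lowest total wins. Hosts and scores are hypothetical.

def pick_host(scores, factors):
    """Return (best_host, totals) using factor-weighted penalty sums."""
    totals = {
        host: sum(factors.get(module, 1) * value
                  for module, value in per_module.items())
        for host, per_module in scores.items()
    }
    return min(totals, key=totals.get), totals

# hostA already runs a VM from the same anti-affinity group, so the
# affinity module penalises it; hostB is free of group members but busier.
scores = {
    "hostA": {"VmAffinityGroups": 1, "CpuLoad": 2},
    "hostB": {"VmAffinityGroups": 0, "CpuLoad": 4},
}

# With equal factors the CPU load difference outweighs the affinity penalty.
default_choice, default_totals = pick_host(
    scores, {"VmAffinityGroups": 1, "CpuLoad": 1})

# With a larger affinity factor the anti-affinity preference dominates.
tuned_choice, tuned_totals = pick_host(
    scores, {"VmAffinityGroups": 10, "CpuLoad": 1})

print(default_choice, default_totals)
print(tuned_choice, tuned_totals)
```

In this model the equal-factor run picks hostA even though that breaks the soft anti-affinity, while the run with the larger "VmAffinityGroups" factor picks hostB. The real policy has more modules and its own scoring, so treat the numbers as illustrative only.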
1) I have tested again with host affinity set to 'Positive', and it still does not work. I am attaching a screenshot of it to the bug.
2) There is no load on the hosts here. Should that still be a consideration?
Created attachment 1536712 [details]
soft affinity rule with host affinity positive
Created attachment 1536714 [details]
VM start on wrong hosts when host affinity positive
Comment #7 is still relevant then. Please include logs (in debug, ideally) when you see it's not working as expected.
Which logs are required here?
Engine logs, hopefully in debug
I am unable to find a way to get debug logs for RHV 4.x.
I found one, but it's for RHEV 3.x: https://access.redhat.com/solutions/435333
The logging settings can now be found in /usr/share/ovirt-engine/services/ovirt-engine
Editing ovirt-engine-logging.properties.in is the easiest way to do it.