Bug 1651747 - All virtual machines in an affinity group do not run on same host after migrating one of the VMs with VM Soft Affinity.
Summary: All virtual machines in an affinity group do not run on same host after migrating one of the VMs with VM Soft Affinity.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.2.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ovirt-4.3.5
Assignee: Andrej Krejcir
QA Contact: Polina
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-11-20 16:45 UTC by Abhishekh Patil
Modified: 2019-08-28 13:16 UTC
CC List: 13 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
In this release, the Affinity Enforcement process now includes Soft Virtual Machine Affinity.
Clone Of:
Environment:
Last Closed: 2019-08-12 11:53:27 UTC
oVirt Team: Virt
Target Upstream Version:


Attachments (Terms of Use)
engine with debug (264.81 KB, text/plain)
2019-06-23 12:59 UTC, Polina


Links
System ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 3891001 None None None 2019-02-06 12:52:37 UTC
Red Hat Product Errata RHEA-2019:2431 None None None 2019-08-12 11:53:39 UTC
oVirt gerrit 96057 None MERGED core: Small optimizations in AffinityRulesEnforcer 2019-12-02 10:04:00 UTC
oVirt gerrit 96058 None ABANDONED core: Affinity enforcer computes blacklists and whitelists for migration 2019-12-02 10:04:00 UTC
oVirt gerrit 96059 None MERGED core: Affinity rules enforcer checks soft vm-to-vm affinity 2019-12-02 10:03:59 UTC
oVirt gerrit 96060 None ABANDONED scheduler: Increase the factor of VmAffinityWeightPolicyUnit 2019-12-02 10:03:59 UTC
oVirt gerrit 99181 None MERGED scheduler: Fix logic in VmToHostAffinityPolicyUnit 2019-12-02 10:03:58 UTC
oVirt gerrit 99182 None MERGED core: Affinity rules enforcer tries to migrate multiple VMs until success 2019-12-02 10:03:59 UTC
oVirt gerrit 99706 None MERGED core: Affinity enforcer checks if migration would improve situation 2019-12-02 10:03:58 UTC
oVirt gerrit 99805 None MERGED scheduler: Add builder for calling schedule() and canSchedule() 2019-12-02 10:03:59 UTC
oVirt gerrit 100119 None MERGED core: Small optimizations in AffinityRulesEnforcer 2019-12-02 10:04:01 UTC
oVirt gerrit 100120 None MERGED scheduler: Fix logic in VmToHostAffinityPolicyUnit 2019-12-02 10:04:01 UTC
oVirt gerrit 100121 None MERGED core: Affinity rules enforcer tries to migrate multiple VMs until success 2019-12-02 10:04:00 UTC
oVirt gerrit 100122 None MERGED core: Affinity rules enforcer checks soft vm-to-vm affinity 2019-12-02 10:04:00 UTC
oVirt gerrit 100123 None MERGED scheduler: Add builder for calling schedule() and canSchedule() 2019-12-02 10:04:00 UTC
oVirt gerrit 100124 None MERGED core: Affinity enforcer checks if migration would improve situation 2019-12-02 10:04:00 UTC

Description Abhishekh Patil 2018-11-20 16:45:41 UTC
Description of problem:

Virtual machines do not migrate as expected after enabling VM-to-VM soft affinity (positive VM polarity). If we migrate one VM to another host, the other VM in the affinity group does not follow it, even after a very long time.

Version-Release number of selected component (if applicable):

ovirt-engine-4.2.7

How reproducible:
100%

Steps to Reproduce:

1) Create a new Scheduling Policy for the cluster (Administration -> Configure -> Scheduling Policies -> vm_evenly_distributed -> Copy (as Affinity_testing) -> Affinity_testing -> Edit):

Leave Filter Modules as-is, remove all but VmAffinityGroups Weight Module, set its Factor to 100, disable Load Balancer, leave Properties as-is.

2) Select the Policy for the Cluster from Compute -> Clusters -> Default -> Edit -> Scheduling Policy -> Affinity_testing and leave everything as defaults.

3) Create two VMs on Compute -> Virtual Machines and, in the VM view, check that both have under Affinity Groups an Affinity Group with VM Affinity Rule Positive (non-Enforcing) and Host Affinity Rule Disabled, with both VMs listed under Virtual Machines.

4) Shut down both VMs, then start them both and notice that they start on the same host. Migrate either of them to another host and notice that the other VM does not follow it, even after a long period of time.
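As a rough illustration of what the custom policy in step 1 does, here is a minimal Python sketch of a scheduler where a single weight module with a large factor dominates host selection. All names here are hypothetical; this is not the actual oVirt scheduler code.

```python
# Illustrative model: with only the affinity weight module enabled and its
# factor set to 100, the affinity penalty decides where a VM is started.

def vm_affinity_weight(host, vm, running_vms, affinity_groups):
    """Return a penalty of 1 if placing `vm` on `host` separates it from
    positive-affinity peers running elsewhere, else 0."""
    for group in affinity_groups:
        if vm not in group:
            continue
        peers = [v for v in group if v != vm and v in running_vms]
        if peers and any(running_vms[p] != host for p in peers):
            return 1
    return 0

def pick_host(hosts, vm, running_vms, affinity_groups, factor=100):
    # Lower total weighted score wins; with factor=100 the affinity
    # penalty outweighs any other (omitted) weight modules.
    scores = {h: factor * vm_affinity_weight(h, vm, running_vms, affinity_groups)
              for h in hosts}
    return min(hosts, key=lambda h: scores[h])
```

With VM1 already running on host2 and both VMs in one positive group, `pick_host` would start VM2 on host2, which is why step 4 sees both VMs start together.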



Actual results:

VMs in an affinity group do not migrate even after migrating one VM from the affinity group to another host.

Expected results:

The following documentation says "Soft enforcement - indicates a preference for virtual machines in an affinity group to run on the specified host or hosts in a group when possible." 

https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.2/html-single/virtual_machine_management_guide/#sect-Affinity_Groups

So, all VMs in an affinity group should run on the same host, or on hosts in a group, when possible.

Comment 1 Ryan Barry 2018-11-21 00:27:55 UTC
This still appears to be reproducible (I encountered it today)

Yanir, are there any changes to the default scheduling policy needed to make this work as expected? The policies I looked at today already had values for affinity set to priority 10

Comment 2 Andrej Krejcir 2018-11-21 10:42:45 UTC
Looking at the code, the component responsible for migrating VMs that break affinity rules (AffinityRulesEnforcementManager) does not consider soft VM-VM affinity when choosing a VM to migrate. I think this is a bug, AREM should try to correct soft VM-VM affinity too.

Currently the soft VM-VM affinity is only used when starting or migrating a VM and during balancing. But the balancing chooses which VM to migrate based on CPU and memory load, not affinity rules. So in case the hosts are not overloaded, the VMs will not be migrated even if they break soft affinity.
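The missing behavior described above can be sketched as an extra enforcement pass over soft positive groups. This is an illustrative Python model of the idea only; the function and field names are invented and do not reflect the real AffinityRulesEnforcementManager code.

```python
# Hypothetical sketch: scan positive VM-to-VM affinity groups and, when a
# group's VMs are split across hosts, propose migrating a VM from a
# minority host to the majority host.

from collections import Counter

def find_soft_affinity_migration(groups, vm_to_host):
    """Return a (vm, target_host) candidate migration, or None if every
    group already has all of its running VMs on one host."""
    for group in groups:
        placed = {vm: vm_to_host[vm] for vm in group if vm in vm_to_host}
        hosts = Counter(placed.values())
        if len(hosts) <= 1:
            continue  # group already together (or empty): nothing to fix
        target, _ = hosts.most_common(1)[0]
        for vm, host in placed.items():
            if host != target:
                return (vm, target)  # candidate migration
    return None
```

In the reported scenario, the pre-fix enforcer effectively skipped this pass for soft groups, so the split was never corrected unless load balancing happened to move a VM.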

Comment 16 Gajanan 2019-02-20 11:09:56 UTC
Can someone help check whether this bug is similar to Bug 1678708?

Comment 17 Andrej Krejcir 2019-02-20 13:42:02 UTC
This bug is about enabling automatic migration to fix broken affinity groups even for soft VM affinity groups.
Bug 1678708 appears to be about VM soft affinity not being applied when starting VMs.

These bugs are different.

Comment 22 Polina 2019-06-23 12:58:38 UTC
Hi Andrej, 

I'm testing on ovirt-engine-4.3.5.1-0.1.el7.noarch. According to the BZ, the patch must be there, but I see some unexpected behavior.

Sometimes I migrate VM1 (placed in a soft positive VM affinity rule) and VM2 is balanced, as expected because of the affinity rule, to the same new host to be together with VM1.
But in most cases I see different behavior:
I migrate VM1 and wait for VM2 to be balanced. Instead, I see that VM1 is balanced back to the source host.

For example:
Jun 23, 2019, 3:40:04 PM Migration started (VM: golden_env_mixed_virtio_0, Source: host_mixed_1, Destination: host_mixed_2, User: admin@internal-authz). 
Jun 23, 2019, 3:40:27 PM Migration completed (VM: golden_env_mixed_virtio_0, Source: host_mixed_1, Destination: host_mixed_2, Duration: 12 seconds, Total: 22 seconds, Actual downtime: 59ms)
Jun 23, 2019, 3:40:57 PM Migration initiated by system (VM: golden_env_mixed_virtio_0, Source: host_mixed_2, Destination: host_mixed_1, Reason: Affinity rules enforcement)
Jun 23, 2019, 3:41:12 PM Migration completed (VM: golden_env_mixed_virtio_0, Source: host_mixed_2, Destination: host_mixed_1, Duration: 4 seconds, Total: 14 seconds, Actual downtime: 63ms)

I attach the engine log with scheduler debug. Please see the example starting from the line
2019-06-23 15:40:04,502+03 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-3) [5a79e5ef-c2aa-409d-b606-c26b6f778088] EVENT_ID: VM_MIGRATION_START(62), Migration started (VM: golden_env_mixed_virtio_0, Source: host_mixed_1, Destination: host_mixed_2, User: admin@internal-authz).
Two VMs are running on host_mixed_1. I migrate VM golden_env_mixed_virtio_0 to host_mixed_2. It succeeds, and then after 30 seconds this VM is balanced back, instead of the expected behavior of the second VM, golden_env_mixed_virtio_1, being migrated to host_mixed_2.

Could you please look?

Comment 23 Polina 2019-06-23 12:59:21 UTC
Created attachment 1583695 [details]
engine with debug

Comment 24 Andrej Krejcir 2019-06-24 07:34:46 UTC
That is the expected behavior. The affinity enforcement just tries to migrate VMs so that the affinity is not broken.
It does not matter whether VM2 migrates to VM1's host or VM1 migrates back to VM2's host; both cases are OK, since the affinity is fixed.
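The point in this comment can be shown with a toy Python model (hypothetical names, not oVirt code): a soft positive group is satisfied whenever all of its running VMs share a host, regardless of which host, so the enforcer only needs a migration that turns a split group into a satisfied one.

```python
# Toy model: either migration direction "improves the situation" for a
# split positive-affinity group, because satisfaction is host-agnostic.

def group_satisfied(group, vm_to_host):
    """A group is satisfied when all its placed VMs share one host."""
    hosts = {vm_to_host[vm] for vm in group if vm in vm_to_host}
    return len(hosts) <= 1

def migration_improves(group, vm_to_host, vm, dest):
    """True if moving `vm` to `dest` turns a split group into a satisfied one."""
    before = group_satisfied(group, vm_to_host)
    after_map = dict(vm_to_host, **{vm: dest})
    return not before and group_satisfied(group, after_map)
```

With VM1 on host_mixed_2 and VM2 on host_mixed_1, moving VM2 to host_mixed_2 and moving VM1 back to host_mixed_1 both count as improvements, which is why the enforcer's choice in comment 22 is legitimate.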

If the user does not want VM1 to migrate back, its migration mode should be set to "manual migration only".

Comment 25 Polina 2019-06-24 10:52:08 UTC
Verified on the basis of https://bugzilla.redhat.com/show_bug.cgi?id=1651747#c22 and #c24.

Comment 29 errata-xmlrpc 2019-08-12 11:53:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2431

Comment 30 Daniel Gur 2019-08-28 13:12:32 UTC
sync2jira

Comment 31 Daniel Gur 2019-08-28 13:16:45 UTC
sync2jira

