Bug 1537414 - Engine trying to assign MAC addresses which are in use by VMs in a different cluster
Summary: Engine trying to assign MAC addresses which are in use by VMs in a different cluster
Keywords:
Status: CLOSED DUPLICATE of bug 1593800
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Network
Version: 4.2.1.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ovirt-4.3.0
: ---
Assignee: Leon Goldberg
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On: 1492577 1593800 1639460 1767319
Blocks:
 
Reported: 2018-01-23 06:52 UTC by Michael Burman
Modified: 2019-10-31 07:38 UTC
CC: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-08-08 08:49:12 UTC
oVirt Team: Network
Embargoed:
rule-engine: ovirt-4.3+
rule-engine: blocker+


Attachments (Terms of Use)
engine log (297.99 KB, application/x-gzip)
2018-01-23 06:52 UTC, Michael Burman

Description Michael Burman 2018-01-23 06:52:10 UTC
Created attachment 1384689 [details]
engine log

Description of problem:
Engine trying to assign MAC addresses which are in use by VMs.

I have seen this behaviour for some time now, but couldn't understand exactly what was happening. In some cases the engine tried, on the first attempt, to assign a MAC address already in use by a VM, and on the second attempt a free MAC address was assigned.

I saw this behaviour on VM import from a data domain (DR flow), when the engine tried to assign MAC addresses that were in use, and the vNIC was imported as unplugged because of it.
I was afraid that this behaviour was caused by the wrong behaviour of the MAC-addresses-per-cluster feature - BZ 1492577

But now I believe it's a different engine bug, and it's easy to reproduce.

Version-Release number of selected component (if applicable):
4.2.1.2-0.1.el7

How reproducible:
100%

Steps to Reproduce: 
1. Create 3 VMs with 1 vNIC each
2. Create MAC pool range in the cluster with 20 MAC addresses
3. Remove the NIC from vm1 and add a new one; do the same on vm2, and then on vm3. At some point the engine will start assigning MAC addresses which are in use by the other VM(s); if you try to add the NIC again on a second attempt, the engine will then assign a free MAC address.

Actual results:
2018-01-23 08:49:00,344+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-16) [6eb18838] EVENT_ID: NETWORK_ADD_VM_INTERFACE_FAILED(933), Failed to add Interface nic1 (VirtIO) to VM V3. (User: admin@internal-authz)

Expected results:
The engine shouldn't try to assign MAC addresses which are in use.
This bug affects the DR flow for VM import + re-assigning a bad MAC.

Additional info:
Not sure if it's a regression or not; it's possible that this behaviour was always there.

Comment 1 Michael Burman 2018-01-30 09:48:50 UTC
Further investigation shows that this bug is directly affected by the wrong implementation in BZ 1492577.

If there is only one cluster with one MAC pool range, everything behaves as expected.
Problems occur when you have 2 clusters with different MAC pools, but some of the MACs are in the same range.

Example - 2 clusters, each with its own MAC pool range, but with some equal MAC addresses in the ranges.

So if you have VM1 in cluster1 and you add a vNIC to it, then when you try to add a vNIC to VM2 in cluster2, the engine will try to assign the same MAC address it assigned to VM1, and it will then show a warning that the MAC is already in use.
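The overlap scenario above can be sketched with a toy model (the class and method names here are hypothetical illustrations, not the actual ovirt-engine API): two pools that each track only their own allocations will independently hand out the same address.

```python
class MacPool:
    """Toy per-cluster MAC pool: tracks only its own allocations."""

    def __init__(self, start, count):
        # Represent MACs as integers for simplicity.
        self.free = list(range(start, start + count))
        self.used = set()

    def allocate(self):
        # Hand out the lowest free MAC; the pool knows nothing
        # about allocations made by other pools.
        mac = self.free.pop(0)
        self.used.add(mac)
        return mac


# Two clusters whose pool ranges overlap (here: identical 20-MAC ranges).
pool_cluster1 = MacPool(start=0x566F00000000, count=20)
pool_cluster2 = MacPool(start=0x566F00000000, count=20)

mac_vm1 = pool_cluster1.allocate()  # first MAC from cluster1's pool
mac_vm2 = pool_cluster2.allocate()  # the very same MAC, from cluster2's pool

# Each pool is internally consistent, yet the system now holds a duplicate.
print(mac_vm1 == mac_vm2)  # True
```

This is exactly the situation the per-cluster design permits by construction: neither pool is doing anything wrong in isolation.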

Comment 3 Michael Burman 2018-05-01 06:13:15 UTC
Doesn't exist in 4.1, only 4.2

Comment 4 Alona Kaplan 2018-05-01 12:32:16 UTC
The original design of the MAC pool feature is that each MAC pool manages its own MACs.
This means that different MAC pools can have VMs with the same MAC addresses.

The bug is that in the ActivateVm command (a command which is called when the vNIC is marked as 'plugged') we validate that there is no MAC in the whole SYSTEM identical to the MAC of the updated/created vNIC.
This means that even if the MAC doesn't exist in the relevant MAC pool (which is the case here), we can get a validation error if the MAC is used in another MAC pool.

The bug can easily be fixed by changing the validation in the ActivateVm command to check only the relevant MAC pool.
But maybe the design was wrong in the first place, and this bug saves us from having duplicate MACs in the system...
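A minimal sketch of the two validation scopes (the function names are illustrative, not the actual ActivateVm code): the buggy check searches every pool in the system, while the fix described above would consult only the pool of the vNIC's cluster.

```python
def mac_in_use_system_wide(mac, pools):
    """Buggy scope: rejects a MAC used by ANY pool in the system."""
    return any(mac in pool for pool in pools)


def mac_in_use_in_pool(mac, pool):
    """Fixed scope: rejects a MAC only if the relevant pool already uses it."""
    return mac in pool


# cluster1's pool already holds the MAC; cluster2's pool does not.
pool1 = {"56:6f:00:00:00:01"}
pool2 = set()
mac = "56:6f:00:00:00:01"

# Plugging a vNIC with this MAC on a VM in cluster2:
print(mac_in_use_system_wide(mac, [pool1, pool2]))  # True  -> spurious failure
print(mac_in_use_in_pool(mac, pool2))               # False -> allocation succeeds
```

Under the original per-pool design, only the second check is consistent with the feature's intent; the first one leaks state across pool boundaries.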

There are some ways to make the bug 'appear'.
1. Having MAC pools with overlapping ranges.
2. Adding custom MACs from the range of another MAC pool.
3. Moving a VM to a different cluster (with a different MAC pool).
4. Changing the MAC pool of a cluster that already has VMs with MACs allocated from the old MAC pool. (This is what happened in the tested environment.)

BTW, I managed to reproduce the bug in 4.1.


There are several possible solutions to the problem.
1. Redesign the feature to have only one MAC pool for the whole system, only adding ranges per cluster.
2. Leave the original design as is.
  2.1 Fix the validation in the ActivateVm command to check only the relevant MAC pool and not the whole system.
  2.2 Add a warning when doing actions that may cause the bug to appear (creating a new MAC pool with an overlapping range, etc.).
  2.3 Add an audit log that warns the user (once a day?) if there are duplicate MACs in the system.

Comment 5 Alona Kaplan 2018-05-01 12:32:56 UTC
(In reply to Michael Burman from comment #3)
> Doesn't exist in 4.1, only 4.2

I managed to reproduce it in 4.1.

Comment 6 Michael Burman 2018-05-01 14:30:51 UTC
(In reply to Alona Kaplan from comment #5)
> (In reply to Michael Burman from comment #3)
> > Doesn't exist in 4.1, only 4.2
> 
> I managed to reproduce it in 4.1.

Nice, then I'm removing the regression flag. Thanks for checking that.

Comment 7 Beni Pelled 2018-05-06 13:37:11 UTC
I don't like the idea of "Allow Duplicates" - it may create a lot of trouble and mess - and therefore I think the first solution (a single MAC pool for the whole system, with the possibility of assigning ranges to clusters) is the right one.
Of course we need to make sure that there is NO possibility to set two overlapping ranges, and the association of a range to a specific cluster should be done from the same screen as the MAC pool management.

If someone insists that they need an environment with the ability to set duplicate MACs, we can add this as an option in the 'engine-config' tool (instead of the check-box that exists today), so that it will not confuse anyone and no one will enable it by mistake.

In addition, we need to take into account cases of moving VMs between clusters, changing existing range settings, etc.

Comment 8 eraviv 2018-08-08 08:49:12 UTC

*** This bug has been marked as a duplicate of bug 1593800 ***

