Bug 1435485 - [Pool] VMs are still created with duplicate MAC addresses after 4.0.7 upgrade
Summary: [Pool] VMs are still created with duplicate MAC addresses after 4.0.7 upgrade
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.0.7
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Target Milestone: ovirt-4.2.0
Target Release: ---
Assignee: Dan Kenigsberg
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On: 1492723
Blocks: 1485688
Reported: 2017-03-23 23:46 UTC by Germano Veit Michel
Modified: 2021-08-30 11:54 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, when creating large pools of virtual machines, the Engine assigned duplicate MAC addresses, causing network problems. In the current release, the problem has been solved by removing a problematic legacy method.
Clone Of:
: 1485688 (view as bug list)
Environment:
Last Closed: 2018-05-15 17:41:09 UTC
oVirt Team: Network
Target Upstream Version:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 2775081 | 0 | None | None | None | 2017-03-24 00:47:28 UTC
Red Hat Product Errata | RHEA-2018:1488 | 0 | None | None | None | 2018-05-15 17:42:35 UTC
oVirt gerrit | 76309 | 0 | master | MERGED | core: removing 'berserk' forceAddMac method | 2020-10-08 08:54:08 UTC
oVirt gerrit | 78132 | 0 | master | MERGED | core: fixes & simplifications in MoveMacs.java | 2020-10-08 08:54:08 UTC
oVirt gerrit | 78146 | 0 | master | MERGED | core: removal of addForce method from cluster and DC update. | 2020-10-08 08:54:08 UTC
oVirt gerrit | 78191 | 0 | master | MERGED | core: simplified parameters passed to VmInterfaceManager#add | 2020-10-08 08:54:09 UTC
oVirt gerrit | 78192 | 0 | master | MERGED | core: remove forceAddMac from VmInterfaceManager,ImportVmCommandBase | 2020-10-08 08:54:10 UTC
oVirt gerrit | 78193 | 0 | master | ABANDONED | core: add validation for duplicate macs when importing vm | 2020-10-08 08:54:10 UTC
oVirt gerrit | 80450 | 0 | master | MERGED | core: when initializing MacPool also register in it nics in snapshots | 2020-10-08 08:54:10 UTC
oVirt gerrit | 81477 | 0 | master | MERGED | core: removed duplicity in code | 2020-10-08 08:54:10 UTC
oVirt gerrit | 83415 | 0 | master | MERGED | packaging: Introduce helper to fix duplicate macs. | 2020-10-08 08:54:10 UTC
oVirt gerrit | 89524 | 0 | master | MERGED | examples: Introduce example to fix duplicate macs. | 2020-10-08 08:54:10 UTC
oVirt gerrit | 89766 | 0 | sdk_4.2 | MERGED | examples: Introduce example to fix duplicate macs. | 2020-10-08 08:54:10 UTC

Description Germano Veit Michel 2017-03-23 23:46:12 UTC
Description of problem:

We have a report that the following BZ was not fixed by its 4.0.7 clone:
BZ1400043 - [Vm Pool] VMs are created with duplicate MAC addresses

The first try with the new version (4.0.7) resulted in 12 VMs with duplicate MACs.
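
For anyone wanting to confirm a count like this from an inventory of vnics, here is a minimal sketch of a duplicate check. It is only an illustration: the `find_duplicate_macs` function and the (VM name, MAC) input format are assumptions, not part of the engine or of any oVirt tool.

```python
from collections import defaultdict


def find_duplicate_macs(nics):
    """Group VM names by MAC and return only MACs used by more than one vnic.

    `nics` is an iterable of (vm_name, mac) pairs (hypothetical input format).
    MACs are compared case-insensitively.
    """
    by_mac = defaultdict(list)
    for vm_name, mac in nics:
        by_mac[mac.lower()].append(vm_name)
    return {mac: vms for mac, vms in by_mac.items() if len(vms) > 1}
```

Feeding it the vnics of all VMs in the affected pools would list each shared MAC together with the VMs that carry it.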

Version-Release number of selected component (if applicable):
rhevm-4.0.7.4-0.1.el7ev.noarch

Comment 6 Germano Veit Michel 2017-03-24 05:44:22 UTC
Upgrade to 4.0.7 was from 4.0.6

Comment 7 Germano Veit Michel 2017-03-27 07:06:02 UTC
One interesting thing is that they have 5-6 VM Pools. Could this increase the probability of hitting the bug? It seems very easy to hit in that environment.

Comment 9 Alona Kaplan 2017-04-05 16:06:01 UTC
A possible reproduction of the bug -

-Make sure the MacPool used by the dc doesn't allow duplicates.

1. Create a template with one vnic ('tmp1').
2. Create VmPool ('pool') from 'tmp1' with 2 vms ('pool-1' and 'pool-2'). Set the number of prestarted vms as 2.
3. Wait for the vms to be up.
4. Unplug the nic from vm 'pool-1' (let's call its current mac address 'x'). Change its mac address (new mac 'y'). Plug it back.
5. Add a vnic to vm 'pool-2' and set its mac address to 'x' (the old mac address of the vnic we unplugged and plugged back).
6. Stop vm 'pool-1'.

Result - Both vms 'pool-1' and 'pool-2' have a vnic with mac 'x'.


Explanation of what causes the bug - when a vm that was started by the pool is stopped, the original snapshot (taken before the run) is restored. The macs of the vnics in the original snapshot are added to the mac pool using 'forceAdd', which means the pool ignores whether the mac is already in use.
So if a mac in the original snapshot was taken by another vm in the meantime, we end up with duplicate macs.
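
The scenario above can be modelled with a toy MAC pool. This is a minimal sketch, not the ovirt-engine implementation: the `MacPool` class, its `add`, `force_add`, `release` and `duplicates` methods, and the MAC values are all hypothetical names chosen for illustration. It assumes the pool keeps a usage count per MAC, so a forced add silently raises the count to 2.

```python
from collections import Counter


class MacPool:
    """Toy MAC pool that does not allow duplicates (hypothetical)."""

    def __init__(self):
        self._in_use = Counter()

    def add(self, mac):
        """Register a MAC; refuse if it is already in use."""
        if self._in_use[mac] > 0:
            raise ValueError(f"MAC {mac} is already in use")
        self._in_use[mac] += 1

    def force_add(self, mac):
        """Register a MAC without checking whether it is already in use."""
        self._in_use[mac] += 1

    def release(self, mac):
        """Return one use of a MAC to the pool."""
        if self._in_use[mac] > 0:
            self._in_use[mac] -= 1

    def duplicates(self):
        """MACs currently registered more than once."""
        return sorted(mac for mac, count in self._in_use.items() if count > 1)


def reproduce():
    x, y = "00:1a:4a:00:00:01", "00:1a:4a:00:00:02"  # placeholder addresses
    pool = MacPool()
    pool.add(x)        # pool-1 starts with MAC x (saved in its original snapshot)
    pool.release(x)    # step 4: pool-1's vnic is changed from x ...
    pool.add(y)        # ... to y
    pool.add(x)        # step 5: pool-2 gets a new vnic with the now-free MAC x
    pool.release(y)    # step 6: pool-1 stops, its current MAC is dropped ...
    pool.force_add(x)  # ... and the snapshot MAC x is restored unchecked
    return pool.duplicates()
```

Calling `reproduce()` returns a list containing MAC 'x', the address now held by both VMs. Replacing the final `force_add` with `add` would raise instead, which makes the conflict visible rather than hiding it.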

Comment 10 Germano Veit Michel 2017-04-24 01:44:15 UTC
Latest logs after a new test (with the snapshot related errors fixed) do not show the problem anymore.

I believe we are hitting the scenario Alona described, as the MAC Pool was close to being exhausted, so the chances of another VM taking the MAC of the original snapshot were quite high.

Comment 25 Michael Burman 2017-09-18 14:13:33 UTC
This failed QA; the result is the same as it was before the fix.
We don't reserve the original MAC address for a stateless VM, and because of that we can end up with duplicate MAC addresses when the stateless VM is shut down.
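
One way to picture the missing reservation: when deciding which MACs are taken, count the MACs stored in a stateless VM's original snapshot as well as the ones on its running vnics. The sketch below is an illustration under stated assumptions, not the engine code; the `used_macs` and `can_assign` functions and the "current"/"snapshot" data layout are hypothetical.

```python
def used_macs(vms, include_snapshots=True):
    """Collect every MAC that must be treated as taken.

    `vms` maps a VM name to a dict with two lists: "current" (MACs on the
    running configuration) and "snapshot" (MACs saved in the snapshot that
    is restored on shutdown). Both keys are hypothetical.
    """
    taken = set()
    for nics in vms.values():
        taken.update(nics["current"])
        if include_snapshots:
            taken.update(nics["snapshot"])
    return taken


def can_assign(mac, vms, include_snapshots=True):
    """True if `mac` is free to hand to a new vnic."""
    return mac not in used_macs(vms, include_snapshots)
```

With `vms = {"pool-1": {"current": ["y"], "snapshot": ["x"]}}`, `can_assign("x", vms, include_snapshots=False)` is true, which mirrors the failure described here, while the default call reports 'x' as taken.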

Comment 26 Dan Kenigsberg 2017-09-27 13:52:30 UTC
Rumor has it that currently merged code is fitting for QA.

Comment 27 Meni Yakove 2017-09-28 08:34:45 UTC
ovirt-engine-4.2.0-0.0.master.20170926175518.git0d20200.el7.centos.noarch

Comment 28 Michael Burman 2017-09-28 08:36:53 UTC
Verified on - 4.2.0-0.0.master.20170927183005.git49790b2.el7.centos

Comment 29 Michael Burman 2017-09-28 08:39:32 UTC
Summary and results:

Stateless scenarios - PASS
Stateful/snapshot scenarios - PASS
Regression - All new regression bugs caused by the fix for this report have been verified
Tier 2 - PASS
MAC pool per cluster - no regression in the feature

Comment 33 RHV bug bot 2017-12-06 16:16:40 UTC
INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Open patch attached]

For more info please contact: rhv-devops@redhat.com

Comment 34 RHV bug bot 2017-12-12 21:15:09 UTC
INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Open patch attached]

For more info please contact: rhv-devops@redhat.com

Comment 35 RHV bug bot 2017-12-18 17:05:33 UTC
INFO: Bug status (VERIFIED) wasn't changed but the following should be fixed:

[Open patch attached]

For more info please contact: rhv-devops@redhat.com

Comment 37 Sandro Bonazzola 2018-04-18 11:17:47 UTC
Dan, this bug is verified but has an open patch attached: either drop the attached patch or move this bug back to POST.

Comment 38 Dan Kenigsberg 2018-04-24 13:58:21 UTC
This BZ has been (ab)used for examples on how to work around it. They were merged 3 weeks ago, so it is not clear to me which patches you refer to.

Comment 41 errata-xmlrpc 2018-05-15 17:41:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:1488

Comment 42 Franta Kust 2019-05-16 13:05:04 UTC
BZ<2>Jira Resync

