Description of problem: We have a report that the following BZ was not fixed by its 4.0.7 clone: BZ1400043 - [Vm Pool] VMs are created with duplicate MAC addresses. The first try with the new version (4.0.7) resulted in 12 VMs with duplicate MACs. Version-Release number of selected component (if applicable): rhevm-4.0.7.4-0.1.el7ev.noarch
Upgrade to 4.0.7 was from 4.0.6
One interesting thing is that they have 5-6 VM pools. Could this increase the probability of hitting the bug? It seems very easy to hit in that environment.
A possible reproduction of the bug - Prerequisite: make sure the MAC pool used by the DC doesn't allow duplicates.
1. Create a template with one vNIC ('tmp1').
2. Create a VM pool ('pool') from 'tmp1' with 2 VMs ('pool-1' and 'pool-2'). Set the number of prestarted VMs to 2.
3. Wait for the VMs to be up.
4. Unplug the NIC from VM 'pool-1' (let's call its current MAC address 'x'). Change its MAC address (new MAC 'y'). Plug it back in.
5. Add a vNIC to VM 'pool-2' and set its MAC address to 'x' (the old MAC address of the vNIC we unplugged and plugged).
6. Stop VM 'pool-1'.
Result - both VMs 'pool-1' and 'pool-2' have a vNIC with MAC 'x'.
Explanation of what causes the bug - when stopping a VM that was started by the pool, the original snapshot (taken before the run) is restored. The MACs of the vNICs in the original snapshot are added to the MAC pool using 'forceAdd', which ignores whether the MAC is already in the pool. So if a MAC in the original snapshot has been taken by another VM, we end up with duplicate MACs.
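The 'forceAdd' behavior described above can be sketched with a toy Python model (hypothetical names throughout; the actual ovirt-engine code is Java, and `MacPool`, `add`, and `force_add` here are illustrative only):

```python
from collections import Counter

class MacPool:
    """Toy MAC pool: a normal add rejects duplicates, forceAdd does not."""

    def __init__(self):
        self.in_use = Counter()  # MAC -> number of vNICs holding it

    def add(self, mac):
        # Normal allocation path: refuse a MAC that is already taken.
        if self.in_use[mac]:
            raise ValueError(f"MAC {mac} already in use")
        self.in_use[mac] += 1

    def force_add(self, mac):
        # Snapshot-restore path: registers the MAC unconditionally,
        # even if another vNIC already holds it.
        self.in_use[mac] += 1


pool = MacPool()
pool.add("x")        # step 5: 'pool-2' now holds MAC 'x'
pool.force_add("x")  # step 6: restoring 'pool-1's snapshot re-adds 'x'
# Two vNICs now hold MAC 'x' -> the duplicate seen in the report.
```

With the normal `add` the second allocation would raise; `force_add` silently leaves the same MAC counted twice, which is the duplicate-MAC state the reproduction steps end in.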
Latest logs after a new test (with the snapshot-related errors fixed) no longer show the problem. I believe we were hitting the scenario Alona described, as the MAC pool was close to being exhausted, so the chances of another VM taking the MAC from the original snapshot were quite high.
This failed QA; the result is the same as it was before the fix. We don't reserve the origin MAC address for a stateless VM, and because of that we can end up with duplicate MAC addresses when the stateless VM is shut down.
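A minimal sketch of the fix direction implied here, assuming a hypothetical claim/release API (the real engine code is Java; the MAC values are made up): while the stateless VM runs with a changed MAC, its origin MAC must stay reserved so no other VM can take it before the pre-run snapshot is restored.

```python
class MacPool:
    """Toy pool: a reserved MAC cannot be claimed again until released."""

    def __init__(self):
        self.reserved = set()

    def claim(self, mac):
        # Returns False instead of handing out an already-reserved MAC.
        if mac in self.reserved:
            return False
        self.reserved.add(mac)
        return True

    def release(self, mac):
        self.reserved.discard(mac)


pool = MacPool()
origin_mac = "00:1a:4a:16:01:51"   # hypothetical address
pool.claim(origin_mac)             # the stateless VM's vNIC

# The VM starts stateless and its vNIC gets a different runtime MAC.
runtime_mac = "00:1a:4a:16:01:52"  # hypothetical address
pool.claim(runtime_mac)

# Fixed behavior: origin_mac is NOT released during the run, so another
# VM asking for it is refused.
assert pool.claim(origin_mac) is False

# Shutdown: the pre-run snapshot is restored without a conflict.
pool.release(runtime_mac)
```

The buggy behavior corresponds to releasing `origin_mac` at VM start: another VM could then claim it, and the snapshot restore at shutdown would force it back in, producing the duplicate.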
Rumor has it that the currently merged code is ready for QA.
ovirt-engine-4.2.0-0.0.master.20170926175518.git0d20200.el7.centos.noarch
Verified on - 4.2.0-0.0.master.20170927183005.git49790b2.el7.centos
Summary and results:
Stateless scenarios - PASS
Stateful/snapshot scenarios - PASS
Regression - all new regression bugs caused by the fix for this report have been verified
Tier 2 - PASS
MAC pool per cluster - no regression in the feature
INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason: [Open patch attached] For more info please contact: rhv-devops
INFO: Bug status (VERIFIED) wasn't changed but the following should be fixed: [Open patch attached] For more info please contact: rhv-devops
Dan, this bug is verified but has an open patch attached: either drop the attached patch or move this bug back to POST.
This BZ has been (ab)used for examples of how to work around it. They were merged 3 weeks ago, so it is not clear to me which patches you are referring to.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:1488
Jira Resync