Bug 1492577
| Summary: | overlap between MAC pools causes annoying "mac already in use" errors | | |
|---|---|---|---|
| Product: | [oVirt] ovirt-engine | Reporter: | Michael Burman <mburman> |
| Component: | BLL.Network | Assignee: | eraviv |
| Status: | CLOSED DUPLICATE | QA Contact: | Meni Yakove <myakove> |
| Severity: | medium | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.2.0 | CC: | bpelled, bugs, danken, dholler, edwardh, mburman, ylavi |
| Target Milestone: | ovirt-4.3.0 | Flags: | ylavi: ovirt-4.3+, rule-engine: blocker+ |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-08-08 08:49:58 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Network | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1537414 | | |
Description: Michael Burman, 2017-09-18 08:36:02 UTC
Correction -

- Create 2 clusters in the DC, CL1 and CL2.
- Create a custom MAC pool called 'mac1' with the range 00:00:00:00:00:20-00:00:00:00:00:21 and set it for both clusters.
- Create 2 VMs, each VM in a different cluster: VM1 in CL1 and VM2 in CL2.
- Add a vNIC to each VM: VM1 got 00:00:00:00:00:20 and VM2 got 00:00:00:00:00:21.
- Try to add another vNIC to VM1 in CL1 - failed: 00:00:00:00:00:21 MAC address is already in use.
- Try to add another vNIC to VM2 in CL2 - failed: 00:00:00:00:00:20 MAC address is already in use.

---

(In reply to Michael Burman from comment #1)
> Correction -
>
> - Create 2 clusters in the DC, CL1 and CL2
> - Create custom MAC pool range called 'mac1' with range
> 00:00:00:00:00:20-00:00:00:00:00:21 and set for both clusters.

Are you using a single MAC pool for both clusters? If so, what you describe is the expected behavior. Please check what happens when you create TWO MAC pools (with overlapping ranges), attaching each to its own cluster.

---

(In reply to Dan Kenigsberg from comment #2)
> (In reply to Michael Burman from comment #1)
> > Correction -
> >
> > - Create 2 clusters in the DC, CL1 and CL2
> > - Create custom MAC pool range called 'mac1' with range
> > 00:00:00:00:00:20-00:00:00:00:00:21 and set for both clusters.
>
> Are you using a single mac pool for both clusters? if so, what you describe
> is the expected behavior. Please check what happens when you create TWO mac
> pools (with overlapping ranges), attaching each to its own cluster.

What do you mean, it's the expected behavior?? Then what is the meaning of this feature exactly? This is NOT the expected behavior. And creating TWO MAC pools (with overlapping ranges), attaching each to its own cluster, gives the same result: the engine tries to re-assign the same MAC addresses and fails with "MAC already in use".

---

We did not read the description correctly. Two MAC pools with overlapping ranges do not provide any guarantee. However, one MAC pool used by two clusters should not return the same MAC twice, even if duplicates are allowed. This is a problem we have to investigate.

---

TL;DR:

1. The UI is severely broken; it does not show reality consistent with the DB and presents nonsense. Debug this bug using the DB. See details below if you want.
2. Burman is somewhat correct in his claims, however it has nothing to do with MAC pools. Jump to the paragraph marked with (*******) for details.
3. Please ask the testers who test the VM tab to re-verify my claims about the broken UI.

During an extended period of testing, I did not find anything wrong with MAC pools. I did find a lot of errors in the UI, though. And I did find one MAYBE unwanted behavior, I don't know. Let me define, to the best of my understanding, how the system should behave:

- One MAC pool, used by 2 clusters: the pool behaves like a fridge; it is shared by all members, and whoever comes first takes an item. Therefore, for 2 VMs, each from a different cluster but both using the same MAC pool, no two VMs should get the same MAC.
- Two MAC pools, used by 2 clusters: just as before, only now we have 2 identical fridges. Therefore, we should see the same allocations, in the same order, in both clusters. If VM1 got MAC A, then VM2 in the other cluster, accessing a different but completely identical MAC pool, should also get MAC A.

As mentioned above, I did not see any wrongdoing by the MAC pools.

(*******) When we test the scenario with 2 different but identically configured MAC pools, I expected that there would be duplicate allocations; not within one MAC pool, but across multiple pools. This is blocked by a validation (see the sketch right below).
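[Editorial note] A minimal sketch of the behavior described in the (*******) paragraph, assuming two independently allocating pools over the same range; this is not oVirt code, and all class and method names are made up for illustration. Each pool hands out the same sequence, so the first allocation from each pool collides, and a system-wide "no two plugged NICs with the same MAC" check then rejects the second one, which matches the error reported above.

```java
import java.util.HashSet;
import java.util.Set;

public class MacPoolSketch {

    // Hypothetical sequential MAC pool over a single closed range.
    static class MacPool {
        private long next;
        private final long last;

        MacPool(long firstMac, long lastMac) {
            this.next = firstMac;
            this.last = lastMac;
        }

        String allocate() {
            if (next > last) {
                throw new IllegalStateException("pool exhausted");
            }
            long mac = next++;
            // Format the 48-bit value as a colon-separated MAC string.
            return String.format("%012x", mac).replaceAll("(..)(?!$)", "$1:");
        }
    }

    public static void main(String[] args) {
        long first = 0x000000000020L, last = 0x000000000021L;
        MacPool poolCl1 = new MacPool(first, last);   // pool attached to CL1
        MacPool poolCl2 = new MacPool(first, last);   // identical pool attached to CL2

        // Stand-in for the system-wide plugged-NIC uniqueness validation.
        Set<String> pluggedMacs = new HashSet<>();

        String vm1Nic = poolCl1.allocate();           // 00:00:00:00:00:20
        pluggedMacs.add(vm1Nic);

        String vm2Nic = poolCl2.allocate();           // also 00:00:00:00:00:20
        if (!pluggedMacs.add(vm2Nic)) {
            System.out.println(vm2Nic + " MAC address is already in use");
        }
    }
}
```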
We have a validation constraining that there cannot be, regardless of MAC pools and of who is using the MAC, two or more plugged NICs with the same MAC. It has nothing to do with MAC pools, though. Please decide on this behavior; the relevant code is in:
org.ovirt.engine.core.bll.network.VmInterfaceManager#existsPluggedInterfaceWithSameMac

Now about the UI:

- When I'm creating a new VM, at the very bottom there is NIC creation. If I select the ovirt-mgmt network and then set it back to blank, I end up with a new VM WITH a NIC.
- When I do not touch the NIC combo box at all, the result is randomized. Sometimes the system always creates a NIC, another day it never creates one (as it should, at least as I'd expect).
- If you happen to use the UI while it behaves as "creating VM without NIC": create 2 VMs, each without any NIC. Then select one VM and add one NIC to it. Go to the DB and verify it: 1 NIC. Go back to the UI and check both VMs' NICs: based on what I see now, both of them have 1 NIC. Meaning, the UI is not displaying reality. Even more: I created 2 VMs, each with a multitude of NICs. Random subsets of the created NICs were reported for each VM after a very short period of testing. I deleted both VMs and created empty VMs with the same names, as described. There were 0 NICs in the DB, but both VMs reported 2 NICs. A page refresh fixed that; one NIC addition broke the UI again.

--> These are just things I see when using the UI. I did not find the actual cause, and I wasn't searching for it, as this is not part of this bug. Individual claims need not be true; something else may cause these failures. For me it was sufficient to verify that I cannot trust what the UI is presenting.

- Unrelated error: multi-selection of VMs is not possible without opening the VM detail. After the detail is opened while trying to multi-select, remove does not remove the individual VM, which it should since we're in the VM detail, but all selected VMs.

---

Dominik, it seems to me that existsPluggedInterfaceWithSameMac is an arcane system-wide validation, defying the purpose of MAC pools, but you have (relatively recently) used it when fixing bug 1404130. Could you see if it can be removed?

---

Unfortunately, this issue has a long story. In the beginning, there was https://bugzilla.redhat.com/show_bug.cgi?id=873338 , which is the reason existsPluggedInterfaceWithSameMac was introduced. But then there was https://bugzilla.redhat.com/show_bug.cgi?id=1212461 , which was fixed by https://gerrit.ovirt.org/#/c/40052/ . That fix introduced https://bugzilla.redhat.com/show_bug.cgi?id=1266172 , so https://gerrit.ovirt.org/#/c/40052/ was reverted by https://gerrit.ovirt.org/#/c/46704/ . Later, an incarnation of the initial bug popped up as https://bugzilla.redhat.com/show_bug.cgi?id=1404130 . I have to dive deeper into the relation between snapshots and MAC pools to decide how this issue can be fixed.

---

Moving to Leon, since his new test suite makes it possible to write proper test coverage, making sure we do not reintroduce a known bug again.

---

Let's get back to the requirements and design stage: why would anyone want to have overlapping ranges between MAC pools? MAC addresses are, by definition, unique across all domains (VLANs, networks). Opening this up here makes no sense and requires a really good reason. Overlapping ranges should be blocked (see the overlap-check sketch at the end of this report).

---

Doesn't exist in 4.1, only 4.2

---

This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.
(In reply to Michael Burman from comment #13)
> Doesn't exist in 4.1, only 4.2

Alona managed to reproduce on 4.1, removing the regression flag.

---

*** This bug has been marked as a duplicate of bug 1593800 ***
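[Editorial note] To illustrate the "overlapping ranges should be blocked" requirement raised above, here is a minimal sketch of the kind of check that proposal implies; this is a hypothetical helper, not the engine's actual validation, and the names are made up. It treats each MAC pool range as a closed interval of 48-bit values and flags any pair of intervals that intersect.

```java
public final class MacRangeOverlap {

    // Parse a colon-separated MAC string into its 48-bit numeric value.
    static long macToLong(String mac) {
        return Long.parseLong(mac.replace(":", ""), 16);
    }

    /** Closed interval [from, to] of MAC addresses. */
    record MacRange(long from, long to) {
        static MacRange of(String from, String to) {
            return new MacRange(macToLong(from), macToLong(to));
        }

        boolean overlaps(MacRange other) {
            // Two closed intervals intersect iff each starts before the other ends.
            return this.from <= other.to && other.from <= this.to;
        }
    }

    public static void main(String[] args) {
        MacRange pool1 = MacRange.of("00:00:00:00:00:20", "00:00:00:00:00:21");
        MacRange pool2 = MacRange.of("00:00:00:00:00:21", "00:00:00:00:00:30");

        if (pool1.overlaps(pool2)) {
            // A validation like this could refuse to save the second pool.
            System.out.println("MAC pool ranges overlap - rejecting configuration");
        }
    }
}
```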