Bug 1435485
| Summary: | [Pool] VMs are still created with duplicate MAC addresses after 4.0.7 upgrade | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Germano Veit Michel <gveitmic> |
| Component: | ovirt-engine | Assignee: | Dan Kenigsberg <danken> |
| Status: | CLOSED ERRATA | QA Contact: | Michael Burman <mburman> |
| Severity: | urgent | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.0.7 | CC: | alkaplan, apinnick, bkorren, danken, gveitmic, lsurette, mburman, mkalinin, mtessun, myakove, rbalakri, Rhev-m-bugs, rmcswain, srevivo, subhat, usurse, ykaul, ylavi |
| Target Milestone: | ovirt-4.2.0 | Keywords: | ZStream |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | Previously, when creating large pools of virtual machines, the Engine assigned duplicate MAC addresses, causing network problems. In the current release, the problem has been solved by removing a problematic legacy method. | Story Points: | --- |
| Clone Of: | | | |
| : | 1485688 (view as bug list) | Environment: | |
| Last Closed: | 2018-05-15 17:41:09 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Network | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1492723 | | |
| Bug Blocks: | 1485688 | | |
Description Germano Veit Michel 2017-03-23 23:46:12 UTC
The upgrade to 4.0.7 was from 4.0.6. One interesting thing is that they have 5-6 VM pools. Could this increase the probability of hitting the bug? It seems very easy to hit in that environment.

A possible reproduction of the bug:
- Precondition: make sure the MAC pool used by the data center does not allow duplicates.
1. Create a template ('tmp1') with one vNIC.
2. Create a VM pool ('pool') from 'tmp1' with 2 VMs ('pool-1' and 'pool-2'). Set the number of prestarted VMs to 2.
3. Wait for the VMs to be up.
4. Unplug the vNIC from VM 'pool-1' (let's call its current MAC address 'x'). Change its MAC address (new MAC 'y'). Plug it back.
5. Add a vNIC to VM 'pool-2' and set its MAC address to 'x' (the old MAC address of the vNIC we unplugged and re-plugged).
6. Stop VM 'pool-1'.
Result: both VMs 'pool-1' and 'pool-2' end up with a vNIC holding MAC 'x'. (A scripted sketch of steps 4-6 follows below.)
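For reference, a rough scripted version of steps 4-6 using the Python SDK (ovirtsdk4) might look like the sketch below. The connection details, VM names, and MAC addresses are placeholders, and the exact calls should be checked against the SDK version in use; this is an illustration of the manual steps, not a tested reproducer.

```python
# Rough sketch of steps 4-6 with ovirt-engine-sdk-python (ovirtsdk4).
# Connection details, names, and MAC addresses are placeholders.
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

conn = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='secret',
    ca_file='ca.pem',
)
vms_service = conn.system_service().vms_service()

# Step 4: unplug pool-1's vNIC, change its MAC from 'x' to 'y', plug it back.
pool1 = vms_service.list(search='name=pool-1')[0]
pool1_nics = vms_service.vm_service(pool1.id).nics_service()
nic = pool1_nics.list()[0]
nic_service = pool1_nics.nic_service(nic.id)
old_mac = nic.mac.address                          # MAC 'x'
nic_service.deactivate()                           # unplug
nic_service.update(types.Nic(mac=types.Mac(address='00:1a:4a:16:01:99')))  # MAC 'y'
nic_service.activate()                             # plug back

# Step 5: add a vNIC to pool-2 that reuses the old MAC 'x'
# (a vnic_profile may also be required, depending on the cluster setup).
pool2 = vms_service.list(search='name=pool-2')[0]
vms_service.vm_service(pool2.id).nics_service().add(
    types.Nic(name='nic2', mac=types.Mac(address=old_mac)),
)

# Step 6: stop pool-1; restoring its pre-run snapshot re-registers MAC 'x'.
vms_service.vm_service(pool1.id).stop()

conn.close()
```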
Explanation of what causes the bug: when stopping a VM that was started by the pool, the original snapshot (taken before the run) is restored. The MACs of the vNICs in the original snapshot are added to the MAC pool using 'forceAdd', which ignores whether the MAC is already in the pool. So if a MAC from the original snapshot has meanwhile been taken by another VM, we end up with duplicate MACs.
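To make the force-add behaviour concrete, here is a small toy model of a MAC pool. It is a simplified sketch, not the actual ovirt-engine MacPool code, and the method names only stand in for the 'forceAdd' path described above.

```python
from collections import Counter


class MacPool:
    """Toy model of a MAC pool that tracks how many vNICs hold each MAC."""

    def __init__(self, allow_duplicates=False):
        self.allow_duplicates = allow_duplicates
        self.usage = Counter()

    def add(self, mac):
        """Normal allocation path: rejects a MAC that is already taken."""
        if self.usage[mac] and not self.allow_duplicates:
            raise ValueError(f"MAC {mac} is already in use")
        self.usage[mac] += 1

    def force_add(self, mac):
        """Snapshot-restore path: registers the MAC without any check."""
        self.usage[mac] += 1


pool = MacPool(allow_duplicates=False)
pool.add("00:1a:4a:16:01:51")        # 'x' is taken by pool-2's new vNIC (step 5)
pool.force_add("00:1a:4a:16:01:51")  # stopping pool-1 restores its snapshot,
                                     # which re-registers 'x' unconditionally
print(pool.usage["00:1a:4a:16:01:51"])  # 2 -> two vNICs now share MAC 'x'
```

In this model the duplicate check only lives in add(), so any caller that goes through force_add() can silently double-book an address, which is the window the snapshot-restore path hits.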
Latest logs after a new test (with the snapshot-related errors fixed) do not show the problem anymore. I believe we are hitting the scenario Alona described: the MAC pool was close to being exhausted, so the chances of another VM taking the MAC of the original snapshot were quite high.

This failed QA; the result is the same as it was before the fix. We don't reserve the original MAC address for a stateless VM, so we can end up with duplicate MAC addresses when the stateless VM is shut down.

Rumor has it that the currently merged code is fit for QA: ovirt-engine-4.2.0-0.0.master.20170926175518.git0d20200.el7.centos.noarch

Verified on 4.2.0-0.0.master.20170927183005.git49790b2.el7.centos. Summary and results:
- Stateless scenarios - PASS
- Stateful/snapshot scenarios - PASS
- Regression - all new regression bugs caused by the fix for this report have been verified
- Tier 2 - PASS
- MAC pool per cluster - no regression in the feature

INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason: [Open patch attached] For more info please contact: rhv-devops

INFO: Bug status (VERIFIED) wasn't changed but the following should be fixed: [Open patch attached] For more info please contact: rhv-devops

Dan, this bug is verified but has an open patch attached: either drop the attached patch or move this bug back to POST.

This BZ has been (ab)used for examples of how to work around it. Those patches were merged 3 weeks ago, so it is not clear to me which patches you refer to.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2018:1488