Bug 1561081 - [downstream clone - 4.1.11] MacPool fails to initialize when it contains duplicates and user disallows duplicates
Summary: [downstream clone - 4.1.11] MacPool fails to initialize when it contains dupl...
Keywords:
Status: CLOSED DUPLICATE of bug 1561080
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.1.9
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ovirt-4.1.11
Target Release: ---
Assignee: eraviv
QA Contact: Meni Yakove
URL:
Whiteboard:
Depends On: 1554180 1561865
Blocks:
Reported: 2018-03-27 15:07 UTC by RHV bug bot
Modified: 2021-06-10 15:33 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1554180
Environment:
Last Closed: 2018-03-27 15:14:59 UTC
oVirt Team: Network
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 3377611 | 0 | None | None | None | 2018-03-27 15:08:45 UTC
oVirt gerrit | 89138 | 0 | master | MERGED | engine: unset 'allow duplicates' on mac pool - validate | 2018-03-27 15:08:45 UTC
oVirt gerrit | 89439 | 0 | master | MERGED | core: unset 'allow duplicates' on mac pool - rollback | 2018-03-28 11:12:20 UTC

Description RHV bug bot 2018-03-27 15:07:59 UTC
+++ This bug is a downstream clone. The original bug is: +++
+++   bug 1554180 +++
======================================================================

Description of problem:

When a MacPool contains duplicate MACs and the user attempts to uncheck the "Allow Duplicates" option for the pool, the update fails with no clear indication in the GUI of why it failed. Even worse, the entire engine then starts misbehaving, because the MacPool is not reinitialized after it failed to initialize with allow duplicates disabled.

So: the engine is running fine and the MacPool does not allow duplicates. Then the user configures it to allow duplicates and adds a VM with a duplicate MAC. All fine.

Now the user tries to disable the allow duplicates option, and everything goes into limbo.

1) To start with, this is the message on the UI the user gets when attempting to disable the allow duplicate mac option:

"Error while executing action UpdateMacPool: Internal Engine Error"

In the engine logs we get this, indicating there are duplicate MACs. Why is the user not notified about this in the UI?

2018-03-12 11:21:57,383+10 ERROR [org.ovirt.engine.core.bll.UpdateMacPoolCommand] (default task-4) [b7b3a181-659b-411e-9da6-4b7ba16802e8] Command 'org.ovirt.engine.core.bll.UpdateMacPoolCommand' failed: EngineException: Unable to initialize MAC pool due to existing duplicates (Failed with error MAC_POOL_INITIALIZATION_FAILED and code 5010)

But here comes the worst part:

2) All sorts of operations start to fail, as the MacPool did not initialize again after the above and the mac pool cannot be found. AddVm, RemoveVm, AddNic, RemoveNic... all fail:

2018-03-12 11:36:31,896+10 ERROR [org.ovirt.engine.core.bll.RemoveVmCommand] (org.ovirt.thread.pool-6-thread-15) [97376231-9e23-4ca7-b349-e7d6dcdb7725] Exception: javax.ejb.EJBTransactionRolledbackException: Pool for id="0000002f-002f-002f-002f-000000000108" does not exist

3) And things get ugly: even if the user identifies a duplicate MAC, removing a VM/NIC now fails with confusing, unrelated messages. See what is logged when a VM fails to remove due to the missing MacPool:

2018-03-12 11:36:31,896+10 ERROR [org.ovirt.engine.core.bll.RemoveVmCommand] (org.ovirt.thread.pool-6-thread-15) [97376231-9e23-4ca7-b349-e7d6dcdb7725] Exception: javax.ejb.EJBTransactionRolledbackException: Pool for id="0000002f-002f-002f-002f-000000000108" does not exist

2018-03-12 11:36:31,905+10 INFO  [org.ovirt.engine.core.bll.RemoveVmCommand] (org.ovirt.thread.pool-6-thread-15) [97376231-9e23-4ca7-b349-e7d6dcdb7725] Command [id=c941804f-510e-4e4d-974a-c731a5c258de]: Compensating CHANGED_STATUS_ONLY of org.ovirt.engine.core.common.businessentities.VmDynamic; snapshot: EntityStatusSnapshot:{id='61537e85-1363-4ce6-b949-d57d9a886b49', status='Down'}.

2018-03-12 11:36:31,933+10 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-15) [97376231-9e23-4ca7-b349-e7d6dcdb7725] EVENT_ID: USER_REMOVE_VM_FINISHED_WITH_ILLEGAL_DISKS(172), Correlation ID: 97376231-9e23-4ca7-b349-e7d6dcdb7725, Job ID: 1a6c79a0-c1a3-4eaf-92fb-c09d579464da, Call Stack: null, Custom ID: null, Custom Event ID: -1, Message: VM germano-test5 has been removed, but the following disks could not be removed: <UNKNOWN>. These disks will appear in the main disks tab in illegal state, please remove manually when possible.

4) Even editing the MAC Pool fails with the same error as above (Pool for id does not exist)

The average user will be completely lost on what is going on.

Version-Release number of selected component (if applicable):
ovirt-engine-4.1.9.2-0.1.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Enable duplicate MACs on the Pool
2. Create some duplicate MAC
3. Disable duplicate MACs on the Pool.
4. Try any flow that uses the MacPool.

Actual results:
MacPool fails to initialize and the user gets no hint about what is going on. The engine misbehaves with no clear indication of the cause.

Expected results:
- If there are duplicate MACs, MacPool must initialize again after disabling duplicate macs, otherwise the whole engine is broken.
- Adequate message displayed to the user when disabling duplicate MACs for a Pool that has duplicate MACs.

(Originally by Germano Veit Michel)
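
The expected results above, together with the two linked gerrit change titles ("validate" and "rollback"), suggest a validate-first update that leaves the old pool in place on failure. Below is a minimal sketch of that idea, not the engine's actual code. `MacPool`, `DuplicateMacsError`, `update_pool_unsafe`, `update_pool_safe` and the dict-based `registry` are all hypothetical names, and the "unsafe" variant is only an assumed model of the reported failure mode.

```python
from collections import Counter


class DuplicateMacsError(Exception):
    """Raised when duplicates are disallowed but the pool already holds some."""


class MacPool:
    # Hypothetical stand-in for the engine's MAC pool, not its real class.
    def __init__(self, macs, allow_duplicates):
        counts = Counter(macs)
        duplicates = sorted(mac for mac, n in counts.items() if n > 1)
        if duplicates and not allow_duplicates:
            # Name the offending MACs so the user can act on the message.
            raise DuplicateMacsError(
                "Cannot disallow duplicates; MACs in use more than once: "
                + ", ".join(duplicates)
            )
        self.counts = counts
        self.allow_duplicates = allow_duplicates


def update_pool_unsafe(registry, pool_id, macs, allow_duplicates):
    # Assumed model of the reported failure: the old pool is dropped first,
    # so a failed initialization leaves nothing registered under pool_id.
    del registry[pool_id]
    registry[pool_id] = MacPool(macs, allow_duplicates)


def update_pool_safe(registry, pool_id, macs, allow_duplicates):
    # Build (and thereby validate) the replacement before touching the
    # registry; if construction raises, the old pool stays in place.
    replacement = MacPool(macs, allow_duplicates)
    registry[pool_id] = replacement
```

With `update_pool_safe`, a rejected attempt to disable duplicates raises an error that lists the duplicate MACs, and later lookups of the pool id still succeed. With `update_pool_unsafe`, the same attempt leaves the registry without the pool, which mirrors the "Pool for id ... does not exist" symptom.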

Comment 1 RHV bug bot 2018-03-27 15:08:07 UTC
Just to document here the way out of this:

1) restart the engine; thankfully, allow duplicate MACs is still enabled in the DB, so the MacPool will initialize fine on restart.

2) remove the duplicate MACs
To find them:
/usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "select vm_static.vm_name,mac_addr,creation_date from vm_interface,vm_static where vm_interface.vm_guid = vm_static.vm_guid and mac_addr in (select mac_addr from vm_interface group by mac_addr having (count(*) >1)) order by mac_addr;"

3) uncheck allow duplicates

(Originally by Germano Veit Michel)
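
For readers who want to see what the query in step 2 returns without touching an engine database, here is a small sketch that runs it against a toy in-memory SQLite database. The query text is the one quoted above with only whitespace changed. Everything else is an assumption for illustration: the two-table schema is drastically reduced (including the placement of `creation_date` in `vm_static`), the sample rows are invented, and `build_toy_db` and `find_duplicate_macs` are hypothetical helper names. The real engine database is reached through engine-psql.sh as shown in step 2.

```python
import sqlite3

# The query from step 2, reformatted; only the whitespace differs.
DUPLICATE_MACS_QUERY = """
select vm_static.vm_name, mac_addr, creation_date
from vm_interface, vm_static
where vm_interface.vm_guid = vm_static.vm_guid
  and mac_addr in (
      select mac_addr from vm_interface
      group by mac_addr having (count(*) > 1))
order by mac_addr;
"""


def build_toy_db():
    # Toy schema and invented rows: just enough columns for the query to run.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table vm_static (vm_guid text primary key,
                                vm_name text,
                                creation_date text);
        create table vm_interface (vm_guid text, mac_addr text);
        insert into vm_static values
            ('g1', 'vm-a', '2018-03-01'),
            ('g2', 'vm-b', '2018-03-02'),
            ('g3', 'vm-c', '2018-03-03');
        insert into vm_interface values
            ('g1', '00:1a:4a:16:01:51'),
            ('g2', '00:1a:4a:16:01:51'),
            ('g3', '00:1a:4a:16:01:52');
    """)
    return conn


def find_duplicate_macs(conn):
    # Returns one (vm_name, mac_addr, creation_date) row per NIC whose MAC
    # is used by more than one NIC.
    return conn.execute(DUPLICATE_MACS_QUERY).fetchall()


if __name__ == "__main__":
    for row in find_duplicate_macs(build_toy_db()):
        print(row)
```

In the toy data, vm-a and vm-b share a MAC and are reported, while vm-c has a unique MAC and is not.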

Comment 4 RHV bug bot 2018-03-27 15:08:22 UTC
(In reply to Germano Veit Michel from comment #1)
> Just to document here the way out of this:
> 
> 1) reinitialize engine (thankfully allow duplicate macs is still enabled in
> the DB), so macpool will initialize fine if we restart the engine.
> 
> 2) remove duplicate mac
> To find them:
> /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "select
> vm_static.vm_name,mac_addr,creation_date from vm_interface,vm_static where
> vm_interface.vm_guid = vm_static.vm_guid and mac_addr in (select mac_addr
> from vm_interface group by mac_addr having (count(*) >1)) order by mac_addr;"

Can you contribute this script? 


> 
> 3) uncheck allow duplicates

(Originally by Yaniv Kaul)

Comment 5 RHV bug bot 2018-03-27 15:08:27 UTC
(In reply to Yaniv Kaul from comment #3)
> Can you contribute this script?

This was already contributed in a much better form here as a tool to detect and fix duplicate macs: https://gerrit.ovirt.org/#/c/83415/
Looks like it may ship with 4.3 if it gets merged someday.

Or do you have something else in mind where such logic would be useful?

(Originally by Germano Veit Michel)

Comment 6 RHV bug bot 2018-03-27 15:08:31 UTC
(In reply to Germano Veit Michel from comment #4)
> This was already contributed in a much better form here as a tool to detect
> and fix duplicate macs: https://gerrit.ovirt.org/#/c/83415/
> Looks like it may ship with 4.3 if it gets merged someday.

We can merge and ship your script even earlier, but it needs to be Verified+1, CI+1 (and make sure it is really shipped in ovirt-engine.rpm)

(Originally by danken)

Comment 8 Dan Kenigsberg 2018-03-27 15:14:59 UTC
Oops, I ran the clone job twice.

*** This bug has been marked as a duplicate of bug 1561080 ***

Comment 9 Franta Kust 2019-05-16 13:05:15 UTC
BZ<2>Jira Resync

