Bug 1534565 - [OVN] - Engine sometimes doesn't update the provider that the port has changed
Summary: [OVN] - Engine sometimes doesn't update the provider that the port has changed
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Network
Version: 4.2.1.1
Hardware: x86_64
OS: Linux
Priority: low
Severity: high
Target Milestone: ovirt-4.3.0
Target Release: ---
Assignee: Marcin Mirecki
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-01-15 13:39 UTC by Michael Burman
Modified: 2018-12-12 10:44 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-12-12 10:42:09 UTC
oVirt Team: Network
Embargoed:
rule-engine: ovirt-4.3+


Attachments
engine log (574.06 KB, application/x-gzip)
2018-01-24 14:59 UTC, Michael Burman

Description Michael Burman 2018-01-15 13:39:25 UTC
Description of problem:
[OVN] - Engine sometimes doesn't update the provider that the port has changed.

If a VM is started with an OVN network, and we then create a new OVN network and update the vNIC to use the new network, then on the next run the VM gets its IP from the first subnet and not the new one.
The port wasn't updated on the northdb host.

Version-Release number of selected component (if applicable):
4.2.1.1-0.1.el7

How reproducible:
I still don't understand how to reproduce it, but the bug is definitely there.
I have seen it multiple times, but I can't work out how to recreate the issue.
danken and I just saw it together, so I'm reporting the bug; I will try to work out how to reproduce it.

Comment 1 Dan Kenigsberg 2018-01-18 19:56:57 UTC
The severity of this is very high; a VM may end up connected to something other than what the user has intended.

Yet we cannot proceed without reproduction. Please try messing with vNICs, external networks, and their interconnections until you find it. Maybe you can even dig up old logs from the system where it manifested itself lately.

Comment 2 Michael Burman 2018-01-21 06:47:29 UTC
(In reply to Dan Kenigsberg from comment #1)
> The severity of this is very high; a VM may end up connected to something
> other than what the user has intended.
> 
> Yet we cannot proceed without reproduction. Please try messing with vNICs,
> external networks, and their interconnections until you find it. Maybe you
> can even dig up old logs from the system where it manifested itself lately.

Although the severity is high, we don't know how to reproduce it.
I have already tried. I will update once I have any news about this report.

Comment 3 Michael Burman 2018-01-24 14:27:53 UTC
Dan,

I think I managed to reproduce this issue with an OVN localnet (physnet) network. I am almost sure it is the same issue, as it does not happen with regular networks and it reproduces 100% of the time with an OVN localnet.

The flow is:

1) Create a new physnet network (data center network) with VLAN 162
2) Attach the physnet network to the host
3) Create a new OVN network, choose "create on external provider", and choose the data center network (physnet network) from step 1, without a subnet
4) Run a VM with an OVN network vNIC - the VM got an IP from VLAN 162
5) Shut down the VM
6) Edit the physnet network to use the new VLAN tag 163 - all changes were applied successfully on the host
7) Start the VM

Result - The VM got an IP from VLAN 162 and not VLAN 163, so it smells exactly like this bug.

* NOTE - The exact same flow with a regular VLAN network (no OVN involved) works as expected.
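
To make the suspected mechanism in the flow above concrete, here is a minimal toy model in Python. It is only a sketch under a loud assumption: that the provider-side network stores its own copy of the VLAN at creation time and is never told about later edits. All class and function names are hypothetical and are not taken from ovirt-engine or ovirt-provider-ovn.

```python
from dataclasses import dataclass


@dataclass
class PhysnetNetwork:
    """Hypothetical stand-in for a data center (physnet) network."""
    name: str
    vlan: int


@dataclass
class OvnExternalNetwork:
    """Hypothetical stand-in for an OVN network on the external provider."""
    name: str
    physnet_name: str
    localnet_vlan: int  # ASSUMPTION: copied once, when the network is created


def create_on_provider(name: str, physnet: PhysnetNetwork) -> OvnExternalNetwork:
    # Step 3: the provider-side network records the physnet VLAN at creation.
    return OvnExternalNetwork(name, physnet.name, physnet.vlan)


def edit_physnet_vlan(physnet: PhysnetNetwork, new_vlan: int) -> None:
    # Step 6: only the data center network changes; in this model nothing
    # propagates the new tag to the provider-side copy.
    physnet.vlan = new_vlan


physnet = PhysnetNetwork("physnet_net", vlan=162)
ovn_net = create_on_provider("ovn_localnet", physnet)
edit_physnet_vlan(physnet, 163)

print(physnet.vlan)           # 163
print(ovn_net.localnet_vlan)  # 162 (stale copy)
```

If this assumption holds, a VM started in step 7 would still be attached through the stale VLAN 162, which matches the observed result.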

Comment 4 Michael Burman 2018-01-24 14:59:26 UTC
Created attachment 1385629 [details]
engine log

Comment 5 Michael Burman 2018-01-25 15:27:03 UTC
Note that after getting into this situation, it is not possible to delete the network via the engine + provider:

2018-01-25 17:25:05,713 root Unable to delete network 86ff0d4b-2f25-4979-ab5c-bfbe8837482c. Ports exist for the network
Traceback (most recent call last):
  File "/usr/share/ovirt-provider-ovn/handlers/base_handler.py", line 131, in _handle_request
    method, path_parts, content)
  File "/usr/share/ovirt-provider-ovn/handlers/selecting_handler.py", line 175, in handle_request
    return self.call_response_handler(handler, content, parameters)
  File "/usr/share/ovirt-provider-ovn/handlers/neutron.py", line 36, in call_response_handler
    return response_handler(ovn_north, content, parameters)
  File "/usr/share/ovirt-provider-ovn/handlers/neutron_responses.py", line 117, in delete_network
    nb_db.delete_network(parameters[NETWORK_ID])
  File "/usr/share/ovirt-provider-ovn/ovndb/ovn_north.py", line 189, in delete_network
    % network_id
RestDataError

Comment 6 Dan Kenigsberg 2018-04-29 15:40:10 UTC
Comment 3: changing the VLAN of a physnet does not update the VLAN of the external networks defined on top of it. That's a known issue that may deserve its own clear bug.

Comment 5: I think it's OK that you cannot delete the network - you should first remove the vNIC that uses it.

If the only issue you see here is that of comment 3, you can rename this bug to cover it alone.
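
The ordering constraint discussed for comment 5 (a network cannot be deleted while ports still exist on it) can be illustrated with a small in-memory sketch. This is not the ovirt-provider-ovn code: `ToyNorthDb` and its methods are hypothetical, and `RestDataError` here is a local stand-in that only borrows the name seen in the traceback.

```python
class RestDataError(Exception):
    """Local stand-in named after the exception in the traceback."""


class ToyNorthDb:
    """Hypothetical in-memory model of networks and their ports."""

    def __init__(self):
        self.networks = {}  # network_id -> set of port ids

    def add_network(self, network_id):
        self.networks[network_id] = set()

    def add_port(self, network_id, port_id):
        self.networks[network_id].add(port_id)

    def delete_port(self, network_id, port_id):
        self.networks[network_id].discard(port_id)

    def delete_network(self, network_id):
        # Refuse to delete while any port (for example a vNIC) still exists.
        if self.networks[network_id]:
            raise RestDataError(
                "Unable to delete network %s. Ports exist for the network"
                % network_id
            )
        del self.networks[network_id]


nb_db = ToyNorthDb()
nb_db.add_network("net-1")
nb_db.add_port("net-1", "vnic-port-1")

try:
    nb_db.delete_network("net-1")
    first_attempt_failed = False
except RestDataError:
    first_attempt_failed = True

# Removing the port first lets the delete go through.
nb_db.delete_port("net-1", "vnic-port-1")
nb_db.delete_network("net-1")
```

In this model the first delete is rejected and the second succeeds, which is the behaviour described as acceptable above.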

Comment 7 Michael Burman 2018-04-30 14:27:35 UTC
(In reply to Dan Kenigsberg from comment #6)
> Comment 3: changing the vlan of a physnet does not update the vlan of the
> external networks defined on top of it. that's a known issue, that may
> deserve a clear bug
> 
> Comment 5: I think it's ok that you cannot delete the network - you should
> first remove the vNIC that uses it.
> 
> if the only issue you see here is that of comment 3, you can rename this bug
> to cover it alone.

Comment 3 is still relevant indeed; do you want a new bug? I'm not sure it is exactly the original report. It's only one aspect of it, but we can't reproduce the original issue.

Comment 5: I can't reproduce it; I guess it's another bug hiding somewhere on the OVN side.

Comment 8 Dan Kenigsberg 2018-04-30 14:59:09 UTC
Yes, I think a new bug with a clear subject about comment 3 would help users. We can keep this bug for the other mystery occasion where Engine does not update the provider; we may eventually close it until we have steps for reproduction.

Comment 9 Michael Burman 2018-05-01 06:21:47 UTC
(In reply to Dan Kenigsberg from comment #8)
> yes, I think a new bug with a clear subject about comment 3 would help
> users. We can keep this bug for the other mystery occasion where Engine does
> not update the provider; we may eventually close it, until we have steps for
> reproduction.

ACK

This is the physnet vlan bug - BZ 1573408

Comment 10 Dan Kenigsberg 2018-12-12 10:42:09 UTC
Let us reopen this if it reproduces in a clear fashion.

