Description of problem:
[OVN] - Engine sometimes doesn't update the provider when a port has changed. If we start a VM with an OVN network, then create a new OVN network and update the vNIC to use the new network, then on the next VM run the VM gets its IP from the first subnet and not from the new one. The port wasn't updated on the northdb host.

Version-Release number of selected component (if applicable):
4.2.1.1-0.1.el7

How reproducible:
I still don't know how to reproduce it, but the bug is there for sure. I have seen it multiple times but can't work out how to recreate it. danken and I just saw it together, so I'm reporting the bug now; I will try to find a way to reproduce it.
The severity of this is very high; a VM may end up connected to something other than what the user intended.

Yet we cannot proceed without a reproduction. Please try messing with vNICs, external networks, and their interconnections until you find it. Maybe you can even dig up old logs from the system where it last manifested itself.
(In reply to Dan Kenigsberg from comment #1)
> The severity of this is very high; a VM may end up connected to something
> other than what the user has intended.
>
> Yet we cannot proceed without reproduction. Please try messing with vNICs,
> external networks, and their interconnections until you find it. Maybe you
> can even dig old logs from the system where it manifested itself lately.

Although the severity is high, we don't know how to reproduce it; I have already tried. I will update once I have any news about this report.
Dan, I think I managed to reproduce this issue with an OVN localnet physnet. I'm almost sure this is the same issue, as it doesn't happen with regular networks and it reproduces 100% of the time with OVN localnet.

The flow is:
1) Create a new physnet network (data center network) with VLAN 162
2) Attach the physnet network to the host
3) Create a new OVN network + choose "create on external provider" + choose the data center network (the physnet network from step 1), without a subnet
4) Run a VM with an OVN network vNIC - the VM got an IP from VLAN 162
5) Shut down the VM
6) Edit the physnet network with a new VLAN tag, 163 - all changes applied successfully on the host
7) Start the VM

Result - the VM got an IP from VLAN 162 and not VLAN 163, so it smells exactly like this bug.

* NOTE - The exact same flow with a regular VLAN network (no OVN involved) works as expected.
Created attachment 1385629
engine log
Note that once in this situation, it's not possible to delete the network via the engine + provider:

2018-01-25 17:25:05,713 root Unable to delete network 86ff0d4b-2f25-4979-ab5c-bfbe8837482c. Ports exist for the network
Traceback (most recent call last):
  File "/usr/share/ovirt-provider-ovn/handlers/base_handler.py", line 131, in _handle_request
    method, path_parts, content)
  File "/usr/share/ovirt-provider-ovn/handlers/selecting_handler.py", line 175, in handle_request
    return self.call_response_handler(handler, content, parameters)
  File "/usr/share/ovirt-provider-ovn/handlers/neutron.py", line 36, in call_response_handler
    return response_handler(ovn_north, content, parameters)
  File "/usr/share/ovirt-provider-ovn/handlers/neutron_responses.py", line 117, in delete_network
    nb_db.delete_network(parameters[NETWORK_ID])
  File "/usr/share/ovirt-provider-ovn/ovndb/ovn_north.py", line 189, in delete_network
    % network_id
RestDataError
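The log message suggests that the provider refuses to delete a network while ports still reference it. A minimal sketch of that kind of guard follows, assuming a toy in-memory store: `InMemoryNorthDb`, its methods, and the local `RestDataError` class are hypothetical stand-ins for illustration, not the actual ovirt-provider-ovn code.

```python
class RestDataError(Exception):
    """Hypothetical stand-in for the provider's error type."""


class InMemoryNorthDb:
    """Toy model of a northbound DB: network id -> set of port ids."""

    def __init__(self):
        self.networks = {}

    def add_network(self, network_id):
        self.networks[network_id] = set()

    def add_port(self, network_id, port_id):
        self.networks[network_id].add(port_id)

    def delete_port(self, network_id, port_id):
        self.networks[network_id].discard(port_id)

    def delete_network(self, network_id):
        # Refuse to delete while any port still references the network.
        if self.networks[network_id]:
            raise RestDataError(
                'Unable to delete network %s. Ports exist for the network'
                % network_id
            )
        del self.networks[network_id]


db = InMemoryNorthDb()
db.add_network('net1')
db.add_port('net1', 'port1')

try:
    db.delete_network('net1')  # blocked: port1 still exists
    blocked = False
except RestDataError:
    blocked = True

db.delete_port('net1', 'port1')
db.delete_network('net1')  # succeeds once the port is gone
```

In this model, removing the port first is what unblocks the network deletion.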
Comment 3: Changing the VLAN of a physnet does not update the VLAN of the external networks defined on top of it. That's a known issue that may deserve a clear bug of its own.

Comment 5: I think it's OK that you cannot delete the network - you should first remove the vNIC that uses it.

If the only issue you see here is that of comment 3, you can rename this bug to cover it alone.
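One way the comment 3 behavior could arise is if the external network keeps its own copy of the VLAN tag, taken at creation time, so that later edits to the physnet never reach it. The toy model below only illustrates that assumption; the class and function names are hypothetical and say nothing about how Engine or the provider actually store the tag.

```python
from dataclasses import dataclass


@dataclass
class PhysNet:
    name: str
    vlan: int


@dataclass
class ExternalNetwork:
    name: str
    vlan: int  # copied once at creation, never refreshed (assumption)


def create_external_network(name, physnet):
    # The VLAN tag is copied by value, so there is no link back to physnet.
    return ExternalNetwork(name=name, vlan=physnet.vlan)


physnet = PhysNet('dc_net', 162)
ovn_net = create_external_network('ovn_net', physnet)

# Editing the physnet afterwards leaves the external network's tag stale.
physnet.vlan = 163
```

After the edit, `physnet.vlan` is 163 while `ovn_net.vlan` is still 162, which matches the symptom reported in comment 3.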
(In reply to Dan Kenigsberg from comment #6)
> Comment 3: changing the vlan of a physnet does not update the vlan of the
> external networks defined on top of it. that's a known issue, that may
> deserve a clear bug
>
> Comment 5: I think it's ok that you cannot delete the network - you should
> first remove the vNIC that uses it.
>
> if the only issue you see here is that of comment 3, you can rename this bug
> to cover it alone.

Comment 3 is still relevant indeed; do you want a new bug? I'm not sure it is exactly the original report. It's only one aspect of it, but we can't reproduce the original issue.

Comment 5: I can't reproduce it; I guess it's another bug hiding somewhere on the OVN side.
Yes, I think a new bug with a clear subject about comment 3 would help users. We can keep this bug for the other, mysterious case where Engine does not update the provider; we may eventually close it until we have steps for reproduction.
(In reply to Dan Kenigsberg from comment #8)
> yes, I think a new bug with a clear subject about comment 3 would help
> users. We can keep this bug for the other mystery occasion where Engine does
> not update the provider; we may eventually close it, until we have steps for
> reproduction.

ACK. This is the physnet VLAN bug - BZ 1573408
Let us reopen this if it reproduces in a clear fashion.