Description of problem:
An OSP 10 customer is running into an issue that looks like this upstream bug: https://bugs.launchpad.net/nova/+bug/1670628

Immediately after restarting nova-compute we see the following in nova-compute.log:

2018-02-14 11:00:03.997 457631 INFO os_vif [req-2c196a54-e3c4-4c7b-bddd-6a3867e90ce1 - - - - -] Successfully plugged vif VIFVHostUser(active=True,address=ff:ff:ff:ff:ff:ff,has_traffic_filtering=False,id=104c20cf-cd37-4d6e-b1c6-a1a8e8805663,mode='client',network=Network(05ae462b-fdc9-41ee-9e9d-68b329066525),path='/var/run/openvswitch/vhu104c20cf-cd',plugin='ovs',port_profile=VIFPortProfileBase,preserve_on_delete=True,vif_name=<?>)

And then the following in openvswitch-agent.log:

2018-02-14 11:00:07.947 3916 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-751c6c87-f0b7-41e8-b9e4-bf6432438f8e - - - - -] Port 'vhu104c20cf-cd' has lost its vlan tag '1'!

Version-Release number of selected component (if applicable):
python-nova-14.0.8-2.el7ost.noarch

How reproducible:
100% in customer environment

Steps to Reproduce:
1. Restart nova-compute.
2. nova-compute re-plugs the vif and instances lose networking.
3. The neutron agent detects the missing vlan tag and rebuilds the port.

Actual results:
Instance network outage during restart of nova-compute for dpdk/vhostuser interfaces.

Expected results:
Restarting nova-compute should not impact running instances.

Additional info:
Will provide links for additional log details.
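To confirm that a node is exposed to this behaviour, the port type and vlan tag can be read straight out of OVSDB. A minimal diagnostic sketch, assuming ovs-vsctl is available on the compute node and reusing the port name from the logs above (substitute your own ports):

    # Check whether a vhost-user port uses the server-mode type and whether
    # its vlan tag is still set. Port name taken from the pasted logs.
    import subprocess

    def vsctl_get(table, record, column):
        """Read a single column from the OVS database via ovs-vsctl."""
        out = subprocess.run(
            ["ovs-vsctl", "get", table, record, column],
            check=True, capture_output=True, text=True)
        return out.stdout.strip()

    port = "vhu104c20cf-cd"
    # "dpdkvhostuser" (OVS is the vhost-user server) is the affected variant;
    # "dpdkvhostuserclient" reconnects to a QEMU-side socket instead.
    print("interface type:", vsctl_get("Interface", port, "type"))
    # Prints "[]" when the tag has been lost, matching the ovs-agent log line.
    print("port vlan tag: ", vsctl_get("Port", port, "tag"))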
It seems that we should change the component of this issue to os-vif. The issue is valid: when we call OVS to create the ports, we first run 'del-port' under an 'if-exists' condition, meaning the port is deleted before it is re-added. In OSP9 and OSP10 we use port type dpdkvhostuser, where OVS is the vhost-user server, so deleting the port tears down the socket and the instance loses connectivity. From OSP11 onward we use dpdkvhostuserclient and pass OVS the path of the QEMU-owned socket, so deleting and re-adding the port should not be a problem. Not sure the fix will be accepted upstream since dpdkvhostuser is deprecated, but if that does not work we could still provide a downstream-only fix for OSP9 and OSP10.
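For illustration only (this is not the actual os-vif code), the difference between the current delete-then-add behaviour and an idempotent plug can be sketched with plain ovs-vsctl calls; the bridge name br-int and the helper names are assumptions:

    # Minimal sketch contrasting destructive re-plug with an idempotent add.
    import subprocess

    def _vsctl(*args):
        """Run an ovs-vsctl command and return its stripped stdout."""
        return subprocess.run(
            ("ovs-vsctl",) + args, check=True,
            capture_output=True, text=True).stdout.strip()

    def replug_destructive(bridge, port):
        # Pattern described above: delete the port if it exists, then re-add it.
        # With type=dpdkvhostuser OVS owns the socket, so del-port drops the
        # guest's datapath until the port is recreated and re-tagged.
        _vsctl("--", "--if-exists", "del-port", bridge, port,
               "--", "add-port", bridge, port,
               "--", "set", "Interface", port, "type=dpdkvhostuser")

    def plug_idempotent(bridge, port):
        # Gentler alternative: --may-exist leaves an already-plugged port
        # (and its vlan tag) untouched instead of recreating it.
        _vsctl("--", "--may-exist", "add-port", bridge, port,
               "--", "set", "Interface", port, "type=dpdkvhostuser")

With dpdkvhostuserclient the same delete/add sequence is much less disruptive, since QEMU owns the socket and OVS simply reconnects to the configured vhost-server-path.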
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:1596