Bug 1547274

Summary: nova-compute will try to re-plug the vif even if it exists for vhostuser port
Product: Red Hat OpenStack
Reporter: Matt Flusche <mflusche>
Component: python-os-vif
Assignee: Sahid Ferdjaoui <sferdjao>
Status: CLOSED ERRATA
QA Contact: Joe H. Rahme <jhakimra>
Severity: urgent
Priority: urgent
Version: 10.0 (Newton)
CC: asoni, awaugama, berrange, dasmith, eglynn, glamb, jjoyce, jmelvin, jraju, jschluet, jthomas, kchamart, marjones, mbooth, mflusche, pablo.iranzo, sbauza, sferdjao, sgordon, slinaber, srevivo, tvignaud, vromanso, weiyongjun
Target Milestone: z8
Keywords: Triaged, ZStream
Target Release: 10.0 (Newton)
Hardware: x86_64
OS: Linux
Fixed In Version: python-os-vif-1.2.1-4.el7ost
Clone Of:
: 1552115 (view as bug list) Environment:
Last Closed: 2018-05-17 15:49:28 UTC
Type: Bug
Bug Blocks: 1552115, 1552119, 1552123

Description Matt Flusche 2018-02-20 22:17:32 UTC
Description of problem:
An OSP 10 customer is running into an issue that looks like this upstream bug:
https://bugs.launchpad.net/nova/+bug/1670628

Immediately after restarting nova-compute, we see the following in nova-compute.log:
2018-02-14 11:00:03.997 457631 INFO os_vif [req-2c196a54-e3c4-4c7b-bddd-6a3867e90ce1 - - - - -] Successfully plugged vif VIFVHostUser(active=True,address=ff:ff:ff:ff:ff:ff,has_traffic_filtering=False,id=104c20cf-cd37-4d6e-b1c6-a1a8e8805663,mode='client',network=Network(05ae462b-fdc9-41ee-9e9d-68b329066525),path='/var/run/openvswitch/vhu104c20cf-cd',plugin='ovs',port_profile=VIFPortProfileBase,preserve_on_delete=True,vif_name=<?>)

And then the following in openvswitch-agent.log:

2018-02-14 11:00:07.947 3916 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-751c6c87-f0b7-41e8-b9e4-bf6432438f8e - - - - -] Port 'vhu104c20cf-cd' has lost its vlan tag '1'!
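
To check other compute nodes for the same pattern, the two log excerpts above can be correlated by port name. The following is a minimal sketch, not part of the original report: the function name find_replug_outages, the 30-second window, and the assumption that the OVS port name equals the basename of the vhost-user socket path (as it does in the excerpts above) are all illustrative. The regular expressions only cover the two message formats quoted here.

```python
import re
from datetime import datetime

# Patterns cover only the two message formats quoted in this report.
PLUG_RE = re.compile(
    r"^(\S+ \S+) .*Successfully plugged vif VIFVHostUser\(.*path='[^']*/(vhu[^']+)'"
)
LOST_RE = re.compile(r"^(\S+ \S+) .*Port '([^']+)' has lost its vlan tag '([^']+)'")
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def find_replug_outages(nova_lines, agent_lines, window_seconds=30.0):
    """Return (port, vlan_tag, delay_seconds) for each port that lost its
    VLAN tag shortly after os-vif reported plugging it.

    Assumption: the port name is the basename of the vhost-user socket path.
    """
    plugged = {}
    for line in nova_lines:
        m = PLUG_RE.search(line)
        if m:
            plugged[m.group(2)] = datetime.strptime(m.group(1), TS_FORMAT)

    hits = []
    for line in agent_lines:
        m = LOST_RE.search(line)
        if not m:
            continue
        plugged_at = plugged.get(m.group(2))
        if plugged_at is None:
            continue
        lost_at = datetime.strptime(m.group(1), TS_FORMAT)
        delay = (lost_at - plugged_at).total_seconds()
        if 0 <= delay <= window_seconds:
            hits.append((m.group(2), m.group(3), delay))
    return hits


# The two log lines from this report, used as sample input.
NOVA_LINE = (
    "2018-02-14 11:00:03.997 457631 INFO os_vif "
    "[req-2c196a54-e3c4-4c7b-bddd-6a3867e90ce1 - - - - -] "
    "Successfully plugged vif VIFVHostUser(active=True,address=ff:ff:ff:ff:ff:ff,"
    "has_traffic_filtering=False,id=104c20cf-cd37-4d6e-b1c6-a1a8e8805663,"
    "mode='client',network=Network(05ae462b-fdc9-41ee-9e9d-68b329066525),"
    "path='/var/run/openvswitch/vhu104c20cf-cd',plugin='ovs',"
    "port_profile=VIFPortProfileBase,preserve_on_delete=True,vif_name=<?>)"
)
AGENT_LINE = (
    "2018-02-14 11:00:07.947 3916 INFO "
    "neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent "
    "[req-751c6c87-f0b7-41e8-b9e4-bf6432438f8e - - - - -] "
    "Port 'vhu104c20cf-cd' has lost its vlan tag '1'!"
)

if __name__ == "__main__":
    for port, tag, delay in find_replug_outages([NOVA_LINE], [AGENT_LINE]):
        print("%s lost vlan tag %s %.2fs after plug" % (port, tag, delay))
```

With the two lines above as input, the sketch reports port vhu104c20cf-cd, with the tag lost roughly four seconds after the plug message.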


Version-Release number of selected component (if applicable):
python-nova-14.0.8-2.el7ost.noarch



How reproducible:
100% in customer environment


Steps to Reproduce:
1. Restart nova-compute.
2. nova-compute re-plugs the vif and the instance loses networking.
3. The neutron agent detects the missing VLAN tag and rebuilds the port.

Actual results:
Instances with dpdk/vhostuser interfaces suffer a network outage when nova-compute is restarted.


Expected results:
Restarting nova-compute should not impact running instances

Additional info:
Will provide links for additional log details

Comment 2 Sahid Ferdjaoui 2018-02-21 08:00:06 UTC
It seems that we should change the component of this issue to os-vif.

The issue is valid. When we call OVS to create the port, we first run '--if-exists del-port', which deletes the port if it already exists. In OSP9 and OSP10 we use the port type dpdkvhostuser, so deleting the port causes the instance to lose connectivity.

From OSP11 onward we use dpdkvhostuserclient and pass the socket name to OVS, so deleting the port should not be a problem there.

I am not sure the fix will be accepted upstream, since dpdkvhostuser is deprecated, but if it is not we can still provide a downstream-only fix for OSP9 and OSP10.
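
To make the difference concrete, here is a minimal sketch of the two ovs-vsctl invocations. It is not the os-vif code and not the patch shipped in the erratum: the helper names and the bridge name br-int are hypothetical, and using '--may-exist add-port' is only one possible way to avoid the delete. The sketch only builds argument lists and never runs ovs-vsctl.

```python
import shlex


def replug_cmd(bridge, dev, iface_type):
    """Delete-then-add, the pattern described in the comment above.

    An existing port is removed first, so a running guest attached to a
    dpdkvhostuser port loses its connection.
    """
    return [
        "ovs-vsctl",
        "--", "--if-exists", "del-port", dev,
        "--", "add-port", bridge, dev,
        "--", "set", "Interface", dev, "type=%s" % iface_type,
    ]


def idempotent_cmd(bridge, dev, iface_type):
    """Add the port only if it is missing (hypothetical alternative).

    With --may-exist, add-port does nothing when the port is already
    on the bridge, so an existing port is left in place.
    """
    return [
        "ovs-vsctl",
        "--", "--may-exist", "add-port", bridge, dev,
        "--", "set", "Interface", dev, "type=%s" % iface_type,
    ]


def render(cmd):
    """Render an argument list as a shell-safe string for logging."""
    return " ".join(shlex.quote(arg) for arg in cmd)


if __name__ == "__main__":
    # "br-int" is an assumed bridge name; the port name is from this report.
    print(render(replug_cmd("br-int", "vhu104c20cf-cd", "dpdkvhostuser")))
    print(render(idempotent_cmd("br-int", "vhu104c20cf-cd", "dpdkvhostuser")))
```

The second form still re-applies the Interface settings on every call, which is harmless when they are unchanged, but it never removes a port that a guest is already using.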

Comment 36 errata-xmlrpc 2018-05-17 15:49:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1596