Bug 1980436

Summary: Neutron misses OVSDB events under load
Product: Red Hat OpenStack
Reporter: Asma Syed Hameed <asyedham>
Component: openstack-neutron
Assignee: Miro Tomaska <mtomaska>
Status: ASSIGNED
QA Contact: Eran Kuris <ekuris>
Severity: high
Priority: high
Version: 16.1 (Train)
CC: alifshit, chrisw, dasmith, egarciar, eglynn, froyo, jappleii, jhakimra, jlibosva, jraju, kchamart, mlavalle, ralonsoh, sbauza, scohen, sgordon, vkommadi, vromanso
Hardware: Unspecified
OS: Unspecified
Type: Bug

Comment 1 Artom Lifshitz 2021-07-09 19:47:36 UTC
While this is manifesting itself in Nova, I'm fairly certain the root cause is in Neutron/OVN. Nova waits for Neutron to tell it that the vif has been plugged by sending a so-called "external event". It's basically Neutron making a POST request to the Nova API telling it "I've plugged this port." In this case, Nova never receives those events, so it fails the VM creation.
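
For readers unfamiliar with the mechanism, here is a minimal sketch of the request body for such an external event, as accepted by Nova's os-server-external-events API. The helper name and the placeholder UUIDs are hypothetical, and the sketch only builds the JSON payload; it does not contact any service.

```python
import json


def build_vif_plugged_event(server_uuid: str, port_id: str) -> dict:
    """Build the body Neutron would POST to Nova's /os-server-external-events.

    Hypothetical helper for illustration; only the payload shape follows
    the Nova API.
    """
    return {
        "events": [
            {
                "name": "network-vif-plugged",  # event type Nova waits for
                "server_uuid": server_uuid,     # instance the port belongs to
                "tag": port_id,                 # Neutron port that was plugged
                "status": "completed",
            }
        ]
    }


# Placeholder identifiers, not taken from this bug's logs.
body = build_vif_plugged_event(
    "11111111-2222-3333-4444-555555555555",
    "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
)
print(json.dumps(body, indent=2))
```

If Nova receives no event like this for a port before vif_plugging_timeout expires, the instance build fails, which matches the symptom reported here.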

Comment 2 anil venkata 2021-07-12 08:20:25 UTC
Artom, why is Nova timing out after 300 seconds even though we increased the timeout to 1200 (i.e. vif_plugging_timeout=1200)?

Comment 3 Artom Lifshitz 2021-07-13 09:01:19 UTC
(In reply to anil venkata from comment #2)
> Artom, why is Nova timing out after 300 seconds even though we increased
> the timeout to 1200 (i.e. vif_plugging_timeout=1200)?

Where did you make this change? This should normally be done on the computes, and nova_compute has to be restarted after the change for it to take effect.

Comment 4 anil venkata 2021-07-13 10:55:09 UTC
Yes, we added it on the compute node and restarted nova_compute.

Comment 6 Artom Lifshitz 2021-07-19 10:13:19 UTC
(In reply to anil venkata from comment #4)
> Yes, we added it on the compute node and restarted nova_compute.

I'm not sure what to tell you. I've tried searching through the logs for any mention of vif_plugging_timeout, but without success. Any Nova service should log its configuration when it starts, so the fact that I couldn't find it logged anywhere would indicate that perhaps it wasn't restarted?

To be explicit, the option is in the [DEFAULT] section in nova.conf that's shared between the nova_libvirt and nova_compute containers. So to set it, you need to edit the file that's somewhere in /var/lib/puppet-generated for the nova_libvirt containers (apologies for not providing the exact path, I don't have a 16.1 environment handy to double check).
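
One way to double check which value a given nova.conf would yield is to parse it and read the option from [DEFAULT]. The sketch below is only an illustration: it uses Python's configparser rather than oslo.config, the sample text stands in for the real file (whose exact path is not given here), and the function name is hypothetical. The fallback of 300 is Nova's default for vif_plugging_timeout.

```python
import configparser

# Stand-in for the contents of the real nova.conf (assumption for this sketch).
SAMPLE_NOVA_CONF = """
[DEFAULT]
vif_plugging_timeout = 1200
vif_plugging_is_fatal = true
"""


def effective_vif_plugging_timeout(conf_text: str, default: int = 300) -> int:
    """Return vif_plugging_timeout from [DEFAULT], or the default if unset."""
    # interpolation=None: treat '%' in values literally.
    # strict=False: tolerate duplicate options instead of raising.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.getint("DEFAULT", "vif_plugging_timeout", fallback=default)


print(effective_vif_plugging_timeout(SAMPLE_NOVA_CONF))
print(effective_vif_plugging_timeout("[DEFAULT]\n"))
```

If the second case applies to the file the container actually reads, Nova would still time out after 300 seconds, which is consistent with what is being observed.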

I hope that helps a bit.