Description of problem:

On large hypervisors (3 TB RAM) in very dynamic environments (instances are continuously created and deleted at a fast pace), instance creation sometimes fails with "VirtualInterfaceCreateException: Virtual Interface creation failed". The error affects a large number of instances and persists for a while, then disappears, and instances are created without errors again.

The following appears in the nova log:

~~~
ERROR nova.compute.manager [instance: <uuid>]
ERROR nova.compute.manager [req-<uuid> - - -] [instance: <uuid>] Failed to allocate network(s)
ERROR nova.compute.manager [instance: <uuid>] Traceback (most recent call last):
ERROR nova.compute.manager [instance: <uuid>]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1928, in _build_and_run_instance
ERROR nova.compute.manager [instance: <uuid>]     block_device_info=block_device_info)
ERROR nova.compute.manager [instance: <uuid>]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2674, in spawn
ERROR nova.compute.manager [instance: <uuid>]     destroy_disks_on_failure=True)
ERROR nova.compute.manager [instance: <uuid>]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5005, in _create_domain_and_network
ERROR nova.compute.manager [instance: <uuid>]     raise exception.VirtualInterfaceCreateException()
ERROR nova.compute.manager [instance: <uuid>] VirtualInterfaceCreateException: Virtual Interface creation failed
ERROR nova.compute.manager [instance: <uuid>]
ERROR nova.compute.manager [req-<uuid> - - -] [instance: <uuid>] Build of instance <uuid> aborted: Failed to allocate the network(s), not rescheduling.
ERROR nova.compute.manager [instance: <uuid>] Traceback (most recent call last):
ERROR nova.compute.manager [instance: <uuid>]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1787, in _do_build_and_run_instance
ERROR nova.compute.manager [instance: <uuid>]     filter_properties)
ERROR nova.compute.manager [instance: <uuid>]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1968, in _build_and_run_instance
ERROR nova.compute.manager [instance: <uuid>]     reason=msg)
ERROR nova.compute.manager [instance: <uuid>] BuildAbortException: Build of instance <uuid> aborted: Failed to allocate the network(s), not rescheduling.
~~~

Overall, the symptoms suggest that rpc_loop runs for so long mainly because the system is starved for resources (instances vs. host) and hits a timeout (long port deletions):

~~~
Loop iteration exceeded interval (2 vs. 4306.8187561)!
~~~

A workaround is to change the CPU allocation policy on the hypervisors so that the 'isolcpus' option is not used, after which the VirtualInterfaceCreateException errors are gone:
https://access.redhat.com/solutions/2884991

Version-Release number of selected component (if applicable):
- RHOSP 10
- openstack-neutron-common-9.4.1-5.el7ost.noarch
- openstack-neutron-openvswitch-9.4.1-5.el7ost.noarch
- python-neutron-9.4.1-5.el7ost.noarch
- python-neutron-lib-0.4.0-1.el7ost.noarch
- python-neutronclient-6.0.1-1.el7ost.noarch
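
Additional info:

To gauge how often and how badly rpc_loop overruns its polling interval, the "Loop iteration exceeded interval" messages quoted above can be pulled out of the agent log. The following is only a rough sketch: it assumes the message text matches the line in this report exactly, and the helper names (`find_overruns`, `worst_overrun`) are made up for illustration and are not part of neutron.

```python
import re

# Matches the message quoted in this report, e.g.
# "Loop iteration exceeded interval (2 vs. 4306.8187561)!"
# Assumption: other releases may word this message differently.
OVERRUN_RE = re.compile(
    r"Loop iteration exceeded interval "
    r"\((?P<interval>[\d.]+) vs\. (?P<elapsed>[\d.]+)\)!"
)


def find_overruns(lines):
    """Yield (interval, elapsed) pairs in seconds for each overrun message."""
    for line in lines:
        match = OVERRUN_RE.search(line)
        if match:
            yield float(match.group("interval")), float(match.group("elapsed"))


def worst_overrun(lines):
    """Return the pair with the largest elapsed time, or None if none found."""
    return max(find_overruns(lines), key=lambda pair: pair[1], default=None)
```

Passing an open log file (for example the Open vSwitch agent log; its location on a given deployment is an assumption to verify) to `worst_overrun` gives the longest loop iteration seen, which can then be compared against the time window in which instance builds failed.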
Just wanted to make sure needinfo was set so this is visible to Pablo.
Adding fixed-in version as this issue should already be fixed.
According to our records, this should be resolved by openstack-neutron-9.4.1-32.el7ost. This build is available now.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2019:0916