Bug 1552545
Summary: | OVN fails to update port to active, fails spawning VMs | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Eran Kuris <ekuris> |
Component: | python-networking-ovn | Assignee: | Lucas Alvares Gomes <lmartins> |
Status: | CLOSED ERRATA | QA Contact: | Eran Kuris <ekuris> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 13.0 (Queens) | CC: | abregman, amuller, apevec, bcafarel, dalvarez, jschluet, lhh, lmartins, majopela, nyechiel |
Target Milestone: | beta | Keywords: | AutomationBlocker, Triaged |
Target Release: | 13.0 (Queens) | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | python-networking-ovn-4.0.1-0.20180420150809.c7c16d4.el7ost | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2018-06-27 13:35:05 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Eran Kuris
2018-03-07 10:15:59 UTC
cat /etc/yum.repos.d/latest-installed 13 -p 2018-04-10.2 Manual steps to reproduce on OVN-HA with DVR and VLAN configuration: 1.openstack network create net-64-2 2.openstack subnet create --subnet-range 10.0.2.0/24 --network net-64-2 --dhcp subnet_4_2 3.openstack network create net-64-1 4.openstack subnet create --subnet-range 10.0.1.0/24 --network net-64-1 --dhcp subnet_4_1 5.openstack router create Router_eNet 6.openstack router add subnet Router_eNet subnet_4_2 7.openstack router add subnet Router_eNet subnet_4_1 8.openstack router set --external-gateway nova Router_eNet in nova-compute there is error: 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [req-54f38f84-ef69-4e5c-832e-62a4bde2c526 dd50544ada704bd28354e42a36b9137d a3b8b186235a4d1cbffc9305765e9888 - default default] [instance: 84d16a42-8ab3-4b30-836b-c161d95ddd82] Instance failed to spawn: VirtualInterfaceCreateException: Virtual Interface creation failed 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [instance: 84d16a42-8ab3-4b30-836b-c161d95ddd82] Traceback (most recent call last): 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [instance: 84d16a42-8ab3-4b30-836b-c161d95ddd82] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2238, in _build_resources 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [instance: 84d16a42-8ab3-4b30-836b-c161d95ddd82] yield resources 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [instance: 84d16a42-8ab3-4b30-836b-c161d95ddd82] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2022, in _build_and_run_instance 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [instance: 84d16a42-8ab3-4b30-836b-c161d95ddd82] block_device_info=block_device_info) 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [instance: 84d16a42-8ab3-4b30-836b-c161d95ddd82] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3066, in spawn 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [instance: 84d16a42-8ab3-4b30-836b-c161d95ddd82] destroy_disks_on_failure=True) 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [instance: 84d16a42-8ab3-4b30-836b-c161d95ddd82] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5518, in _create_domain_and_network 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [instance: 84d16a42-8ab3-4b30-836b-c161d95ddd82] raise exception.VirtualInterfaceCreateException() 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [instance: 84d16a42-8ab3-4b30-836b-c161d95ddd82] VirtualInterfaceCreateException: Virtual Interface creation failed (In reply to Eran Kuris from comment #4) > cat /etc/yum.repos.d/latest-installed > 13 -p 2018-04-10.2 > > Manual steps to reproduce on OVN-HA with DVR and VLAN configuration: > > 1.openstack network create net-64-2 > 2.openstack subnet create --subnet-range 10.0.2.0/24 --network net-64-2 > --dhcp subnet_4_2 > 3.openstack network create net-64-1 > 4.openstack subnet create --subnet-range 10.0.1.0/24 --network net-64-1 > --dhcp subnet_4_1 > 5.openstack router create Router_eNet > 6.openstack router add subnet Router_eNet subnet_4_2 > 7.openstack router add subnet Router_eNet subnet_4_1 > 8.openstack router set --external-gateway nova Router_eNet > > > in nova-compute there is error: > 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager > [req-54f38f84-ef69-4e5c-832e-62a4bde2c526 dd50544ada704bd28354e42a36b9137d > a3b8b186235a4d1cbffc9305765e9888 - default default] [instance: > 84d16a42-8ab3-4b30-836b-c161d95ddd82] Instance failed to spawn: > VirtualInterfaceCreateException: Virtual Interface creation failed > 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [instance: > 84d16a42-8ab3-4b30-836b-c161d95ddd82] Traceback (most recent call last): > 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [instance: > 84d16a42-8ab3-4b30-836b-c161d95ddd82] File > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2238, in > _build_resources > 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [instance: > 84d16a42-8ab3-4b30-836b-c161d95ddd82] yield resources > 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [instance: > 84d16a42-8ab3-4b30-836b-c161d95ddd82] File > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2022, in > _build_and_run_instance > 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [instance: > 84d16a42-8ab3-4b30-836b-c161d95ddd82] > block_device_info=block_device_info) > 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [instance: > 84d16a42-8ab3-4b30-836b-c161d95ddd82] File > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3066, > in spawn > 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [instance: > 84d16a42-8ab3-4b30-836b-c161d95ddd82] destroy_disks_on_failure=True) > 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [instance: > 84d16a42-8ab3-4b30-836b-c161d95ddd82] File > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5518, > in _create_domain_and_network > 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [instance: > 84d16a42-8ab3-4b30-836b-c161d95ddd82] raise > exception.VirtualInterfaceCreateException() > 2018-04-17 07:16:13.354 1 ERROR nova.compute.manager [instance: > 84d16a42-8ab3-4b30-836b-c161d95ddd82] VirtualInterfaceCreateException: > Virtual Interface creation failed Adding more to Eran's comment here. I was investigating this a problem at Eran's environment today and I noticed that the VirtualInterfaceCreateException was being raised due to a timeout waiting on the Neutron port to become UP [0]. Tracing the logs in Neutron to figure out why the port was marked as DOWN I found this error in one of the controllers node referent to the port UUID of the instance that I was tracing [1]. The IDL from OVSDB monitor was raising [1] indicating that it didn't have a valid lock [2]. That error also prevented the set_port_state_up() method to be invoked [3] which would then mark the neutron provision block as completed [4] and change the port status to UP. Note that, only one controller received the notification to set the status for that specific port to UP, I grepped the other two controllers looking for the "OVN reports status up for port: <port uuid>" log message but it didn't have any output. This seems to be a problem with the locking in the python code for OVSDB. This patch upstream might be relevant for this bug: https://review.openstack.org/#/c/560295/ [0] https://github.com/openstack/nova/blob/stable/queens/nova/virt/libvirt/driver.py#L5580-L5593 [1] http://paste.openstack.org/show/719365/ [2] A OVSDB lock is used to make sure we only have 1 instance of OVSDB Monitor running in the entire cluster [3] https://github.com/openstack/networking-ovn/blob/df880435d3a7e98aaa8a0ba3a0a9b81286129586/networking_ovn/ovsdb/ovsdb_monitor.py#L106 [4] https://github.com/openstack/networking-ovn/blob/df880435d3a7e98aaa8a0ba3a0a9b81286129586/networking_ovn/ml2/mech_driver.py#L707-L711 Hi Eran, This patch here should fix this problem and unblock you https://review.openstack.org/#/c/561995 I will backport it to OSP 13.0 as soon as we get the fix merged! It's a was a tricky problem... patch proposed to stable/queens upstream and is merged on master. Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:2086 |