Bug 2053137

Summary: Load balancer creation often failing due to logical switch not found
Product: Red Hat OpenStack Reporter: Maysa Macedo <mdemaced>
Component: python-ovn-octavia-provider Assignee: Fernando Royo <froyo>
Status: CLOSED ERRATA QA Contact: Eran Kuris <ekuris>
Severity: high Docs Contact:
Priority: high    
Version: 16.2 (Train) CC: itbrown, juriarte, ltomasbo, mdulko
Target Milestone: ga Keywords: Triaged
Target Release: 17.0 Flags: mdemaced: needinfo-
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: python-ovn-octavia-provider-1.0.1-0.20220330170821.c16570e.el9ost Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
: 2057423 Environment:
Last Closed: 2022-09-21 12:18:59 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2057423, 2057424    

Description Maysa Macedo 2022-02-10 15:34:11 UTC
Description of problem:

Kuryr connects all the OpenShift Namespaces (Neutron Subnets), Services (Load Balancers), and Nodes (Servers) to a single Router. It appears that a load-balancer creation can be triggered while multiple Subnets are being deleted, which raises a "logical switch not found" exception and moves the load balancer to the ERROR state. Kuryr keeps retrying to recreate the load balancer, but the same issue can happen again.

2022-02-10 14:48:49.115 16 ERROR ovsdbapp.backend.ovs_idl.transaction [-] Traceback (most recent call last):
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver [-] Exception occurred during creation of loadbalancer: ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Logical_Switch with name=neutron-6fa06cae-0145-4571-9919-0541f0bea93a
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver Traceback (most recent call last):
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver   File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 111, in transaction
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver     yield self._nested_txns_map[cur_thread_id]
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver KeyError: 139860903978752
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver 
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver During handling of the above exception, another exception occurred:
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver 
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver Traceback (most recent call last):
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver   File "/usr/lib/python3.6/site-packages/networking_ovn/octavia/ovn_driver.py", line 1033, in lb_create
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver     self._execute_commands(commands)
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver   File "/usr/lib/python3.6/site-packages/networking_ovn/octavia/ovn_driver.py", line 626, in _execute_commands
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver     txn.add(command)
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver   File "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver     next(self.gen)
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver   File "/usr/lib/python3.6/site-packages/networking_ovn/ovsdb/impl_idl_ovn.py", line 252, in transaction
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver     yield t
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver   File "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver     next(self.gen)
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver   File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 119, in transaction
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver     del self._nested_txns_map[cur_thread_id]
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver   File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 69, in __exit__
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver     self.result = self.commit()
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver   File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 62, in commit
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver     raise result.ex
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver   File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", line 128, in run
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver     txn.results.put(txn.do_commit())
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver   File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 86, in do_commit
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver     command.run_idl(txn)
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver   File "/usr/lib/python3.6/site-packages/ovsdbapp/schema/ovn_northbound/commands.py", line 1159, in run_idl
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver     ls = self.api.lookup('Logical_Switch', self.switch)
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver   File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", line 172, in lookup
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver     return self._lookup(table, record)
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver   File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", line 215, in _lookup
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver     row = idlutils.row_by_value(self, rl.table, rl.column, record)
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver   File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/idlutils.py", line 130, in row_by_value
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver     raise RowNotFound(table=table, col=column, match=match)
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Logical_Switch with name=neutron-6fa06cae-0145-4571-9919-0541f0bea93a
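
For illustration only, the suspected race can be sketched with a small in-memory stand-in for the OVN Northbound database. Everything below is hypothetical: `FakeNorthbound`, `attach_lb`, the `skip_missing` flag, and the local `RowNotFound` class are simplified stand-ins, not the ovsdbapp or networking-ovn API, and the tolerant variant is just one possible way to handle the race, not necessarily what the shipped fix does.

```python
class RowNotFound(Exception):
    """Hypothetical stand-in for ovsdbapp's RowNotFound."""

    def __init__(self, table, col, match):
        super().__init__(f"Cannot find {table} with {col}={match}")


class FakeNorthbound:
    """In-memory stand-in: maps logical switch name -> set of LB names."""

    def __init__(self, switches):
        self.switches = {name: set() for name in switches}

    def lookup(self, name):
        try:
            return self.switches[name]
        except KeyError:
            raise RowNotFound("Logical_Switch", "name", name) from None


def attach_lb(nb, lb_name, switch_names, skip_missing=False):
    """Resolve every switch first, then attach the LB (all-or-nothing)."""
    rows = []
    for name in switch_names:
        try:
            rows.append((name, nb.lookup(name)))
        except RowNotFound:
            if not skip_missing:
                raise  # the whole "transaction" fails, nothing is applied
    for _, row in rows:
        row.add(lb_name)
    return [name for name, _ in rows]


nb = FakeNorthbound(["neutron-a", "neutron-b"])
# The list of switches is computed from the router's subnets...
commands = ["neutron-a", "neutron-b"]
# ...and one subnet is deleted before the commands are committed.
del nb.switches["neutron-b"]

try:
    attach_lb(nb, "lb1", commands)
    strict_result = "ACTIVE"
except RowNotFound as exc:
    strict_result = "ERROR"
    strict_message = str(exc)

# Tolerant variant: a switch that vanished in the meantime is skipped.
tolerant_attached = attach_lb(nb, "lb1", commands, skip_missing=True)
```

In the strict case the lookup failure aborts the whole operation, which mirrors the load balancer ending up in ERROR; in the tolerant case the load balancer is still attached to the switches that remain.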

Version-Release number of selected component (if applicable):

Red Hat OpenStack Platform release 16.2.1 GA (Train) - 16.2_20211110.2

ovn-2021-21.09.0-20.el8fdp.x86_64
rhosp-ovn-2021-4.el8ost.1.noarch
ovn-2021-host-21.09.0-20.el8fdp.x86_64
rhosp-ovn-host-2021-4.el8ost.1.noarch

OCP 4.8.0-0.nightly-2022-02-07-190543

How reproducible:

Steps to Reproduce:
1. Run Kubernetes conformance tests

Actual results:


Expected results:


Additional info:

Comment 12 Jon Uriarte 2022-08-31 15:14:07 UTC
Verified in RHOS-17.0-RHEL-9-20220825.n.1 after running Kubernetes conformance tests on top of OCP 4.11.2 with Kuryr.

Another issue was noticed, though, and has been reported as bug 2123014.

Note that Swift is not working:
$ openstack container list
Service Unavailable (HTTP 503)

but the installation progressed successfully.

Result of 4.11 conformance tests [1]:
error: 310 fail, 1100 pass, 1709 skip (4h36m51s)

[1] http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/j2pg/DFG-osasinfra-shiftstack_ci-ocp_testing/366/

Comment 16 errata-xmlrpc 2022-09-21 12:18:59 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.0 (Wallaby)), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:6543