Bug 2123014

Summary: Unable to complete OVSDB transaction due to timeout on the TRY_AGAIN retries
Product: Red Hat OpenStack Reporter: Jon Uriarte <juriarte>
Component: python-ovsdbappAssignee: Terry Wilson <twilson>
Status: CLOSED ERRATA QA Contact: Alex Katz <akatz>
Severity: high Docs Contact:
Priority: high    
Version: 17.0 (Wallaby)CC: akatz, chjones, chrisw, jniu, jschluet, mdemaced, mdulko, scohen, spower, twilson
Target Milestone: gaKeywords: Triaged
Target Release: 17.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: python-ovsdbapp-1.9.3-0.20220630180356.f804411.el9ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2128911 (view as bug list) Environment:
Last Closed: 2022-09-21 12:24:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2128911, 2128914    

Description Jon Uriarte 2022-08-31 15:00:07 UTC
Description of problem:

While verifying bug 2053137 a different issue has been noticed because some LBs remain in pending-delete status.

ovn-octavia is unable to perform some transactions into OVN DB leading load balancers to pending-xx status.

2022-08-30 17:53:14.359 9 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] OVSDB transaction returned TRY_AGAIN, retrying do_commit /usr/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:107                               
[...]
2022-08-30 17:53:22.366 9 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=12 command(idx=0): LsLbAddCommand(switch=c8622209-e75d-4071-b899-f9f45b3f2213, lb=32be21f2-e7f8-49c8-aa92-29650b84b3ae, may_exist=True) do_commit /usr/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:90
2022-08-30 17:53:22.367 9 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] OVSDB transaction returned TRY_AGAIN, retrying do_commit /usr/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:107 
[...]
2022-08-30 17:55:06.517 9 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=25 command(idx=0): LsLbAddCommand(switch=c8622209-e75d-4071-b899-f9f45b3f2213, lb=32be21f2-e7f8-49c8-aa92-29650b84b3ae, may_exist=True) do_commit /usr/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:90
2022-08-30 17:55:06.518 9 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] OVSDB transaction returned TRY_AGAIN, retrying do_commit /usr/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:107                               
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper [-] Unexpected exception in request_handler: ovsdbapp.exceptions.TimeoutException: Commands [LsLbAddCommand(switch=c8622209-e75d-4071-b899-f9f45b3f2213, lb=32be21f2-e7f8-49c8-aa9
2-29650b84b3ae, may_exist=True)] exceeded timeout 180 seconds, cause: Result queue is empty
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper Traceback (most recent call last):
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper   File "/usr/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 55, in commit                                                                            
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper     result = self.results.get(timeout=self.timeout)
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper   File "/usr/lib64/python3.9/queue.py", line 179, in get
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper     raise Empty
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper _queue.Empty
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper The above exception was the direct cause of the following exception:                                                                                                             
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper Traceback (most recent call last):
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper   File "/usr/lib/python3.9/site-packages/ovn_octavia_provider/helper.py", line 345, in request_handler                                                                           
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper     status = request_handler(request['info'])
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper   File "/usr/lib/python3.9/site-packages/ovn_octavia_provider/helper.py", line 264, in lb_create_lrp_assoc                                                                       
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper     self._update_lb_to_ls_association(
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper   File "/usr/lib/python3.9/site-packages/tenacity/__init__.py", line 333, in wrapped_f                                                                                           
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper     return self(f, *args, **kw)
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper   File "/usr/lib/python3.9/site-packages/tenacity/__init__.py", line 423, in __call__                                                                                            
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper     do = self.iter(retry_state=retry_state)
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper   File "/usr/lib/python3.9/site-packages/tenacity/__init__.py", line 360, in iter                                                                                                
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper     return fut.result()
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper   File "/usr/lib64/python3.9/concurrent/futures/_base.py", line 439, in result                                                                                                   
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper     return self.__get_result()
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper   File "/usr/lib64/python3.9/concurrent/futures/_base.py", line 391, in __get_result                                                                                             
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper     raise self._exception
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper   File "/usr/lib/python3.9/site-packages/tenacity/__init__.py", line 426, in __call__                                                                                            
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper     result = fn(*args, **kwargs)
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper   File "/usr/lib/python3.9/site-packages/ovn_octavia_provider/helper.py", line 520, in _update_lb_to_ls_association                                                              
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper     self._execute_commands(commands)
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper   File "/usr/lib/python3.9/site-packages/ovn_octavia_provider/helper.py", line 505, in _execute_commands                                                                         
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper     txn.add(command)
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper   File "/usr/lib64/python3.9/contextlib.py", line 126, in __exit__                                                                                                               
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper     next(self.gen)
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper   File "/usr/lib/python3.9/site-packages/ovn_octavia_provider/ovsdb/impl_idl_ovn.py", line 165, in transaction                                                                   
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper     yield t
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper   File "/usr/lib64/python3.9/contextlib.py", line 126, in __exit__                                                                                                               
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper     next(self.gen)
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper   File "/usr/lib/python3.9/site-packages/ovsdbapp/api.py", line 110, in transaction                                                                                              
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper     del self._nested_txns_map[cur_thread_id]
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper   File "/usr/lib/python3.9/site-packages/ovsdbapp/api.py", line 61, in __exit__                                                                                                  
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper     self.result = self.commit()
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper   File "/usr/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 57, in commit                                                                            
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper     raise exceptions.TimeoutException(
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper ovsdbapp.exceptions.TimeoutException: Commands [LsLbAddCommand(switch=c8622209-e75d-4071-b899-f9f45b3f2213, lb=32be21f2-e7f8-49c8-aa92-29650b84b3ae, may_exist=True)] exceeded tim
eout 180 seconds, cause: Result queue is empty
2022-08-30 17:55:11.221 9 ERROR ovn_octavia_provider.helper


It looks like the issue happens after a leader election:
(from /var//log/containers/openvswitch/ovsdb-server-nb.log.1)
[...]
2022-08-30T17:52:10.543Z|05784|raft|INFO|received leadership transfer from 07d8 in term 47
2022-08-30T17:52:10.543Z|05785|raft|INFO|term 48: starting election
[...]
2022-08-30T17:52:10.645Z|05799|raft|INFO|term 48: elected leader by 2+ of 3 servers
[...]

Version-Release number of selected component (if applicable):

OSP RHOS-17.0-RHEL-9-20220825.n.1


How reproducible: always (two times in different environment)


Steps to Reproduce:
1. Install OCP 4.11 with kuryr sdn
2. Run kubernetes conformance tests

Actual results: LBs in pending_delete status


Expected results: LBs correctly updated and deleted


Additional info: not seen in 16.x for now

Comment 15 errata-xmlrpc 2022-09-21 12:24:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.0 (Wallaby)), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:6543