Bug 2002548

Summary: [Kuryr][3.11] Kuryr Controller never becomes ready on large scale environments
Product: OpenShift Container Platform
Reporter: Robin Cernin <rcernin>
Component: Networking
Assignee: Robin Cernin <rcernin>
Networking sub component: kuryr
QA Contact: Itzik Brown <itbrown>
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
CC: itbrown
Version: 3.11.0
Keywords: Triaged
Target Release: 3.11.z
Hardware: Unspecified
OS: Unspecified
Last Closed: 2021-10-28 15:58:21 UTC
Type: Bug

Description Robin Cernin 2021-09-09 07:56:12 UTC
The Kuryr Controller uses the OpenStack API to fetch ports. In large environments this call takes more than 30s and times out, so the controller never becomes ready. This patch switches the behaviour to use the quota client instead.
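A minimal sketch of the idea, not the actual patch: listing every port returns a response that grows with the environment, while a quota-details style response only carries counters. FakeNetworkClient, list_ports, quota_details and both helper functions are hypothetical names used for illustration only; they are not the Kuryr or OpenStack client API. The sketch also assumes that a negative limit means "unlimited".

```python
class FakeNetworkClient(object):
    """Hypothetical stand-in for a networking API client (illustration only)."""

    def __init__(self, port_count, port_limit):
        self._ports = [{"id": "port-%d" % i} for i in range(port_count)]
        self._limit = port_limit

    def list_ports(self):
        # Response size grows with the number of ports in the environment.
        return {"ports": list(self._ports)}

    def quota_details(self):
        # Constant-size response: only the counters are returned.
        return {"ports": {"limit": self._limit, "used": len(self._ports)}}


def has_free_ports_by_listing(client, limit):
    """Old approach: fetch every port just to count them."""
    used = len(client.list_ports()["ports"])
    return limit < 0 or used < limit


def has_free_ports_by_quota(client):
    """New approach: read usage counters from the quota details."""
    ports = client.quota_details()["ports"]
    # Assumption: a negative limit means no quota is enforced.
    return ports["limit"] < 0 or ports["used"] < ports["limit"]


client = FakeNetworkClient(port_count=3000, port_limit=5000)
print(has_free_ports_by_listing(client, 5000))
print(has_free_ports_by_quota(client))
```

Both helpers answer the same question; the difference is only how much data has to be fetched to answer it.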

Comment 4 Itzik Brown 2021-09-30 13:28:41 UTC
Checked with v3.11.524.
The Kuryr controller doesn't become ready after restart:
Neutron server returns request_ids: ['req-9e971162-e8d7-4f54-8aec-e6f343216428']
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging Traceback (most recent call last):
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/handlers/logging.py", line 37, in __call__
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     self._handler(event)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/handlers/retry.py", line 78, in __call__
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     self._handler(event)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/handlers/k8s_base.py", line 77, in __call__
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     self.on_deleted(obj)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/handlers/vif.py", line 207, in on_deleted
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     security_groups)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 1087, in release_vif
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     self._vif_drvs[vif_drv_alias].release_vif(pod, vif, *argv)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 117, in release_vif
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     self._drv_vif.release_vif(pod, vif, *argv)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/nested_vlan_vif.py", line 119, in release_vif
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     self._remove_subport(neutron, trunk_id, vif.id)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/nested_vlan_vif.py", line 234, in _remove_subport
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     self._remove_subports(neutron, trunk_id, [subport_id])
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/nested_vlan_vif.py", line 227, in _remove_subports
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     {'sub_ports': subports_body})
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 2117, in trunk_remove_subports
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     return self.put(self.subports_remove_path % (trunk), body=body)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 363, in put
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     headers=headers, params=params)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 331, in retry_request
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     headers=headers, params=params)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 294, in do_request
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     self._handle_fault_response(status_code, replybody, resp)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 269, in _handle_fault_response
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     exception_handler_v20(status_code, error_body)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 93, in exception_handler_v20
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     request_ids=request_ids)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging NotFound: SubPort on trunk 50178d16-7207-48f0-801f-25acddf580fb with parent port 39ad87fa-e476-4cde-9b6e-0b7b0d2302a8 could not be found.
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging Neutron server returns request_ids: ['req-9e971162-e8d7-4f54-8aec-e6f343216428']
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging
2021-09-30 11:37:50.202 1 INFO werkzeug [-] 192.168.99.21 - - [30/Sep/2021 11:37:50] "GET /alive HTTP/1.1" 500 -
2021-09-30 11:37:50.445 1 ERROR kuryr_kubernetes.controller.managers.health [-] Error when processing neutron request 'QuotaDetails' object has no attribute '__getitem__': TypeError: 'QuotaDetails' object has no attribute '__getitem__'
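The TypeError in the last log line is the message Python 2.7 produces when code subscripts an object (obj['key']) whose class does not define __getitem__. The sketch below reproduces that situation with a hypothetical stand-in class; QuotaDetailsLike and read_field are illustrative names, not the real QuotaDetails class and not the fix that shipped. It only shows one way a caller could tolerate both dict-style and attribute-style results.

```python
class QuotaDetailsLike(object):
    """Hypothetical stand-in: exposes its fields as attributes only."""

    def __init__(self, ports):
        self.ports = ports


def read_field(obj, name):
    """Read a field from either a dict or an attribute-style object."""
    if isinstance(obj, dict):
        return obj[name]
    return getattr(obj, name)


quota = QuotaDetailsLike(ports={"used": 3000, "limit": 5000})

try:
    quota["ports"]  # subscripting an object without __getitem__
except TypeError as exc:
    # Python 2.7 reports "... has no attribute '__getitem__'";
    # Python 3 reports "... is not subscriptable".
    print("TypeError: %s" % exc)

print(read_field(quota, "ports")["used"])
print(read_field({"ports": {"used": 1, "limit": 2}}, "ports")["used"])
```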

Comment 5 Itzik Brown 2021-10-05 12:17:57 UTC
I checked with v3.11.524 and the latest Kuryr controller image at that time (06beaa8865808)

Created a network and created 3000 ports on this network.
Deleted the controller pod and checked that a new one is created and ready.

Comment 7 Itzik Brown 2021-10-06 14:39:23 UTC
Verified with v3.11.525

Comment 10 errata-xmlrpc 2021-10-28 15:58:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 3.11.542 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3915