Bug 2002548 - [Kuryr][3.11] Kuryr Controller never becomes ready on large scale environments
Summary: [Kuryr][3.11] Kuryr Controller never becomes ready on large scale environments
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.11.z
Assignee: Robin Cernin
QA Contact: Itzik Brown
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-09-09 07:56 UTC by Robin Cernin
Modified: 2021-10-28 15:58 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-28 15:58:21 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift kuryr-kubernetes pull 558 | 0 | None | Merged | Bug 2002548: Fix health checks on large scale environments | 2021-09-14 04:31:43 UTC
Github | openshift kuryr-kubernetes pull 572 | 0 | None | open | Bug 2002548: Ensure quotas values are fetched as attributes | 2021-09-30 13:44:35 UTC
Red Hat Product Errata | RHSA-2021:3915 | 0 | None | None | None | 2021-10-28 15:58:27 UTC

Description Robin Cernin 2021-09-09 07:56:12 UTC
Kuryr Controller uses the OpenStack API to fetch ports. In large environments this request takes more than 30s and times out, so the controller never becomes ready. This patch switches the behaviour to use the quota client instead.
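
For illustration, a minimal sketch of the idea behind the change: a single quota lookup returns usage counters, so the check does not have to list every port. This is not the code from the pull request. FakeQuotaClient, port_quota() and has_free_ports() are hypothetical names, and treating a negative limit as "unlimited" is an assumption.

```python
# Illustrative only: FakeQuotaClient and its port_quota() method are
# hypothetical stand-ins, not Kuryr or OpenStack client APIs.


class FakeQuotaClient(object):
    """Returns per-project port quota usage in a single small response."""

    def __init__(self, quotas):
        self._quotas = quotas

    def port_quota(self, project_id):
        return self._quotas[project_id]


def has_free_ports(client, project_id):
    """Return True if the project can still create ports.

    One quota lookup replaces listing every port, so the cost does not
    grow with the number of ports in the project.
    """
    quota = client.port_quota(project_id)
    limit = quota["limit"]
    if limit < 0:  # assumption: a negative limit means "unlimited"
        return True
    return quota["used"] < limit


client = FakeQuotaClient({
    "proj-a": {"limit": 5000, "used": 3000},
    "proj-b": {"limit": 500, "used": 500},
    "proj-c": {"limit": -1, "used": 3000},
})

print(has_free_ports(client, "proj-a"))
print(has_free_ports(client, "proj-b"))
print(has_free_ports(client, "proj-c"))
```

The point of the design is that the response size stays constant regardless of how many ports exist, which is what matters when a plain port listing exceeds the 30s timeout.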

Comment 4 Itzik Brown 2021-09-30 13:28:41 UTC
Checked with v3.11.524
Kuryr controller doesn't become ready after restart 
Neutron server returns request_ids: ['req-9e971162-e8d7-4f54-8aec-e6f343216428']
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging Traceback (most recent call last):
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/handlers/logging.py", line 37, in __call__
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     self._handler(event)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/handlers/retry.py", line 78, in __call__
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     self._handler(event)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/handlers/k8s_base.py", line 77, in __call__
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     self.on_deleted(obj)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/handlers/vif.py", line 207, in on_deleted
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     security_groups)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 1087, in release_vif
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     self._vif_drvs[vif_drv_alias].release_vif(pod, vif, *argv)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 117, in release_vif
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     self._drv_vif.release_vif(pod, vif, *argv)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/nested_vlan_vif.py", line 119, in release_vif
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     self._remove_subport(neutron, trunk_id, vif.id)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/nested_vlan_vif.py", line 234, in _remove_subport
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     self._remove_subports(neutron, trunk_id, [subport_id])
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/nested_vlan_vif.py", line 227, in _remove_subports
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     {'sub_ports': subports_body})
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 2117, in trunk_remove_subports
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     return self.put(self.subports_remove_path % (trunk), body=body)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 363, in put
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     headers=headers, params=params)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 331, in retry_request
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     headers=headers, params=params)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 294, in do_request
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     self._handle_fault_response(status_code, replybody, resp)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 269, in _handle_fault_response
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     exception_handler_v20(status_code, error_body)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 93, in exception_handler_v20
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging     request_ids=request_ids)
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging NotFound: SubPort on trunk 50178d16-7207-48f0-801f-25acddf580fb with parent port 39ad87fa-e476-4cde-9b6e-0b7b0d2302a8 could not be found.
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging Neutron server returns request_ids: ['req-9e971162-e8d7-4f54-8aec-e6f343216428']
2021-09-30 11:37:50.138 1 ERROR kuryr_kubernetes.handlers.logging
2021-09-30 11:37:50.202 1 INFO werkzeug [-] 192.168.99.21 - - [30/Sep/2021 11:37:50] "GET /alive HTTP/1.1" 500 -
2021-09-30 11:37:50.445 1 ERROR kuryr_kubernetes.controller.managers.health [-] Error when processing neutron request 'QuotaDetails' object has no attribute '__getitem__': TypeError: 'QuotaDetails' object has no attribute '__getitem__'
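
The TypeError in the last log line is what Python raises when an object that only exposes its values as attributes is indexed like a dict (Python 2.7 words it as "has no attribute '__getitem__'", Python 3 as "is not subscriptable"). A minimal sketch of the failure and of attribute-based access follows. The QuotaDetails class here is a hypothetical stand-in written for this example, not the real client class, and resource_quota() is an invented helper.

```python
class QuotaDetails(object):
    """Hypothetical stand-in that exposes quota values as attributes only."""

    def __init__(self, ports):
        self.ports = ports


def resource_quota(details, resource):
    """Fetch a resource's quota by attribute instead of by subscript."""
    return getattr(details, resource)


details = QuotaDetails(ports={"limit": 5000, "used": 3000})

try:
    details["ports"]  # subscripting a plain object raises TypeError
except TypeError as exc:
    print("subscript failed: %s" % type(exc).__name__)

# Attribute access works on the same object.
print(resource_quota(details, "ports")["used"])
```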

Comment 5 Itzik Brown 2021-10-05 12:17:57 UTC
I checked with v3.11.524 and the latest Kuryr controller image at that time (06beaa8865808)

Created a network and created 3000 ports on this network.
Deleted the controller pod and checked that a new one is created and ready.

Comment 7 Itzik Brown 2021-10-06 14:39:23 UTC
Verified with v3.11.525

Comment 10 errata-xmlrpc 2021-10-28 15:58:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 3.11.542 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3915

