Bug 2094816 - Kuryr controller restarts when over quota
Summary: Kuryr controller restarts when over quota
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: Michał Dulko
QA Contact: Itzik Brown
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-06-08 10:59 UTC by Itzik Brown
Modified: 2022-08-10 11:17 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-10 11:16:53 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift kuryr-kubernetes pull 675 | 0 | None | open | Bug 2094816: Do not crash on Neutron quota exceptions | 2022-06-22 14:32:14 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 11:17:06 UTC

Description Itzik Brown 2022-06-08 10:59:00 UTC
Description of problem:
When a request is made and Neutron returns an over-quota error, the Kuryr controller restarts.

From the controller log:
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health [-] Component KuryrPortHandler is dead. Last caught exception below: openstack.exceptions.SDKException: Error when bulk creating ports: {"NeutronError": {"type": "OverQuota", "message": "Quota exceeded for resources: ['port'].", "detail": ""}}
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health Traceback (most recent call last):
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 242, in _get_port_from_pool
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health     port_id = pool_ports[security_groups].pop()
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health KeyError: ('e1edfb10-83af-4719-ae0b-6b6871d5fd8a',)
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health During handling of the above exception, another exception occurred:
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health Traceback (most recent call last):
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 218, in request_vif
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health     tuple(sorted(security_groups)))
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 255, in _get_port_from_pool
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health     raise exceptions.ResourceNotReady(pod)
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health kuryr_kubernetes.exceptions.ResourceNotReady: Resource not ready: 'Pod e2e-cronjob-9131/failed-jobs-history-limit-27577989-jn6sd'
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health During handling of the above exception, another exception occurred:
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health Traceback (most recent call last):
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/handlers/logging.py", line 38, in __call__
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health     self._handler(event, *args, **kwargs)
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/handlers/retry.py", line 85, in __call__
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health     self._handler(event, *args, retry_info=info, **kwargs)
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/handlers/k8s_base.py", line 90, in __call__
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health     self.on_present(obj, *args, **kwargs)
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/kuryrport.py", line 70, in on_present
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health     if not self.get_vifs(kuryrport_crd):
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/kuryrport.py", line 263, in get_vifs
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health     security_groups)
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 1222, in request_vif
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health     pod, project_id, subnets, security_groups)
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 222, in request_vif
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health     tuple(sorted(security_groups))):
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 297, in _populate_pool
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health     semaphore=self._create_ports_semaphore)
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/nested_vlan_vif.py", line 93, in request_vifs
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health     ports = list(os_net.create_ports(bulk_port_rq))
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/clients.py", line 104, in _create_ports
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health     response.text)
2022-06-08 09:14:42.306 1 ERROR kuryr_kubernetes.controller.managers.health openstack.exceptions.SDKException: Error when bulk creating ports: {"NeutronError": {"type": "OverQuota", "message": "Quota exceeded for resources: ['port'].", "detail": ""}}
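
In the traceback above, the OverQuota error raised during bulk port creation propagates through every handler, and the health manager then reports KuryrPortHandler as dead. The following is a minimal sketch of one way such an error could be treated as retryable instead. It is not the code from the linked pull request. `BulkCreateError`, `ResourceNotReady`, `is_over_quota`, and `populate_pool` are hypothetical stand-ins defined here for illustration, not the openstacksdk or kuryr_kubernetes classes.

```python
import json


class BulkCreateError(Exception):
    """Hypothetical stand-in for openstack.exceptions.SDKException."""


class ResourceNotReady(Exception):
    """Hypothetical stand-in for a retryable 'try again later' exception."""


# Message prefix as it appears in the controller log above.
PREFIX = "Error when bulk creating ports: "


def is_over_quota(exc):
    """Return True if the message carries a NeutronError of type OverQuota."""
    text = str(exc)
    if text.startswith(PREFIX):
        text = text[len(PREFIX):]
    try:
        body = json.loads(text)
    except ValueError:
        # The message is not a JSON body, so it is not a Neutron error payload.
        return False
    if not isinstance(body, dict):
        return False
    error = body.get("NeutronError")
    return isinstance(error, dict) and error.get("type") == "OverQuota"


def populate_pool(create_ports, pod_name):
    """Call create_ports(); turn an over-quota failure into a retryable one."""
    try:
        return create_ports()
    except BulkCreateError as exc:
        if is_over_quota(exc):
            # Assumption: the caller retries on ResourceNotReady instead of
            # letting the exception mark the handler as dead.
            raise ResourceNotReady(pod_name) from exc
        raise
```

Any other failure is re-raised unchanged, so only the quota case changes behavior in this sketch.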

Version-Release number of selected component (if applicable):
4.11.0-0.nightly-2022-06-06-025509


Comment 4 Itzik Brown 2022-06-28 08:48:04 UTC
Verified with:
OCP 4.11.0-0.nightly-2022-06-25-132614
OSP RHOS-16.2-RHEL-8-20220610.n.1

1. Set the networks quota to 101 (as admin):
$ openstack quota set --networks 101 <project-id>
2. Created networks until a "Quota exceeded" error was returned.
3. Created a project and a deployment:
$ oc new-project demo1
$ oc create deployment --image quay.io/kuryr/demo demo

4. Checked that a "Quota exceeded" error appears in the Kuryr controller log:
$ oc logs kuryr-controller-647d9cdbd4-xtvqj -n openshift-kuryr

2022-06-28 08:31:43.629 1 ERROR kuryr_kubernetes.controller.drivers.namespace_subnet openstack.exceptions.ConflictException: ConflictException: 409: Client Error for url: ... Quota exceeded for resources: ['network'].

5. Checked that the Kuryr controller was not restarted.
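
The last verification step can also be checked from the pod's status rather than by eye. This is a minimal sketch, assuming the pod object has been fetched as JSON (for example with `oc get pod <pod-name> -n openshift-kuryr -o json`). It sums the `restartCount` values under `status.containerStatuses`. The helper name `total_restarts`, the sample document, and the container name in it are illustrative and not taken from this report.

```python
import json


def total_restarts(pod):
    """Sum restartCount over all container statuses of a Pod object."""
    statuses = pod.get("status", {}).get("containerStatuses") or []
    return sum(status.get("restartCount", 0) for status in statuses)


# Sample trimmed to the fields used; the container name is illustrative.
sample = json.loads("""
{
  "status": {
    "containerStatuses": [
      {"name": "controller", "restartCount": 0}
    ]
  }
}
""")

print(total_restarts(sample))
```

A result of zero after the quota error shows up in the log is consistent with the controller not having restarted.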

Comment 5 errata-xmlrpc 2022-08-10 11:16:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069
