Bug 1874439 - [Kuryr] kuryr-controller restarts due to attempt to change namespaces that already terminated
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.6.0
Assignee: Michał Dulko
QA Contact: GenadiC
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-09-01 11:11 UTC by Jon Uriarte
Modified: 2020-10-27 16:36 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 16:36:25 UTC
Target Upstream Version:
Embargoed:


Attachments
Kuryr controller logs (48.39 KB, text/plain)
2020-09-01 11:11 UTC, Jon Uriarte


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift kuryr-kubernetes pull 335 | 0 | None | closed | Bug 1874439: Ignore CRD creation errors when ns is terminated | 2020-09-25 07:56:53 UTC
Red Hat Product Errata | RHBA-2020:4196 | 0 | None | None | None | 2020-10-27 16:36:47 UTC

Description Jon Uriarte 2020-09-01 11:11:55 UTC
Created attachment 1713283
Kuryr controller logs

Description of problem:

Consecutive kuryr_kubernetes.exceptions.K8sNamespaceTerminating: Forbidden: 'Namespace already terminated' exceptions during conformance tests make the kuryr-controller pod restart.

2020-09-01 10:27:24.343 1 ERROR kuryr_kubernetes.controller.handlers.lbaas [-] Kubernetes Client Exception creating kuryrloadbalancer CRD. <class 'kuryr_kubernetes.exceptions.K8sClientException'>: kuryr_kubernetes.exceptions.K8sNamespaceTerminating: Forbidden: 'Namespace already terminated: \'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"kuryrloadbalancers.openstack.org \\\\"test-service\\\\" is forbidden: unable to create new content in namespace e2e-nsdeletetest-6977 because it is being terminated","reason":"Forbidden","details":{"name":"test-service","group":"openstack.org","kind":"kuryrloadbalancers","causes":[{"reason":"NamespaceTerminating","message":"namespace e2e-nsdeletetest-6977 is being terminated","field":"metadata.namespace"}]},"code":403}\\n\''
2020-09-01 10:27:24.343 1 ERROR kuryr_kubernetes.controller.handlers.lbaas Traceback (most recent call last):
2020-09-01 10:27:24.343 1 ERROR kuryr_kubernetes.controller.handlers.lbaas   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/lbaas.py", line 171, in create_crd_spec
2020-09-01 10:27:24.343 1 ERROR kuryr_kubernetes.controller.handlers.lbaas     loadbalancer_crd)
2020-09-01 10:27:24.343 1 ERROR kuryr_kubernetes.controller.handlers.lbaas   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 210, in post
2020-09-01 10:27:24.343 1 ERROR kuryr_kubernetes.controller.handlers.lbaas     self._raise_from_response(response)
2020-09-01 10:27:24.343 1 ERROR kuryr_kubernetes.controller.handlers.lbaas   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 88, in _raise_from_response
2020-09-01 10:27:24.343 1 ERROR kuryr_kubernetes.controller.handlers.lbaas     raise exc.K8sNamespaceTerminating(response.text)
2020-09-01 10:27:24.343 1 ERROR kuryr_kubernetes.controller.handlers.lbaas kuryr_kubernetes.exceptions.K8sNamespaceTerminating: Forbidden: 'Namespace already terminated: \'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"kuryrloadbalancers.openstack.org \\\\"test-service\\\\" is forbidden: unable to create new content in namespace e2e-nsdeletetest-6977 because it is being terminated","reason":"Forbidden","details":{"name":"test-service","group":"openstack.org","kind":"kuryrloadbalancers","causes":[{"reason":"NamespaceTerminating","message":"namespace e2e-nsdeletetest-6977 is being terminated","field":"metadata.namespace"}]},"code":403}\\n\''
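
The 403 body in the traceback above is a Kubernetes Status object whose details.causes list carries the reason "NamespaceTerminating". As a rough sketch of how a client could tell this case apart from any other Forbidden response, assuming only that body shape: the function name is_namespace_terminating is made up for illustration and is not the check used in kuryr_kubernetes/k8s_client.py.

```python
import json


def is_namespace_terminating(status_code, body):
    """Return True if a 403 response body reports a NamespaceTerminating cause.

    Hypothetical helper: it relies only on the Status layout visible in the log.
    """
    if status_code != 403:
        return False
    try:
        status = json.loads(body)
    except ValueError:
        # Body is not JSON, so there are no structured causes to inspect.
        return False
    if not isinstance(status, dict):
        return False
    causes = (status.get("details") or {}).get("causes") or []
    return any(
        isinstance(cause, dict) and cause.get("reason") == "NamespaceTerminating"
        for cause in causes
    )


# Sample body reduced from the log entry above.
sample_body = json.dumps({
    "kind": "Status",
    "apiVersion": "v1",
    "status": "Failure",
    "reason": "Forbidden",
    "details": {
        "name": "test-service",
        "group": "openstack.org",
        "kind": "kuryrloadbalancers",
        "causes": [{
            "reason": "NamespaceTerminating",
            "message": "namespace e2e-nsdeletetest-6977 is being terminated",
            "field": "metadata.namespace",
        }],
    },
    "code": 403,
})

print(is_namespace_terminating(403, sample_body))
```

Checking the structured cause rather than matching on the message text keeps the check independent of the namespace and resource names in the message.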


Version-Release number of selected component (if applicable):

4.6.0-0.nightly-2020-08-31-220837
RHOS-16.1-RHEL-8-20200821.n.0


How reproducible: Always, when running conformance tests


Steps to Reproduce:
1. Install 4.6 on OSP 16.1 with OVN
2. Run conformance tests

Actual results:
The kuryr_kubernetes.exceptions.K8sNamespaceTerminating exception is raised repeatedly and kuryr-controller restarts.


Expected results: no kuryr-controller restarts due to that exception


Additional info:

Conformance test results: error: 44 fail, 257 pass, 1 skip (1h31m10s)

$ oc -n openshift-kuryr get pods
NAME                                READY   STATUS    RESTARTS   AGE
kuryr-cni-5nj6l                     1/1     Running   8          3h58m
kuryr-cni-7xp7x                     1/1     Running   8          3h57m
kuryr-cni-9stvw                     1/1     Running   0          4h20m
kuryr-cni-mtxd4                     1/1     Running   0          4h20m
kuryr-cni-plcxk                     1/1     Running   4          3h58m
kuryr-cni-rv5ff                     1/1     Running   0          4h20m
kuryr-controller-66d4854f56-td7cc   1/1     Running   14         4h20m

Comment 3 rlobillo 2020-09-07 16:04:26 UTC
Verified on 4.6.0-0.nightly-2020-09-05-015624 over RHOS-16.1-RHEL-8-20200831.n.1

OCP was installed with IPI, and the NP and conformance tests ran with the expected results.

The error still occurs but is now caught, so no kuryr-controller restarts are observed due to the 'Namespace already terminated' error:

2020-09-07 13:05:12.764 1 ERROR kuryr_kubernetes.controller.handlers.lbaas [-] Kubernetes Client Exception creating kuryrloadbalancer CRD. <class 'kuryr_kubernetes.exceptions.K8sClientException'>: kuryr_kubernetes.exceptions.K8sNamespaceTerminating: Forbidden: 'Namespace already terminated: \'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"kuryrloadbalancers.openstack.org \\\\"latency-svc-tcrdr\\\\" is forbidden: unable to create new content in namespace e2e-svc-latency-4916 because it is being terminated","reason":"Forbidden","details":{"name":"latency-svc-tcrdr","group":"openstack.org","kind":"kuryrloadbalancers","causes":[{"reason":"NamespaceTerminating","message":"namespace e2e-svc-latency-4916 is being terminated","field":"metadata.namespace"}]},"code":403}\\n\''
2020-09-07 13:05:12.764 1 ERROR kuryr_kubernetes.controller.handlers.lbaas Traceback (most recent call last):
2020-09-07 13:05:12.764 1 ERROR kuryr_kubernetes.controller.handlers.lbaas   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/lbaas.py", line 353, in _create_crd_spec
2020-09-07 13:05:12.764 1 ERROR kuryr_kubernetes.controller.handlers.lbaas     k_const.K8S_API_CRD_NAMESPACES, namespace), loadbalancer_crd)
2020-09-07 13:05:12.764 1 ERROR kuryr_kubernetes.controller.handlers.lbaas   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 210, in post
2020-09-07 13:05:12.764 1 ERROR kuryr_kubernetes.controller.handlers.lbaas     self._raise_from_response(response)
2020-09-07 13:05:12.764 1 ERROR kuryr_kubernetes.controller.handlers.lbaas   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 88, in _raise_from_response
2020-09-07 13:05:12.764 1 ERROR kuryr_kubernetes.controller.handlers.lbaas     raise exc.K8sNamespaceTerminating(response.text)
2020-09-07 13:05:12.764 1 ERROR kuryr_kubernetes.controller.handlers.lbaas kuryr_kubernetes.exceptions.K8sNamespaceTerminating: Forbidden: 'Namespace already terminated: \'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"kuryrloadbalancers.openstack.org \\\\"latency-svc-tcrdr\\\\" is forbidden: unable to create new content in namespace e2e-svc-latency-4916 because it is being terminated","reason":"Forbidden","details":{"name":"latency-svc-tcrdr","group":"openstack.org","kind":"kuryrloadbalancers","causes":[{"reason":"NamespaceTerminating","message":"namespace e2e-svc-latency-4916 is being terminated","field":"metadata.namespace"}]},"code":403}\\n\''
2020-09-07 13:05:12.764 1 ERROR kuryr_kubernetes.controller.handlers.lbaas
2020-09-07 13:05:12.765 1 WARNING kuryr_kubernetes.controller.handlers.lbaas [-] Namespace e2e-svc-latency-4916 is being terminated, ignoring Endpoints latency-svc-tcrdr in that namespace.: kuryr_kubernetes.exceptions.K8sNamespaceTerminating: Forbidden: 'Namespace already terminated: \'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"kuryrloadbalancers.openstack.org \\\\"latency-svc-tcrdr\\\\" is forbidden: unable to create new content in namespace e2e-svc-latency-4916 because it is being terminated","reason":"Forbidden","details":{"name":"latency-svc-tcrdr","group":"openstack.org","kind":"kuryrloadbalancers","causes":[{"reason":"NamespaceTerminating","message":"namespace e2e-svc-latency-4916 is being terminated","field":"metadata.namespace"}]},"code":403}\\n\''

A rework of the logging, so that only the warning message is shown, is mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1860030.

As kuryr-controller no longer restarts due to this error and the functionality is OK, this BZ is verified.
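
For readers who want the shape of the behaviour verified here, the following is a minimal sketch of the pattern, not the code from the linked pull request. The exception classes are local stand-ins named after the ones in the log (that K8sNamespaceTerminating derives from K8sClientException is an assumption based on the log line), and create_crd_or_skip and its post argument are hypothetical.

```python
import logging

LOG = logging.getLogger(__name__)


# Local stand-ins for illustration; the real classes live in
# kuryr_kubernetes.exceptions and are not imported here.
class K8sClientException(Exception):
    pass


class K8sNamespaceTerminating(K8sClientException):
    pass


def create_crd_or_skip(post, path, crd, namespace, name):
    """Call post(path, crd); return False if the namespace is going away.

    Hypothetical helper: any other client error still propagates.
    """
    try:
        post(path, crd)
    except K8sNamespaceTerminating:
        # Nothing can be created in a terminating namespace, so retrying
        # is pointless; log a warning and drop the event.
        LOG.warning("Namespace %s is being terminated, ignoring Endpoints "
                    "%s in that namespace.", namespace, name)
        return False
    return True


def failing_post(path, crd):
    """Fake client call that always reports a terminating namespace."""
    raise K8sNamespaceTerminating("Namespace already terminated")


created = create_crd_or_skip(failing_post, "/hypothetical/path", {},
                             "e2e-svc-latency-4916", "latency-svc-tcrdr")
print(created)
```

Catching only the namespace-terminating case keeps other client errors visible, so genuine failures are not hidden by this handling.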

Comment 5 errata-xmlrpc 2020-10-27 16:36:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

