Bug 1839023 - [Kuryr] LB sg update not skipped when no endpoint is found
Summary: [Kuryr] LB sg update not skipped when no endpoint is found
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.4.z
Assignee: Maysa Macedo
QA Contact: GenadiC
URL:
Whiteboard:
Depends On: 1838985
Blocks: 1846228
Reported: 2020-05-22 10:54 UTC by Maysa Macedo
Modified: 2020-06-23 00:57 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of: 1838985
Environment:
Last Closed: 2020-06-23 00:57:26 UTC
Target Upstream Version:




Links
System | ID | Priority | Status | Summary | Last Updated
Github | openshift kuryr-kubernetes pull 242 | None | closed | [release-4.4] Bug 1839023:Skip LB sg update when no endpoint is found | 2020-06-22 08:48:28 UTC
Red Hat Product Errata | RHBA-2020:2580 | None | None | None | 2020-06-23 00:57:45 UTC

Description Maysa Macedo 2020-05-22 10:54:33 UTC
+++ This bug was initially created as a clone of Bug #1838985 +++

Description of problem:

When a pod event is handled, the pod may affect a Network Policy and a
Service, in which case the LB security group (sg) of the selected services
needs to be updated. However, the endpoints for the matched service may not
be present yet, or may already have been deleted, which results in a
NotFound exception and the VIFHandler being reported as unhealthy.

2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry [-] Report handler unhealthy VIFHandler: kuryr_kubernetes.exceptions.K8sResourceNotFound: Resource not found: '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"endpoints \\"svc-server\\" not found","reason":"NotFound","details":{"name":"svc-server","kind":"endpoints"},"code":404}\n'
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry Traceback (most recent call last):
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/handlers/retry.py", line 78, in __call__
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry self._handler(event)
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/handlers/k8s_base.py", line 84, in __call__
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry self.on_present(obj)
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/vif.py", line 70, in on_present
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry self.on_deleted(pod)
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/vif.py", line 212, in on_deleted
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry self._update_services(services, crd_pod_selectors, project_id)
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/vif.py", line 280, in _update_services
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry self._drv_lbaas.update_lbaas_sg(service, sgs)
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/lbaasv2.py", line 793, in update_lbaas_sg
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry endpoint = k8s.get(endpoints_link)
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 96, in get
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry self._raise_from_response(response)
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 81, in _raise_from_response
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry raise exc.K8sResourceNotFound(response.text)
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry kuryr_kubernetes.exceptions.K8sResourceNotFound: Resource not found: '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"endpoints \\"svc-server\\" not found","reason":"NotFound","details":{"name":"svc-server","kind":"endpoints"},"code":404}\n'
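
The behaviour requested in the summary (skip the LB sg update when the endpoints object is missing) could look roughly like the sketch below. This is not the actual kuryr-kubernetes patch: K8sResourceNotFound here is a local stand-in for the exception in the traceback, FakeK8sClient is a hypothetical in-memory client, and the boolean return value is added only so the sketch can be checked. The endpoints path follows the usual Kubernetes core API layout.

```python
import logging

LOG = logging.getLogger(__name__)


class K8sResourceNotFound(Exception):
    """Local stand-in for kuryr_kubernetes.exceptions.K8sResourceNotFound."""


class FakeK8sClient:
    """Hypothetical in-memory client; get() raises when the path is unknown."""

    def __init__(self, resources):
        self._resources = resources

    def get(self, path):
        try:
            return self._resources[path]
        except KeyError:
            raise K8sResourceNotFound(path) from None


def update_lbaas_sg(k8s, service, sgs):
    """Return True if the sg update would proceed, False if it was skipped."""
    meta = service["metadata"]
    endpoints_link = "/api/v1/namespaces/%s/endpoints/%s" % (
        meta["namespace"], meta["name"])
    try:
        endpoint = k8s.get(endpoints_link)
    except K8sResourceNotFound:
        # Missing endpoints are treated as "nothing to update" rather than
        # as a handler failure, so the exception does not propagate.
        LOG.debug("Endpoint not found. Skipping LB SG update for %s/%s",
                  meta["namespace"], meta["name"])
        return False
    # A real driver would update the load balancer security groups here.
    LOG.debug("Would apply %s to the LB of %s",
              sgs, endpoint["metadata"]["name"])
    return True
```

With an empty FakeK8sClient the call returns False instead of raising, which is the point of the sketch: the pod handler is no longer marked unhealthy just because the endpoints object is absent.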

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Run the Network Policy tests with a parallelism of 3 on environments with Amphorae deployed

Actual results:


Expected results:


Additional info:

Comment 3 rlobillo 2020-06-15 15:58:25 UTC
Verified on OCP4.4.0-0.nightly-2020-06-14-142924 on OSP13 (2020-06-09.2) with OVS.

NP tests were run with parallelism set to 3. The controller was not restarted and the backtrace was not observed.

Manual testing: after enabling DEBUG logs on kuryr-controller, run:

$ oc new-project test
$ oc run server --image=kuryr/demo
$ oc expose pod/server-1-6zbgh --port 80
$ oc apply -f np.yaml

  where:
  $ cat np.yaml 
  kind: NetworkPolicy
  apiVersion: networking.k8s.io/v1
  metadata:
   name: api-allow
  spec:
   podSelector:
     matchLabels:
       run: server
   ingress:
   - from:
       - podSelector:
           matchLabels:
             run: client

$ oc delete endpoints server-1-6zbgh
$ oc run client --image=kuryr/demo

Condition is hit:

[stack@undercloud-0 ~]$ oc logs -n openshift-kuryr kuryr-controller-78494d6fdd-s58g6 -f | grep 'Endpoint not Found. Skipping LB SG update for'

2020-06-15 15:51:14.299 1 DEBUG kuryr_kubernetes.controller.drivers.lbaasv2 [-] Endpoint not Found. Skipping LB SG update fortest/server-1-6zbgh as the LB resources are not present update_lbaas_sg /usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/lbaasv2.py:1051

No restart observed:

$ oc get pods -n openshift-kuryr
NAME                                   READY   STATUS    RESTARTS   AGE
kuryr-cni-2ms8v                        1/1     Running   0          3h9m
kuryr-cni-4nwmb                        1/1     Running   0          3h9m
kuryr-cni-fq8mm                        1/1     Running   0          3h38m
kuryr-cni-h8q2p                        1/1     Running   0          3h9m
kuryr-cni-kd24d                        1/1     Running   0          3h38m
kuryr-cni-lzphw                        1/1     Running   0          3h38m
kuryr-controller-78494d6fdd-s58g6      1/1     Running   0          11m
kuryr-dns-admission-controller-5fgwh   1/1     Running   0          3h38m
kuryr-dns-admission-controller-fdbts   1/1     Running   0          3h38m
kuryr-dns-admission-controller-p26rq   1/1     Running   0          3h38m
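
The "no restart" check above can also be done mechanically. The helper below is only a sketch; pods_with_restarts is a hypothetical name and it assumes the whitespace-separated table layout shown in the `oc get pods` output above, with an integer as the first token of the RESTARTS column.

```python
def pods_with_restarts(table_text):
    """Return {pod_name: restarts} for pods whose RESTARTS count is non-zero."""
    lines = [ln for ln in table_text.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    name_idx = header.index("NAME")
    restarts_idx = header.index("RESTARTS")
    restarted = {}
    for line in lines[1:]:
        cols = line.split()
        count = int(cols[restarts_idx])
        if count > 0:
            restarted[cols[name_idx]] = count
    return restarted
```

Feeding it the saved text of `oc get pods -n openshift-kuryr` and getting an empty dict back corresponds to the result reported in this comment.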

Comment 5 errata-xmlrpc 2020-06-23 00:57:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2580

