Bug 1838985 - [Kuryr] LB sg update not skipped when no endpoint is found
Summary: [Kuryr] LB sg update not skipped when no endpoint is found
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.5.0
Assignee: Maysa Macedo
QA Contact: GenadiC
URL:
Whiteboard:
Depends On:
Blocks: 1839023
Reported: 2020-05-22 09:02 UTC by Maysa Macedo
Modified: 2020-07-13 17:41 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 1839023
Environment:
Last Closed: 2020-07-13 17:41:07 UTC
Target Upstream Version:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift kuryr-kubernetes pull 240 | 0 | None | closed | Bug 1838985: Skip LB sg update when no endpoint is found | 2020-06-23 06:21:52 UTC
Red Hat Product Errata | RHBA-2020:2409 | 0 | None | None | None | 2020-07-13 17:41:27 UTC

Description Maysa Macedo 2020-05-22 09:02:20 UTC
Description of problem:

When a pod event is handled, the pod may affect a Network Policy and a
Service, in which case the LB sg of the selected services needs to be
updated. However, the endpoints of a matched service may not exist yet,
or may already have been deleted. The lookup then raises an unhandled
K8sResourceNotFound exception and the VIFHandler is reported unhealthy:

2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry [-] Report handler unhealthy VIFHandler: kuryr_kubernetes.exceptions.K8sResourceNotFound: Resource not found: '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"endpoints \\"svc-server\\" not found","reason":"NotFound","details":{"name":"svc-server","kind":"endpoints"},"code":404}\n'
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry Traceback (most recent call last):
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/handlers/retry.py", line 78, in __call__
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry self._handler(event)
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/handlers/k8s_base.py", line 84, in __call__
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry self.on_present(obj)
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/vif.py", line 70, in on_present
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry self.on_deleted(pod)
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/vif.py", line 212, in on_deleted
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry self._update_services(services, crd_pod_selectors, project_id)
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/vif.py", line 280, in _update_services
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry self._drv_lbaas.update_lbaas_sg(service, sgs)
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/lbaasv2.py", line 793, in update_lbaas_sg
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry endpoint = k8s.get(endpoints_link)
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 96, in get
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry self._raise_from_response(response)
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 81, in _raise_from_response
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry raise exc.K8sResourceNotFound(response.text)
2020-04-30 15:03:46.792 1 ERROR kuryr_kubernetes.handlers.retry kuryr_kubernetes.exceptions.K8sResourceNotFound: Resource not found: '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"endpoints \\"svc-server\\" not found","reason":"NotFound","details":{"name":"svc-server","kind":"endpoints"},"code":404}\n'
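
For illustration only, the behaviour requested by this bug (skip the LB sg update when the endpoints lookup returns NotFound) could look roughly like the sketch below. This is not the actual kuryr-kubernetes patch: the exception class, the FakeK8sClient helper, the endpoints path format and the body of update_lbaas_sg are all stand-ins assumed for this example.

```python
import logging

LOG = logging.getLogger(__name__)


class K8sResourceNotFound(Exception):
    """Stand-in for kuryr_kubernetes.exceptions.K8sResourceNotFound."""


class FakeK8sClient:
    """Hypothetical in-memory client: get() raises when the path is unknown."""

    def __init__(self, resources):
        self._resources = resources

    def get(self, path):
        try:
            return self._resources[path]
        except KeyError:
            raise K8sResourceNotFound(path) from None


def update_lbaas_sg(k8s, service, sgs):
    """Return True if the SG update proceeded, False if it was skipped."""
    meta = service['metadata']
    # Assumed path layout, used only to key the fake client.
    endpoints_link = '/api/v1/namespaces/%s/endpoints/%s' % (
        meta['namespace'], meta['name'])
    try:
        endpoint = k8s.get(endpoints_link)
    except K8sResourceNotFound:
        # Endpoints not created yet or already deleted: nothing to update.
        LOG.debug('Endpoints not found. Skipping LB SG update for %s/%s',
                  meta['namespace'], meta['name'])
        return False
    # Placeholder for the real load balancer security group update.
    endpoint['applied_sgs'] = list(sgs)
    return True
```

With this shape, a pod event that races with endpoints creation or deletion produces a debug log line instead of an exception that propagates up to the retry handler.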

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Run the Network Policy tests with a parallelism of 3 on environments with Amphoras deployed

Actual results:


Expected results:


Additional info:

Comment 3 rlobillo 2020-06-02 08:51:51 UTC
Verified on OSP13 - 2020-05-19.2 and 4.5.0-0.nightly-2020-05-29-005153

NP tests were run with parallelism set to 3. The controller was not restarted and no backtrace was observed.

Furthermore, manual reproduction of the issue was performed to confirm stability:

1. Enable debug logs:

oc scale deployment -n openshift-cluster-version cluster-version-operator --replicas 0
oc edit cm kuryr-config -n openshift-kuryr

2. Create pod, service and network policy.

oc run server --image=kuryr/demo
oc expose pod/server --port 80
oc apply -f np.yml

where np.yml contains:

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
 name: api-allow
spec:
 podSelector:
   matchLabels:
     run: server
 ingress:
 - from:
     - podSelector:
         matchLabels:
           run: client

3. Delete the endpoints of the created svc: oc delete endpoints server
4. Create another pod, which triggers the LB sg update, and check the controller logs while it is created:

oc run client --image=kuryr/demo

oc logs -n openshift-kuryr kuryr-controller-5d8dc4f6c5-t82vk -f | grep 'Endpoint not Found. Skipping LB SG update for'

Result: 

- No controller crashes.
- DEBUG log line is written:

[stack@undercloud-0 ~]$ oc logs -n openshift-kuryr kuryr-controller-5d8dc4f6c5-t82vk -f | grep 'Endpoint not Found. Skipping LB SG update for'
2020-06-01 10:34:31.931 1 DEBUG kuryr_kubernetes.controller.drivers.lbaasv2 [-] Endpoint not Found. Skipping LB SG update fortest/server as the LB resources are not present update_lbaas_sg /usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/lbaasv2.py:808
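
The log check above can also be scripted. The following is a minimal sketch, assuming the controller log has already been captured as text (for example from the oc logs command shown above); the function name and the returned dictionary layout are made up for this example. The pattern tolerates the missing space after "for" that appears in the log line above.

```python
import re

# Matches the skip message and captures the namespace/name of the service.
SKIP_RE = re.compile(
    r"Endpoint not Found\. Skipping LB SG update for\s*([^/\s]+)/(\S+)")


def summarize_controller_log(log_text):
    """Report skipped LB SG updates and whether a NotFound traceback appears."""
    skipped = [match.groups() for match in SKIP_RE.finditer(log_text)]
    return {
        "skipped": skipped,
        "not_found_traceback": "K8sResourceNotFound" in log_text,
    }
```

A verified run is expected to list the service under "skipped" and to report no K8sResourceNotFound traceback.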

Comment 4 errata-xmlrpc 2020-07-13 17:41:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

