It may be possible that add_finalizer function first fails due to a Conflict updating the object as it got updated (in this case triggered the deletion) from the Kubernetes side, and then when trying to get the new object this is actually gone as the deletion was completed in between both actions. This leads to the next error: 2020-09-25 14:46:30.738 1 INFO kuryr_kubernetes.controller.drivers.vif_pool [-] Not enough time since the last pool update 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif [-] Failed to add finalizer to pod object: Resource not found: '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","messag e":"pods \\"certified-operators-84cvc\\" not found","reason":"NotFound","details":{"name":"certified-operators-84cvc","kind":"pods"},"code":404}\n': kuryr_kubernetes.exceptions.K8sResourceNotFound: Resource not f ound: '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \\"certified-operators-84cvc\\" not found","reason":"NotFound","details":{"name":"certified-operators-84cvc","kind":"pods "},"code":404}\n' 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif Traceback (most recent call last): 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 269, in add_finalizer 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif self._raise_from_response(response) 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 85, in _raise_from_response 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif raise exc.K8sConflict(response.text) 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif kuryr_kubernetes.exceptions.K8sConflict: Conflict: '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Operatio n cannot be fulfilled on pods \\"certified-operators-84cvc\\": the object has been modified; please apply your changes to the latest version and try again","reason":"Conflict","details":{"name":"certified-operato rs-84cvc","kind":"pods"},"code":409}\n' 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif During handling of the above exception, another exception occurred: 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif Traceback (most recent call last): 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/vif.py", line 80, in on_present 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif if not k8s.add_finalizer(pod, constants.POD_FINALIZER): 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 274, in add_finalizer 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif obj = self.get(path) 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 112, in get 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif self._raise_from_response(response) 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 83, in _raise_from_response 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif raise exc.K8sResourceNotFound(response.text) 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif kuryr_kubernetes.exceptions.K8sResourceNotFound: Resource not found: '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure"," message":"pods \\"certified-operators-84cvc\\" not found","reason":"NotFound","details":{"name":"certified-operators-84cvc","kind":"pods"},"code":404}\n' 2020-09-25 14:46:35.255 1 ERROR kuryr_kubernetes.controller.handlers.vif 2020-09-25 14:46:35.258 1 ERROR kuryr_kubernetes.handlers.retry [-] Report handler unhealthy VIFHandler: kuryr_kubernetes.exceptions.K8sResourceNotFound: Resource not found: '{"kind":"Status","apiVersion":"v1","m etadata":{},"status":"Failure","message":"pods \\"certified-operators-84cvc\\" not found","reason":"NotFound","details":{"name":"certified-operators-84cvc","kind":"pods"},"code":404}\n'
Verified on 4.6.0-0.nightly-2020-09-28-212756 over OSP over RHOS-16.1-RHEL-8-20200917.n.3 with OVN-Octavia. Running below scripts: # Script1.sh for i in `seq 100`; do echo $i; oc --image kuryr/demo run demo sleep 1 oc delete pod/demo & sleep 1 done # Script2.sh for i in `seq 100`; do echo $i; oc --image kuryr/demo run demo oc delete pod/demo & sleep 1 done Does not imply any restart on the kuryr-controller. Confirmed also that the fix is in place: $ oc rsh -n openshift-kuryr kuryr-controller-5c5979849f-ck5qs sed -n '274,278p' /usr/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py try: obj = self.get(path) except exc.K8sResourceNotFound: # Object got removed before finalizer was set return False
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196