Bug 1872265 - [Kuryr] KuryrPort handler may cause pod to be removed
Keywords:
Status: VERIFIED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.6
Hardware: Unspecified
OS: Unspecified
unspecified
medium
Target Milestone: ---
: 4.6.0
Assignee: rdobosz
QA Contact: GenadiC
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-08-25 10:25 UTC by Jon Uriarte
Modified: 2020-09-07 16:23 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:


Attachments (Terms of Use)
conformance test result (790.87 KB, application/gzip)
2020-09-07 16:23 UTC, rlobillo
NP test results (189.08 KB, application/gzip)
2020-09-07 16:23 UTC, rlobillo


Links
System ID Priority Status Summary Last Updated
Github openshift kuryr-kubernetes pull 331 None closed Bug 1872265: Remove right finalizer on pod absence. 2020-09-07 13:48:03 UTC
Github openshift kuryr-kubernetes pull 332 None closed Bug 1872265: Guard against manually removing of KuryrPort CRD. 2020-09-07 13:48:02 UTC

Description Jon Uriarte 2020-08-25 10:25:07 UTC
Description of problem:
Some kuryr-cni pods and the kuryr-controller pod end up crashlooping while running the OCP conformance tests. The KuryrPort handler fails in on_finalize when the pod it looks up no longer exists, and the controller then shuts down.

2020-08-25 08:45:00.727 1 ERROR kuryr_kubernetes.controller.handlers.kuryrport [-] Failed to get pod: Resource not found: '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \\"pod-projected-configmaps-19f6cf54-cab1-404d-8f1f-13dd06da72e0\\" not found","reason":"NotFound","details":{"name":"pod-projected-configmaps-19f6cf54-cab1-404d-8f1f-13dd06da72e0","kind":"pods"},"code":404}\n': kuryr_kubernetes.exceptions.K8sResourceNotFound: Resource not found: '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \\"pod-projected-configmaps-19f6cf54-cab1-404d-8f1f-13dd06da72e0\\" not found","reason":"NotFound","details":{"name":"pod-projected-configmaps-19f6cf54-cab1-404d-8f1f-13dd06da72e0","kind":"pods"},"code":404}\n'
2020-08-25 08:45:00.727 1 ERROR kuryr_kubernetes.controller.handlers.kuryrport Traceback (most recent call last):                                                                              
2020-08-25 08:45:00.727 1 ERROR kuryr_kubernetes.controller.handlers.kuryrport   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/kuryrport.py", line 136, in on_finalize                                                                                                                                                                                          
2020-08-25 08:45:00.727 1 ERROR kuryr_kubernetes.controller.handlers.kuryrport     pod = self.k8s.get(f"{constants.K8S_API_NAMESPACES}"                                                        
2020-08-25 08:45:00.727 1 ERROR kuryr_kubernetes.controller.handlers.kuryrport   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 104, in get                      
2020-08-25 08:45:00.727 1 ERROR kuryr_kubernetes.controller.handlers.kuryrport     self._raise_from_response(response)                                                                         
2020-08-25 08:45:00.727 1 ERROR kuryr_kubernetes.controller.handlers.kuryrport   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 83, in _raise_from_response      
2020-08-25 08:45:00.727 1 ERROR kuryr_kubernetes.controller.handlers.kuryrport     raise exc.K8sResourceNotFound(response.text)                                                                
2020-08-25 08:45:00.727 1 ERROR kuryr_kubernetes.controller.handlers.kuryrport kuryr_kubernetes.exceptions.K8sResourceNotFound: Resource not found: '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \\"pod-projected-configmaps-19f6cf54-cab1-404d-8f1f-13dd06da72e0\\" not found","reason":"NotFound","details":{"name":"pod-projected-configmaps-19f6cf54-cab1-404d-8f1f-13dd06da72e0","kind":"pods"},"code":404}\n'
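
The traceback above shows on_finalize raising K8sResourceNotFound when the pod behind a KuryrPort has already been deleted. A minimal sketch of guarding that lookup follows. It is not the code from the linked pull requests: FakeK8sClient, remove_finalizer, KURYRPORT_FINALIZER, the local K8sResourceNotFound class, and the module-level on_finalize function are all hypothetical stand-ins so that the example runs on its own.

```python
class K8sResourceNotFound(Exception):
    """Local stand-in for kuryr_kubernetes.exceptions.K8sResourceNotFound."""


class FakeK8sClient:
    """Hypothetical in-memory client; NOT the real kuryr k8s_client."""

    def __init__(self, resources):
        self.resources = dict(resources)
        self.removed_finalizers = []

    def get(self, path):
        # Mimic a 404 from the API server for unknown paths.
        if path not in self.resources:
            raise K8sResourceNotFound(path)
        return self.resources[path]

    def remove_finalizer(self, obj, finalizer):
        # Record the call instead of patching a real object.
        self.removed_finalizers.append((obj["metadata"]["name"], finalizer))


# Hypothetical finalizer name, chosen only for this illustration.
KURYRPORT_FINALIZER = "example.org/kuryrport-finalizer"


def on_finalize(k8s, kuryrport):
    """Return the pod, or None after releasing the KuryrPort if it is gone."""
    meta = kuryrport["metadata"]
    pod_path = f"/api/v1/namespaces/{meta['namespace']}/pods/{meta['name']}"
    try:
        return k8s.get(pod_path)
    except K8sResourceNotFound:
        # Assumption: with the pod already deleted, dropping the KuryrPort
        # finalizer is enough, and the handler must not propagate the error.
        k8s.remove_finalizer(kuryrport, KURYRPORT_FINALIZER)
        return None


# Usage: the pod is absent, so the lookup is swallowed and the finalizer released.
client = FakeK8sClient({})
kuryrport = {"metadata": {"name": "pod-a", "namespace": "default"}}
result = on_finalize(client, kuryrport)
```

The point of the sketch is only that a missing pod is treated as a normal outcome of finalization rather than as an unhandled exception in the handler.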


2020-08-25 08:02:27.316 1 ERROR kuryr_kubernetes.controller.managers.health [-] Component KuryrPortHandler is dead.                                                                                                
2020-08-25 08:02:27.347 1 INFO kuryr_kubernetes.controller.service [-] Service 'KuryrK8sService' stopping                                                                                                          
2020-08-25 08:02:27.348 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/apis/openstack.org/v1/kuryrnetworkpolicies'
2020-08-25 08:02:27.349 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/api/v1/namespaces'
2020-08-25 08:02:27.351 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/api/v1/services'
2020-08-25 08:02:27.352 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/api/v1/endpoints'
2020-08-25 08:02:27.353 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/apis/networking.k8s.io/v1/networkpolicies'
2020-08-25 08:02:27.354 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/api/v1/pods'
2020-08-25 08:02:27.356 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/apis/openstack.org/v1/kuryrloadbalancers'
2020-08-25 08:02:27.357 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/apis/openstack.org/v1/kuryrnetworks'
2020-08-25 08:02:27.361 1 WARNING urllib3.connectionpool [-] Connection pool is full, discarding connection: api-int.ostest.shiftstack.com: queue.Full                                                             
2020-08-25 08:02:27.364 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/apis/openstack.org/v1/kuryrports'                                                                                                   
2020-08-25 08:02:27.364 1 INFO kuryr_kubernetes.watcher [-] No remaining active watchers, Exiting...
2020-08-25 08:02:27.366 1 INFO kuryr_kubernetes.controller.service [-] Service 'KuryrK8sService' stopping
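
When triaging a controller log like the one above, it can help to pull out which component the health manager reported as dead before the service stopped. The following is a small sketch that matches only the message format seen in this report; the function name dead_components and the regular expression are illustrative and not part of Kuryr.

```python
import re

# Matches the health manager line format seen in this report.
DEAD_RE = re.compile(
    r"ERROR kuryr_kubernetes\.controller\.managers\.health \[-\] "
    r"Component (\S+) is dead\."
)


def dead_components(log_text):
    """Return the component names reported dead, in order of appearance."""
    return [match.group(1) for match in DEAD_RE.finditer(log_text)]


sample_log = (
    "2020-08-25 08:02:27.316 1 ERROR kuryr_kubernetes.controller.managers.health "
    "[-] Component KuryrPortHandler is dead.\n"
    "2020-08-25 08:02:27.347 1 INFO kuryr_kubernetes.controller.service "
    "[-] Service 'KuryrK8sService' stopping\n"
)
found = dead_components(sample_log)
```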


Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-08-24-034934
RHOS-16.1-RHEL-8-20200813.n.0

How reproducible: always when running conformance tests


Steps to Reproduce:
1. Install OCP 4.6 on OSP 16.1
2. Run conformance tests

Actual results: Kuryr controller and cni pods in crashloop

Expected results: no crashloop

Comment 5 rlobillo 2020-09-07 16:22:42 UTC
Verified on 4.6.0-0.nightly-2020-09-05-015624 over RHOS-16.1-RHEL-8-20200831.n.1

OCP installed with IPI; NP and conformance tests ran with the expected results.

kuryr-controller now handles the target scenario successfully:

$ oc logs -n openshift-kuryr kuryr-controller-7b6cdb86dd-wpx2x -p | grep 'Manual'
2020-09-07 13:01:31.196 1 WARNING kuryr_kubernetes.controller.handlers.kuryrport [-] Manually triggered KuryrPort taint-eviction-4 removal. This action should be avoided, since KuryrPort CRDs are internal to Kuryr.

No CrashLoopBackOff observed on either the controller or the cni pods.

Test logs attached.

Comment 6 rlobillo 2020-09-07 16:23:10 UTC
Created attachment 1713987
conformance test result

Comment 7 rlobillo 2020-09-07 16:23:28 UTC
Created attachment 1713988
NP test results

