Bug 1872265 - [Kuryr] KuryrPort handler may cause pod to be removed
Summary: [Kuryr] KuryrPort handler may cause pod to be removed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.6.0
Assignee: rdobosz
QA Contact: GenadiC
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-08-25 10:25 UTC by Jon Uriarte
Modified: 2020-10-27 16:33 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 16:33:08 UTC
Target Upstream Version:
Embargoed:


Attachments
conformance test result (790.87 KB, application/gzip)
2020-09-07 16:23 UTC, rlobillo
NP test results (189.08 KB, application/gzip)
2020-09-07 16:23 UTC, rlobillo


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift kuryr-kubernetes pull 331 | 0 | None | closed | Bug 1872265: Remove right finalizer on pod absence. | 2021-01-05 22:10:14 UTC
Github | openshift kuryr-kubernetes pull 332 | 0 | None | closed | Bug 1872265: Guard against manually removing of KuryrPort CRD. | 2021-01-05 22:10:14 UTC
Red Hat Product Errata | RHBA-2020:4196 | 0 | None | None | None | 2020-10-27 16:33:10 UTC

Description Jon Uriarte 2020-08-25 10:25:07 UTC
Description of problem:
Some kuryr cni pods and the kuryr controller pod end up crashlooping while running the OCP conformance tests, because some pods cannot be found.

2020-08-25 08:45:00.727 1 ERROR kuryr_kubernetes.controller.handlers.kuryrport [-] Failed to get pod: Resource not found: '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \\"pod-projected-configmaps-19f6cf54-cab1-404d-8f1f-13dd06da72e0\\" not found","reason":"NotFound","details":{"name":"pod-projected-configmaps-19f6cf54-cab1-404d-8f1f-13dd06da72e0","kind":"pods"},"code":404}\n': kuryr_kubernetes.exceptions.K8sResourceNotFound: Resource not found: '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \\"pod-projected-configmaps-19f6cf54-cab1-404d-8f1f-13dd06da72e0\\" not found","reason":"NotFound","details":{"name":"pod-projected-configmaps-19f6cf54-cab1-404d-8f1f-13dd06da72e0","kind":"pods"},"code":404}\n'
2020-08-25 08:45:00.727 1 ERROR kuryr_kubernetes.controller.handlers.kuryrport Traceback (most recent call last):                                                                              
2020-08-25 08:45:00.727 1 ERROR kuryr_kubernetes.controller.handlers.kuryrport   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/kuryrport.py", line 136, in on_finalize                                                                                                                                                                                          
2020-08-25 08:45:00.727 1 ERROR kuryr_kubernetes.controller.handlers.kuryrport     pod = self.k8s.get(f"{constants.K8S_API_NAMESPACES}"                                                        
2020-08-25 08:45:00.727 1 ERROR kuryr_kubernetes.controller.handlers.kuryrport   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 104, in get                      
2020-08-25 08:45:00.727 1 ERROR kuryr_kubernetes.controller.handlers.kuryrport     self._raise_from_response(response)                                                                         
2020-08-25 08:45:00.727 1 ERROR kuryr_kubernetes.controller.handlers.kuryrport   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 83, in _raise_from_response      
2020-08-25 08:45:00.727 1 ERROR kuryr_kubernetes.controller.handlers.kuryrport     raise exc.K8sResourceNotFound(response.text)                                                                
2020-08-25 08:45:00.727 1 ERROR kuryr_kubernetes.controller.handlers.kuryrport kuryr_kubernetes.exceptions.K8sResourceNotFound: Resource not found: '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \\"pod-projected-configmaps-19f6cf54-cab1-404d-8f1f-13dd06da72e0\\" not found","reason":"NotFound","details":{"name":"pod-projected-configmaps-19f6cf54-cab1-404d-8f1f-13dd06da72e0","kind":"pods"},"code":404}\n'
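For readers following the traceback: the failure is an unhandled not-found error raised while the KuryrPort finalize handler looks up the owning pod. Below is a minimal Python sketch of the kind of guard the linked PR titles suggest (dropping the finalizer when the pod is already gone). It is not the actual kuryr-kubernetes code: FakeK8sClient, ResourceNotFound, remove_finalizer, the finalizer string, and the return values are all hypothetical stand-ins chosen for illustration.

```python
class ResourceNotFound(Exception):
    """Hypothetical stand-in for kuryr_kubernetes.exceptions.K8sResourceNotFound."""


# Hypothetical finalizer name, for illustration only.
KURYRPORT_FINALIZER = "example.org/kuryrport-finalizer"


class FakeK8sClient:
    """In-memory client used only for this sketch; not the real k8s_client."""

    def __init__(self, resources):
        self.resources = dict(resources)
        self.removed_finalizers = []

    def get(self, path):
        # Mimic a 404 from the API server by raising on unknown paths.
        if path not in self.resources:
            raise ResourceNotFound(path)
        return self.resources[path]

    def remove_finalizer(self, obj, finalizer):
        # Record the call so the behaviour can be inspected.
        self.removed_finalizers.append((obj["metadata"]["name"], finalizer))


def on_finalize(k8s, kuryrport):
    """Finalize a KuryrPort without failing when its pod no longer exists."""
    meta = kuryrport["metadata"]
    pod_path = f"/api/v1/namespaces/{meta['namespace']}/pods/{meta['name']}"
    try:
        k8s.get(pod_path)
    except ResourceNotFound:
        # Pod already deleted: drop the KuryrPort finalizer and return
        # instead of letting the exception propagate out of the handler.
        k8s.remove_finalizer(kuryrport, KURYRPORT_FINALIZER)
        return "pod-missing"
    # Pod still exists: the real handler would release ports here
    # (omitted in this sketch) before removing the finalizer.
    k8s.remove_finalizer(kuryrport, KURYRPORT_FINALIZER)
    return "released"
```

With an empty FakeK8sClient, on_finalize returns "pod-missing" and records one finalizer removal rather than raising, which is the behaviour that would keep the handler from crashing in the scenario logged above.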


2020-08-25 08:02:27.316 1 ERROR kuryr_kubernetes.controller.managers.health [-] Component KuryrPortHandler is dead.
2020-08-25 08:02:27.347 1 INFO kuryr_kubernetes.controller.service [-] Service 'KuryrK8sService' stopping
2020-08-25 08:02:27.348 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/apis/openstack.org/v1/kuryrnetworkpolicies'
2020-08-25 08:02:27.349 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/api/v1/namespaces'
2020-08-25 08:02:27.351 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/api/v1/services'
2020-08-25 08:02:27.352 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/api/v1/endpoints'
2020-08-25 08:02:27.353 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/apis/networking.k8s.io/v1/networkpolicies'
2020-08-25 08:02:27.354 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/api/v1/pods'
2020-08-25 08:02:27.356 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/apis/openstack.org/v1/kuryrloadbalancers'
2020-08-25 08:02:27.357 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/apis/openstack.org/v1/kuryrnetworks'
2020-08-25 08:02:27.361 1 WARNING urllib3.connectionpool [-] Connection pool is full, discarding connection: api-int.ostest.shiftstack.com: queue.Full
2020-08-25 08:02:27.364 1 INFO kuryr_kubernetes.watcher [-] Stopped watching '/apis/openstack.org/v1/kuryrports'
2020-08-25 08:02:27.364 1 INFO kuryr_kubernetes.watcher [-] No remaining active watchers, Exiting...
2020-08-25 08:02:27.366 1 INFO kuryr_kubernetes.controller.service [-] Service 'KuryrK8sService' stopping


Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-08-24-034934
RHOS-16.1-RHEL-8-20200813.n.0

How reproducible: always when running conformance tests


Steps to Reproduce:
1. Install OCP 4.6 on OSP 16.1
2. Run conformance tests

Actual results: Kuryr controller and cni pods in crashloop

Expected results: no crashloop

Comment 5 rlobillo 2020-09-07 16:22:42 UTC
Verified on 4.6.0-0.nightly-2020-09-05-015624 over RHOS-16.1-RHEL-8-20200831.n.1

OCP installed with IPI; NP and conformance tests were run with the expected results.

kuryr-controller handled the target scenario successfully:

$ oc logs -n openshift-kuryr kuryr-controller-7b6cdb86dd-wpx2x -p | grep 'Manual'
2020-09-07 13:01:31.196 1 WARNING kuryr_kubernetes.controller.handlers.kuryrport [-] Manually triggered KuryrPort taint-eviction-4 removal. This action should be avoided, since KuryrPort CRDs are internal to Kuryr.
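
For repeating this check on a saved log file, the grep above can be approximated in Python. This is only a sketch: the regular expression is derived from the single WARNING line quoted in this comment, so it assumes the message format stays the same, and the function name is hypothetical.

```python
import re

# Pattern inferred from the one WARNING line shown in this comment.
MANUAL_REMOVAL = re.compile(
    r"WARNING \S+ \[-\] Manually triggered KuryrPort (\S+) removal\."
)


def manually_removed_kuryrports(log_text):
    """Return the KuryrPort names mentioned in manual-removal warnings."""
    names = []
    for line in log_text.splitlines():
        match = MANUAL_REMOVAL.search(line)
        if match:
            names.append(match.group(1))
    return names


sample = (
    "2020-09-07 13:01:31.196 1 WARNING "
    "kuryr_kubernetes.controller.handlers.kuryrport [-] Manually triggered "
    "KuryrPort taint-eviction-4 removal. This action should be avoided, "
    "since KuryrPort CRDs are internal to Kuryr."
)
print(manually_removed_kuryrports(sample))
```

Feeding it the output of the oc logs command above should list each KuryrPort whose removal was triggered manually.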

No CrashLoopBackOff observed on either the controllers or the cni pods.

Test logs attached.

Comment 6 rlobillo 2020-09-07 16:23:10 UTC
Created attachment 1713987 [details]
conformance test result

Comment 7 rlobillo 2020-09-07 16:23:28 UTC
Created attachment 1713988 [details]
NP test results

Comment 9 errata-xmlrpc 2020-10-27 16:33:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

