Bug 1883825

Summary: Kubernetes exceptions are not properly handled
Product: OpenShift Container Platform Reporter: Maysa Macedo <mdemaced>
Component: Networking Assignee: Maysa Macedo <mdemaced>
Networking sub component: kuryr QA Contact: GenadiC <gcheresh>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: unspecified CC: rlobillo
Version: 4.6   
Target Milestone: ---   
Target Release: 4.6.0   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2020-10-27 16:46:56 UTC Type: Bug
Bug Depends On: 1883166    
Bug Blocks:    
Attachments:
Description             Flags
NP test results         none
kuryr-controller logs   none

Description Maysa Macedo 2020-09-30 10:48:30 UTC
Description of problem:

Exceptions raised when a CR is not found or when the Kubernetes API is unavailable should be handled properly: the event should be either skipped or retried, so that the kuryr-controller does not restart.
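
A minimal sketch of the skip-or-retry behaviour described above, not the actual Kuryr fix. The exception classes `ResourceNotFound` and `ApiUnavailable`, the function `handle_event`, and its `retries`/`delay` parameters are hypothetical names introduced only for illustration:

```python
import time


class ResourceNotFound(Exception):
    """Hypothetical: the CR no longer exists (for example, an HTTP 404)."""


class ApiUnavailable(Exception):
    """Hypothetical: the Kubernetes API could not be reached."""


def handle_event(handler, event, retries=3, delay=0.0, sleep=time.sleep):
    """Run handler(event); skip on not-found, retry on API unavailability.

    Returns "done", "skipped", or "gave_up" instead of letting the
    exception propagate, so the calling loop keeps running.
    """
    for attempt in range(1, retries + 1):
        try:
            handler(event)
            return "done"
        except ResourceNotFound:
            # The resource is gone; there is nothing left to process.
            return "skipped"
        except ApiUnavailable:
            # Transient failure: back off a little longer on each attempt.
            if attempt < retries:
                sleep(delay * attempt)
    return "gave_up"
```

Returning a status rather than re-raising is one possible design choice; a real controller might instead re-queue the event after the retries are exhausted.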

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Install OCP 
2. Run network policy tests

Actual results:


Expected results:


Additional info:

Comment 2 rlobillo 2020-10-01 07:30:57 UTC
Verification blocked until https://bugzilla.redhat.com/show_bug.cgi?id=1883419 is resolved.

Comment 4 rlobillo 2020-10-02 11:38:26 UTC
Verified on OCP4.6.0-0.nightly-2020-10-02-001427 over OSP16.1 (RHOS-16.1-RHEL-8-20200917.n.3) with OVN-Octavia.

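The network policy (NP) tests ran successfully with no additional Kuryr pod restarts (the RESTARTS counts are identical before and after the run):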

# Kuryr pods before running NP tests - ANSIBLE MANAGED BLOCK
NAME                              READY   STATUS    RESTARTS   AGE
kuryr-cni-89pm9                   1/1     Running   1          35m
kuryr-cni-jfltp                   1/1     Running   4          61m
kuryr-cni-k4j95                   1/1     Running   0          61m
kuryr-cni-l87vw                   1/1     Running   0          35m
kuryr-cni-zhvzc                   1/1     Running   0          35m
kuryr-cni-zpmfv                   1/1     Running   0          61m
kuryr-controller-775ff4bb-bgpml   1/1     Running   1          61m
# END ANSIBLE MANAGED BLOCK
# Kuryr pods after running NP tests - ANSIBLE MANAGED BLOCK
NAME                              READY   STATUS    RESTARTS   AGE
kuryr-cni-89pm9                   1/1     Running   1          138m
kuryr-cni-jfltp                   1/1     Running   4          164m
kuryr-cni-k4j95                   1/1     Running   0          164m
kuryr-cni-l87vw                   1/1     Running   0          138m
kuryr-cni-zhvzc                   1/1     Running   0          138m
kuryr-cni-zpmfv                   1/1     Running   0          164m
kuryr-controller-775ff4bb-bgpml   1/1     Running   1          164m
# END ANSIBLE MANAGED BLOCK
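
The before/after comparison can also be checked mechanically. The following is a rough sketch, assuming the two listings are available as plain text in the column layout shown above; the helper names `restart_counts` and `new_restarts` are hypothetical:

```python
def restart_counts(table_text):
    """Map pod name -> RESTARTS from `oc get pods`-style text output."""
    counts = {}
    for line in table_text.splitlines():
        fields = line.split()
        # Skip blank lines, "#" marker lines, and the header row.
        if len(fields) < 5 or fields[0] in ("#", "NAME"):
            continue
        counts[fields[0]] = int(fields[3])
    return counts


def new_restarts(before_text, after_text):
    """Return {pod: extra restarts} for pods whose count increased."""
    before = restart_counts(before_text)
    after = restart_counts(after_text)
    return {
        pod: count - before.get(pod, 0)
        for pod, count in after.items()
        if count > before.get(pod, 0)
    }
```

Applied to the two listings in this comment, `new_restarts` would return an empty dictionary, since every pod shows the same restart count before and after the tests.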

Kuryr controller logs and NP results attached.

Comment 5 rlobillo 2020-10-02 11:39:03 UTC
Created attachment 1718411
NP test results

Comment 6 rlobillo 2020-10-02 11:39:29 UTC
Created attachment 1718412
kuryr-controller logs

Comment 9 errata-xmlrpc 2020-10-27 16:46:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196