Bug 1895332

Summary: NP CRD unable to be patched because of missing sg rule ID
Product: OpenShift Container Platform Reporter: OpenShift BugZilla Robot <openshift-bugzilla-robot>
Component: Networking Assignee: Maysa Macedo <mdemaced>
Networking sub component: kuryr QA Contact: GenadiC <gcheresh>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: medium CC: bbennett, gcheresh, juriarte, ltomasbo, mdulko, rlobillo
Version: 4.5 Keywords: Reopened
Target Milestone: ---   
Target Release: 4.4.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-02-03 10:11:43 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1893996    
Bug Blocks:    
Attachments:
Description Flags
NP test results with the fix none
tempest results with the fix none

Comment 1 Maysa Macedo 2020-11-20 09:06:49 UTC
Only bugs with high or urgent severity are being merged into 4.4. If this bug turns out to have that severity, we can reopen it.

Comment 2 Michał Dulko 2020-12-21 15:01:11 UTC
This was a complicated set of coincidences found while debugging the NP test failures that QE was seeing. We need this patch to fix that in 4.4, so I'm raising the severity of this one. The bug can cause kuryr-controller to restart constantly while a specific NP exists on the system, effectively preventing it from doing anything and causing a periodic denial of service during the crash loops. The only workaround would be to remove that network policy. IMO this does meet the "blocking functionality from succeeding" bar that "high" severity has.

Comment 4 rlobillo 2021-01-11 13:33:43 UTC
Verified on OCP4.4.0-0.nightly-2021-01-09-151918 on OSP16.1 with OVN-Octavia (RHOS-16.1-RHEL-8-20201214.n.3) with UPI installation.

CI job passed successfully: https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/osasinfra/view/shiftstack_ci/job/DFG-osasinfra-shiftstack_ci-ocp_verification-osp16.1-ocp4.4-upi/11/

All NP tests passed without restarts:

# Kuryr pods before running NP tests - ANSIBLE MANAGED BLOCK
NAME                               READY   STATUS    RESTARTS   AGE
kuryr-cni-47td5                    1/1     Running   0          80m
kuryr-cni-4vvz9                    1/1     Running   0          78m
kuryr-cni-7vcwg                    1/1     Running   0          79m
kuryr-cni-kmbgs                    1/1     Running   0          77m
kuryr-cni-pzmkw                    1/1     Running   0          79m
kuryr-cni-t9kh2                    1/1     Running   0          81m
kuryr-controller-5d46cb9b5-zlc8j   1/1     Running   0          43m
# END ANSIBLE MANAGED BLOCK
# Kuryr pods after running NP tests - ANSIBLE MANAGED BLOCK
NAME                               READY   STATUS    RESTARTS   AGE
kuryr-cni-47td5                    1/1     Running   0          3h
kuryr-cni-4vvz9                    1/1     Running   0          178m
kuryr-cni-7vcwg                    1/1     Running   0          179m
kuryr-cni-kmbgs                    1/1     Running   0          177m
kuryr-cni-pzmkw                    1/1     Running   0          179m
kuryr-cni-t9kh2                    1/1     Running   0          3h1m
kuryr-controller-5d46cb9b5-zlc8j   1/1     Running   0          144m
# END ANSIBLE MANAGED BLOCK
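
The before/after comparison above can also be checked mechanically. The following is only a minimal sketch, not part of the CI job: it assumes the two listings are available as plain text in the `oc get pods` column layout shown above, and the helper names `restart_counts` and `restarted_pods` are hypothetical.

```python
def restart_counts(listing: str) -> dict[str, int]:
    """Map pod name -> RESTARTS value from `oc get pods`-style text."""
    counts = {}
    restarts_col = None
    for line in listing.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and the Ansible marker comments
        if fields[0] == "NAME":
            # Locate the RESTARTS column from the header row.
            restarts_col = fields.index("RESTARTS")
            continue
        if restarts_col is not None:
            counts[fields[0]] = int(fields[restarts_col])
    return counts


def restarted_pods(before: str, after: str) -> list[str]:
    """Return names of pods whose restart count grew between listings."""
    b = restart_counts(before)
    a = restart_counts(after)
    return sorted(name for name, n in a.items() if n > b.get(name, 0))


# Two rows copied from the listings above, used as sample input.
before = """\
NAME                               READY   STATUS    RESTARTS   AGE
kuryr-cni-47td5                    1/1     Running   0          80m
kuryr-controller-5d46cb9b5-zlc8j   1/1     Running   0          43m
"""
after = """\
NAME                               READY   STATUS    RESTARTS   AGE
kuryr-cni-47td5                    1/1     Running   0          3h
kuryr-controller-5d46cb9b5-zlc8j   1/1     Running   0          144m
"""
print(restarted_pods(before, after))  # prints []
```

Note that this only compares counters for pods present under the same name; a pod that was deleted and recreated under a new name would need a separate check of the pod names themselves.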

All tempest tests passed. Attaching test results.

Comment 5 rlobillo 2021-01-11 13:33:45 UTC
Created attachment 1746260
NP test results with the fix

Comment 6 rlobillo 2021-01-11 13:34:07 UTC
Created attachment 1746261
tempest results with the fix

Comment 9 errata-xmlrpc 2021-02-03 10:11:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.4.33 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0281