Bug 2113860

Summary: After a node is re-created, some OVN annotations are not found for the node, and as a result the pod is in a crash loop
Product: OpenShift Container Platform Reporter: Miguel Duarte Barroso <mduarted>
Component: Networking Assignee: Miguel Duarte Barroso <mduarted>
Networking sub component: ovn-kubernetes QA Contact: Anurag saxena <anusaxen>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: medium CC: abeekhof, anusaxen, ffernand, jcaamano, mduarted, mshitrit, msluiter, prabinov, rravaiol, surya, trozet, zzhao
Version: 4.10   
Target Milestone: ---   
Target Release: 4.11.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 2068910 Environment:
Last Closed: 2022-08-29 06:46:55 UTC Type: ---
Bug Depends On: 2068910    
Bug Blocks: 2113861    

Comment 2 zhaozhanqi 2022-08-19 01:09:55 UTC
@Polina Rabinovich Could you help verify this bug on version 4.11 as well, since you already verified it on 4.12? Thanks.

Comment 3 Polina Rabinovich 2022-08-23 05:01:37 UTC
Yes, sure.

Comment 4 Polina Rabinovich 2022-08-23 10:01:19 UTC
Verified in 4.11.0-0.nightly-2022-08-22-195828:
----------
[kni@provisionhost-0-0 ~]$ oc version

Client Version: 4.11.0-0.nightly-2022-08-22-195828
Kustomize Version: v4.5.4
Server Version: 4.11.0-0.nightly-2022-08-22-195828
Kubernetes Version: v1.24.0+b62823b
----------

I ran the remediation process 6 times (using the Node Deletion strategy), and all pods are Running:

[kni@provisionhost-0-0 ~]$ oc get pods -o wide -n openshift-operators
NAME                                                            READY   STATUS    RESTARTS      AGE     IP             NODE         NOMINATED NODE   READINESS GATES
node-healthcheck-operator-controller-manager-66c7648d44-xf88m   2/2     Running   0             53m     10.130.0.105   master-0-0   <none>           <none>
self-node-remediation-controller-manager-667dfb7f7f-ws626       1/1     Running   1 (52m ago)   53m     10.129.2.16    worker-0-2   <none>           <none>
self-node-remediation-ds-9b4qv                                  1/1     Running   0             52m     10.129.2.17    worker-0-2   <none>           <none>
self-node-remediation-ds-ktdtf                                  1/1     Running   0             52m     10.131.0.26    worker-0-1   <none>           <none>
self-node-remediation-ds-lfflf                                  1/1     Running   0             2m54s   10.128.2.3     worker-0-0   <none>           <none>
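
When the remediation is repeated several times, the "all pods are Running" check can be scripted rather than read by eye. The following is a minimal sketch, not part of the original verification: it assumes the plain-text table printed by `oc get pods -o wide` is passed in as a string, and the helper name `non_running_pods` is hypothetical.

```python
def non_running_pods(table: str) -> list[str]:
    """Return the names of pods whose STATUS column is not 'Running'.

    `table` is the raw text printed by `oc get pods` (with or without -o wide).
    NAME, READY and STATUS are single-token columns that come before any
    column containing spaces, so whitespace splitting is safe for them.
    """
    lines = [ln for ln in table.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    name_idx = header.index("NAME")
    status_idx = header.index("STATUS")
    bad = []
    for ln in lines[1:]:
        cols = ln.split()
        if cols[status_idx] != "Running":
            bad.append(cols[name_idx])
    return bad
```

For the table above, this returns an empty list. A pod stuck in `CrashLoopBackOff`, as described in the bug summary, would be listed by name.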


[kni@provisionhost-0-0 ~]$ oc get nodes
NAME         STATUS   ROLES    AGE     VERSION
master-0-0   Ready    master   4h      v1.24.0+b62823b
master-0-1   Ready    master   4h      v1.24.0+b62823b
master-0-2   Ready    master   4h      v1.24.0+b62823b
worker-0-0   Ready    worker   2m51s   v1.24.0+b62823b
worker-0-1   Ready    worker   3h38m   v1.24.0+b62823b
worker-0-2   Ready    worker   3h37m   v1.24.0+b62823b
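
The AGE column shows which node was most recently re-created by the remediation. The sketch below is illustrative only: it assumes the text output of `oc get nodes` is available as a string and that AGE uses the usual compact duration form (for example `4h`, `3h38m`, `2m51s`). The helper names `age_seconds` and `youngest_node` are hypothetical.

```python
import re

# Seconds per unit for compact durations such as "3h38m" or "2m51s".
_UNIT_SECONDS = {"y": 365 * 86400, "d": 86400, "h": 3600, "m": 60, "s": 1}


def age_seconds(age: str) -> int:
    """Convert a compact AGE value such as '3h38m' into seconds."""
    parts = re.findall(r"(\d+)([ydhms])", age)
    # Reject input that is not made up entirely of <number><unit> pairs.
    if not parts or "".join(n + u for n, u in parts) != age:
        raise ValueError(f"unrecognized age: {age!r}")
    return sum(int(n) * _UNIT_SECONDS[u] for n, u in parts)


def youngest_node(table: str) -> str:
    """Return the NAME of the node with the smallest AGE in `oc get nodes` output."""
    lines = [ln for ln in table.splitlines() if ln.strip()]
    header = lines[0].split()
    name_idx = header.index("NAME")
    age_idx = header.index("AGE")
    rows = [ln.split() for ln in lines[1:]]
    return min(rows, key=lambda cols: age_seconds(cols[age_idx]))[name_idx]
```

Applied to the node list above, `youngest_node` returns `worker-0-0`, the only node whose age is measured in minutes rather than hours.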

Comment 7 errata-xmlrpc 2022-08-29 06:46:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.11.2
bug fix update), and where to find the updated files, follow the link
below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:6143