Version-Release number:
---------------------------
[kni@provisionhost-0-0 ~]$ oc version
Client Version: 4.10.0-0.nightly-2022-03-23-153617
Server Version: 4.10.0-0.nightly-2022-03-23-153617
Kubernetes Version: v1.23.5+b0357ed

env:
-----------
Disconnected OCP 4.10 cluster with the OVNKubernetes network type, an IPv4 baremetal network, and an IPv6 provisioning network.

Description of problem:
-------------------------
As part of testing a self-remediation operator (poison-pill), a DaemonSet pod is placed on each node and constantly monitors its health. We simulated a case in which a node becomes unhealthy by creating a "disk pressure" condition on it. Once the node is detected as unhealthy, it goes through a remediation process in which it is eventually rebooted (with a system reboot) and returns to the cluster as a healthy, functional node.

In this case we expect the poison-pill pod placed on that node to be running and functional as well, which does not happen when the cluster uses the OVNKubernetes network type. The pod is stuck in CrashLoopBackOff status, and other pods scheduled on that node are in the same state:

[kni@provisionhost-0-0 ~]$ oc get po -A -o wide | grep -v Running | grep -v Completed
NAMESPACE             NAME                   READY   STATUS             RESTARTS         AGE    IP            NODE         NOMINATED NODE   READINESS GATES
openshift-dns         dns-default-985hd      1/2     CrashLoopBackOff   35 (2m13s ago)   174m   10.130.2.15   worker-0-0   <none>           <none>
openshift-operators   poison-pill-ds-r62xf   0/1     CrashLoopBackOff   9 (33s ago)      76m    10.131.2.4    worker-0-0   <none>           <none>

[kni@provisionhost-0-0 ~]$ oc get po -o wide -n openshift-operators
NAME                                              READY   STATUS             RESTARTS        AGE     IP            NODE         NOMINATED NODE   READINESS GATES
poison-pill-controller-manager-84c85d56fb-67f29   1/1     Running            0               3h30m   10.131.0.40   worker-0-1   <none>           <none>
poison-pill-ds-jsctj                              1/1     Running            0               70m     10.128.2.17   worker-0-2   <none>           <none>
poison-pill-ds-r62xf                              0/1     CrashLoopBackOff   7 (4m47s ago)   70m     10.131.2.4    worker-0-0   <none>           <none>
poison-pill-ds-rzfzn                              1/1     Running            0               70m     10.131.0.53   worker-0-1   <none>           <none>

Looking at the poison-pill-ds-r62xf pod description, we can see the following error:

  Warning  ErrorAddingLogicalPort  64s  controlplane  unable to parse node L3 gw annotation: k8s.ovn.org/l3-gateway-config annotation not found for node "worker-0-0"
  Normal   AddedInterface          34s  multus        Add eth0 [10.131.2.4/23] from ovn-kubernetes

---------------------------
From the log of this pod:

2022-03-27T13:07:44.247Z  ERROR  controller-runtime.manager  Failed to get API Group-Resources  {"error": "Get \"https://172.30.0.1:443/api?timeout=32s\": dial tcp 172.30.0.1:443: connect: no route to host"}
github.com/go-logr/zapr.(*zapLogger).Error
    /remote-source/app/vendor/github.com/go-logr/zapr/zapr.go:132
sigs.k8s.io/controller-runtime/pkg/cluster.New
    /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/cluster/cluster.go:160
sigs.k8s.io/controller-runtime/pkg/manager.New
    /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/manager/manager.go:278
main.main
    /remote-source/app/main.go:96
runtime.main
    /opt/rh/go-toolset-1.16/root/usr/lib/go-toolset-1.16-golang/src/runtime/proc.go:225
2022-03-27T13:07:44.247Z  ERROR  setup  unable to start manager  {"error": "Get \"https://172.30.0.1:443/api?timeout=32s\": dial tcp 172.30.0.1:443: connect: no route to host"}
github.com/go-logr/zapr.(*zapLogger).Error

When we delete the pod, it returns to Running state.

How reproducible:
-------------------------
Always

Steps to Reproduce:
-------------------------
1. Install the poison-pill operator.
2. Place a large file on a node to simulate disk pressure (for example: fallocate -l 28G Gigfile) and wait for the node to reboot and be re-created (a repro sketch follows at the end of this comment).
3. The node returns to Ready state, but the poison-pill pod is stuck in CrashLoopBackOff.

Actual results:
-------------------
The pod is stuck in CrashLoopBackOff status, and other pods scheduled on that node are in the same state.

Expected results:
-------------------
After the reboot and re-creation of the node instance, all the workloads on this worker are running.

Additional info:
-------------------
must-gather attached to the bug in the next comment
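For completeness, a minimal sketch of the reproduction and the first checks to run, assuming SSH access to the worker as the core user (the 28G size comes from the repro above and may need adjusting to the node's actual free space):

# Fill the node's disk to trigger the disk-pressure condition:
ssh core@worker-0-0 'fallocate -l 28G Gigfile'

# Watch the node go NotReady and get deleted/re-created by the remediation:
oc get nodes -w

# Once the node is Ready again, list non-Running pods scheduled on it and
# check the OVN gateway annotation the ErrorAddingLogicalPort event refers to:
oc get pods -A -o wide --field-selector spec.nodeName=worker-0-0 | grep -v Running
oc get node worker-0-0 -o jsonpath='{.metadata.annotations.k8s\.ovn\.org/l3-gateway-config}'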
must-gather - http://rhos-compute-node-10.lab.eng.rdu2.redhat.com/logs/BZ-2068910-must-gather
OK, sure. I'll try to reproduce and I'll update you.
1. I create disk pressure on the worker node
2. The node becomes NotReady
3. NHC detects it
4. A resource named poison pill remediation is created; this resource indicates that remediation has started for this node
5. The node is rebooted (this process can occur several times)
6. After that the node is deleted (this process can also occur several times)
7. After that the node is returned; however, the pod is in CrashLoopBackOff state
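A hedged sketch of how this sequence can be watched from the CLI; the poisonpillremediation resource kind matches the CR named later in this bug, but which namespace it is created in is an assumption:

# In one terminal, watch the node being rebooted, deleted, and re-created:
oc get nodes -w

# In another, watch the remediation CR appear for the unhealthy node:
oc get poisonpillremediation -A -w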
> Could you confirm that the node is indeed added from a backup, and what are the contents of this backup, and does it contain all the previous annotations?

I checked the process again. I added a label and an annotation to worker-0-0 before the remediation; however, after the node was re-created this data was not preserved. So I suppose we have a problem with the backup part (on our side).

--------------------
Before Node Deletion:

Labels:       beta.kubernetes.io/arch=amd64
              beta.kubernetes.io/os=linux
              kubernetes.io/arch=amd64
              kubernetes.io/hostname=worker-0-0
              kubernetes.io/os=linux
              node-role.kubernetes.io/worker=
              node.openshift.io/os_id=rhcos
              test-deletion=
Annotations:  k8s.ovn.org/host-addresses: ["192.168.123.139"]
              k8s.ovn.org/l3-gateway-config: {"default":{"mode":"shared","interface-id":"br-ex_worker-0-0","mac-address":"52:54:00:3e:2b:0e","ip-addresses":["192.168.123.139/24"],"ip-...
              k8s.ovn.org/node-chassis-id: 9304749a-a518-4caf-b91f-f6b543746d5a
              k8s.ovn.org/node-mgmt-port-mac-address: 2a:57:c9:9f:97:92
              k8s.ovn.org/node-primary-ifaddr: {"ipv4":"192.168.123.139/24"}
              k8s.ovn.org/node-subnets: {"default":"10.129.2.0/23"}
              machine.openshift.io/machine: openshift-machine-api/ocp-edge-cluster-0-5mr87-worker-0-vn4vt
              machineconfiguration.openshift.io/controlPlaneTopology: HighlyAvailable
              machineconfiguration.openshift.io/currentConfig: rendered-worker-4ae896e1ae6f565724711c746bbc37f1
              machineconfiguration.openshift.io/desiredConfig: rendered-worker-4ae896e1ae6f565724711c746bbc37f1
              machineconfiguration.openshift.io/reason:
              machineconfiguration.openshift.io/state: Done
              test/deletion: true
              volumes.kubernetes.io/controller-managed-attach-detach: true

--------------------
After Node re-creation:

Labels:       beta.kubernetes.io/arch=amd64
              beta.kubernetes.io/os=linux
              kubernetes.io/arch=amd64
              kubernetes.io/hostname=worker-0-0
              kubernetes.io/os=linux
              node-role.kubernetes.io/worker=
              node.openshift.io/os_id=rhcos
Annotations:  k8s.ovn.org/host-addresses: ["192.168.123.139","fd00:1101:0:1:5207:3236:7124:1e1"]
              k8s.ovn.org/l3-gateway-config: {"default":{"mode":"shared","interface-id":"br-ex_worker-0-0","mac-address":"52:54:00:3e:2b:0e","ip-addresses":["192.168.123.139/24"],"ip-...
              k8s.ovn.org/node-chassis-id: 9304749a-a518-4caf-b91f-f6b543746d5a
              k8s.ovn.org/node-mgmt-port-mac-address: 2a:57:c9:9f:97:92
              k8s.ovn.org/node-primary-ifaddr: {"ipv4":"192.168.123.139/24"}
              k8s.ovn.org/node-subnets: {"default":"10.130.2.0/23"}
              machine.openshift.io/machine: openshift-machine-api/ocp-edge-cluster-0-5mr87-worker-0-vn4vt
              machineconfiguration.openshift.io/controlPlaneTopology: HighlyAvailable
              machineconfiguration.openshift.io/currentConfig: rendered-worker-4ae896e1ae6f565724711c746bbc37f1
              machineconfiguration.openshift.io/desiredConfig: rendered-worker-4ae896e1ae6f565724711c746bbc37f1
              machineconfiguration.openshift.io/reason:
              machineconfiguration.openshift.io/state: Done
              volumes.kubernetes.io/controller-managed-attach-detach: true

As we can see, the label (test-deletion=) and the annotation (test/deletion: true) were not preserved.
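A sketch of how such a before/after comparison can be automated (assumes a bash shell and jq on the bastion; file names are illustrative):

# Snapshot labels and annotations before triggering remediation:
oc get node worker-0-0 -o jsonpath='{.metadata.labels}' > labels-before.json
oc get node worker-0-0 -o jsonpath='{.metadata.annotations}' > annotations-before.json

# ...trigger remediation, wait for the node to be re-created, then:
oc get node worker-0-0 -o jsonpath='{.metadata.labels}' > labels-after.json
oc get node worker-0-0 -o jsonpath='{.metadata.annotations}' > annotations-after.json

# Compare, with keys sorted so ordering differences don't show up:
diff <(jq -S . labels-before.json) <(jq -S . labels-after.json)
diff <(jq -S . annotations-before.json) <(jq -S . annotations-after.json)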
Before closing this BZ, I'll check with OpenShiftSDN as well, to be sure the problem is on our side and is not related to OVN.
I checked with OpenShiftSDN:

--------------------
Before Node Deletion:

Labels:       beta.kubernetes.io/arch=amd64
              beta.kubernetes.io/os=linux
              kubernetes.io/arch=amd64
              kubernetes.io/hostname=worker-0-0
              kubernetes.io/os=linux
              node-role.kubernetes.io/worker=
              node.openshift.io/os_id=rhcos
              test/deletion=
Annotations:  is-reboot-capable.poison-pill.medik8s.io: true
              machine.openshift.io/machine: openshift-machine-api/ocp-edge-cluster-0-xjdwg-worker-0-bflt9
              machineconfiguration.openshift.io/controlPlaneTopology: HighlyAvailable
              machineconfiguration.openshift.io/currentConfig: rendered-worker-33d59c6e83fda8461b8b44b86b53f155
              machineconfiguration.openshift.io/desiredConfig: rendered-worker-33d59c6e83fda8461b8b44b86b53f155
              machineconfiguration.openshift.io/reason:
              machineconfiguration.openshift.io/state: Done
              test/Deletion: true
              volumes.kubernetes.io/controller-managed-attach-detach: true

--------------------
After Node re-creation:

Labels:       beta.kubernetes.io/arch=amd64
              beta.kubernetes.io/os=linux
              kubernetes.io/arch=amd64
              kubernetes.io/hostname=worker-0-0
              kubernetes.io/os=linux
              node-role.kubernetes.io/worker=
              node.openshift.io/os_id=rhcos
              test/deletion=
Annotations:  is-reboot-capable.poison-pill.medik8s.io: true
              machine.openshift.io/machine: openshift-machine-api/ocp-edge-cluster-0-xjdwg-worker-0-bflt9
              machineconfiguration.openshift.io/controlPlaneTopology: HighlyAvailable
              machineconfiguration.openshift.io/currentConfig: rendered-worker-33d59c6e83fda8461b8b44b86b53f155
              machineconfiguration.openshift.io/desiredConfig: rendered-worker-33d59c6e83fda8461b8b44b86b53f155
              machineconfiguration.openshift.io/reason:
              machineconfiguration.openshift.io/state: Done
              test/Deletion: true
              volumes.kubernetes.io/controller-managed-attach-detach: true

As we can see, the label and annotation were preserved.
In a nutshell:
1. On detecting a faulty node, NHC creates an SNR remediation
2. SNR backs up the node onto the SNR remediation
3. SNR triggers a reboot via one of (a quick check for this is sketched below):
   - watchdog (first priority)
   - softdog (second priority)
   - system reboot (last priority)
4. After a period of time (in which the reboot is assumed to have completed) we delete the Node
5. We re-create the Node from the backup

If you have more questions, feel free to write to me or Michael on Slack.
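Regarding the priority list in step 3, a hedged sketch of how to check which reboot mechanism a node can use; the operator also records its conclusion in the is-reboot-capable annotation shown in the SDN comparison above:

# Look for a hardware watchdog / softdog device on the node:
oc debug node/worker-0-0 -- chroot /host sh -c 'ls -l /dev/watchdog*'

# Check what the operator itself concluded:
oc get node worker-0-0 -o jsonpath='{.metadata.annotations.is-reboot-capable\.poison-pill\.medik8s\.io}'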
I've provided a must-gather with the required information (poison-pill-related logs & CRs) - http://rhos-compute-node-10.lab.eng.rdu2.redhat.com/logs/BZ-2068910-must-gather-poison

[kni@provisionhost-0-0 ~]$ oc get pods -n openshift-operators -o wide
NAME                                                           READY   STATUS             RESTARTS         AGE   IP            NODE         NOMINATED NODE   READINESS GATES
node-healthcheck-operator-controller-manager-df4944675-f6c8x   2/2     Running            0                70m   10.130.1.39   master-0-1   <none>           <none>
poison-pill-controller-manager-6574647945-z9w9z                1/1     Running            0                60m   10.129.2.14   worker-0-2   <none>           <none>
poison-pill-ds-dklj9                                           0/1     CrashLoopBackOff   13 (4m40s ago)   54m   10.128.2.5    worker-0-0   <none>           <none>
poison-pill-ds-xr26v                                           1/1     Running            0                70m   10.129.2.13   worker-0-2   <none>           <none>
poison-pill-ds-zn6x8                                           1/1     Running            0                70m   10.131.0.62   worker-0-1   <none>           <none>
The must-gather provided in comment#19 unfortunately does not have the required CRs (the poisonpillremediation). It does feature other poison-pill-related CRs (poisonpillconfig / poisonpillremediationtemplate), which leads me to believe the sought-after CR was already deleted - or never created?...

Would you upload the `poisonpillremediation` CR, manually captured while the issue is happening? @prabinov

Thanks in advance.
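For capturing it manually while the issue is happening, something like this should do (a sketch; the CR is assumed to be namespaced):

oc get poisonpillremediation -A -o yaml > poisonpillremediation.yaml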
poison pill must-gather:
http://rhos-compute-node-10.lab.eng.rdu2.redhat.com/logs/BZ-2068910-pp-must-gather-during-remediation

regular must-gather (includes OVN-K logs):
http://rhos-compute-node-10.lab.eng.rdu2.redhat.com/logs/BZ-2068910-must-gather-during-remediation
From comment#6 I understand the issue can be worked around by deleting all the workloads running on the node - they will then be reconciled and re-created. From comment#9, I understand there's a different remediation policy - `ResourceDeletion` - that will indeed remove the node's workloads instead of the node itself.

Have you tried this self-remediation mechanism with that remediation policy, @prabinov? If so, did it work as expected?
Yes, I checked the ResourceDeletion strategy; it works as expected.
--------
Until 4.11, NodeDeletion was the default remediation strategy, which is why we wanted to support the OVN network type. Because of this issue we decided it would be more reliable to use the ResourceDeletion strategy as the default. However, even though the NodeDeletion strategy doesn't work properly with OVN, we still provide this strategy to the user (though from 4.11 it is no longer the default).
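For reference, a hedged sketch of what selecting the ResourceDeletion strategy might look like. The API group matches the poison-pill annotations seen earlier in this bug, but the exact field path is an assumption (based on the later self-node-remediation API) and should be verified against the installed CRD:

cat <<'EOF' | oc apply -f -
apiVersion: poison-pill.medik8s.io/v1alpha1
kind: PoisonPillRemediationTemplate
metadata:
  name: poison-pill-resource-deletion    # illustrative name
  namespace: openshift-operators
spec:
  template:
    spec:
      remediationStrategy: ResourceDeletion   # instead of the NodeDeletion default
EOF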
must-gather from the poison-pill components while the issue occurs -
http://rhos-compute-node-10.lab.eng.rdu2.redhat.com/logs/BZ-2068910-must-gather-pp-while-the-issue

must-gather from the openshift namespaces after the issue occurs -
http://rhos-compute-node-10.lab.eng.rdu2.redhat.com/logs/BZ-2068910-regular-must-gather-after-the-issue

must-gather from the poison-pill components after the issue occurs -
http://rhos-compute-node-10.lab.eng.rdu2.redhat.com/logs/BZ-2068910-must-gather-pp-after-the-issue

After the remediation, the node (worker-0-0) was deleted and re-created (you can see the AGE of 5m59s):

[kni@provisionhost-0-0 ~]$ oc get nodes
NAME         STATUS   ROLES    AGE     VERSION
master-0-0   Ready    master   7h15m   v1.23.5+3afdacb
master-0-1   Ready    master   7h15m   v1.23.5+3afdacb
master-0-2   Ready    master   7h16m   v1.23.5+3afdacb
worker-0-0   Ready    worker   5m59s   v1.23.5+3afdacb
worker-0-1   Ready    worker   6h51m   v1.23.5+3afdacb
worker-0-2   Ready    worker   6h51m   v1.23.5+3afdacb

And the node was rebooted (you can see by the uptime):

[core@worker-0-0 ~]$ uptime
 20:01:28 up 10 min,  1 user,  load average: 0.11, 0.29, 0.24
An update, to re-scope the bug: it can be seen in the provided must-gathers that the recovered node *does* feature the correct annotations (i.e. the node subnet annotation is the correct one). This data can be seen in [0]; the node subnet annotation features the same CIDR both before and after the node delete / re-add.

The issue is that the re-created node's logical switches *do not* feature the node load balancers created for some services - notably, the kubernetes service [1]. This, in turn, causes the poison-pill pod on that node to crash-loop [2].

Reconciling these load balancers on node deletion (removing them) would force re-creation when the node is added. This is important because the node logical switch is only associated with the node load balancers *when* the service controller re-creates the node load balancers for a service [3].

Does this seem like the right thing to do?

[0] - https://gist.github.com/maiqueb/feead4bfe72f2bd1e9bb9b1eab915ee9#node-after-reboot
[1] - https://gist.github.com/maiqueb/feead4bfe72f2bd1e9bb9b1eab915ee9#check-the-logical-switches-load-balancers
[2] - https://gist.github.com/maiqueb/feead4bfe72f2bd1e9bb9b1eab915ee9#no-connection-to-the-k8s-api
[3] - https://github.com/ovn-org/ovn-kubernetes/blob/0573fe590a6f307200dc61a9cd0a6409db754c3d/go-controller/pkg/ovn/loadbalancer/loadbalancer.go#L118
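A sketch of how the check in [1] can be reproduced on a live cluster; the pod label, the container name, and the assumption that the node's logical switch is named after the node reflect typical OVN-Kubernetes deployments and may differ per build:

# Pick an ovnkube-master pod and list the load balancers on the node's logical switch:
OVNKUBE_POD=$(oc get pods -n openshift-ovn-kubernetes -l app=ovnkube-master -o name | head -1)
oc exec -n openshift-ovn-kubernetes "$OVNKUBE_POD" -c nbdb -- ovn-nbctl ls-lb-list worker-0-0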
We are soon planning to deprecate and remove the Node Deletion strategy, therefore I think this bug can be closed - since it will soon be irrelevant.
(In reply to Michael Shitrit from comment #28)
> We are soon planning to deprecate and remove the Node Deletion strategy,
> therefore I think this bug can be closed - since it will soon be irrelevant.

We have just recently merged code upstream that addresses this bug.

Up to you - we can close it, or merge it into 4.12 (and optionally back-port it to the required releases).

@mshitrit
Thanks! I think it would be great if you merge it into 4.12. At the moment I'm not sure about back-porting, but we're definitely glad that this is an option.
The downstream PR (not yet merged) is https://github.com/openshift/ovn-kubernetes/pull/1205
I ran the remediation process several times in a row, and I see that the pods are still in CrashLoopBackOff:

[kni@provisionhost-0-0 ~]$ oc get pods -n openshift-operators -o wide
NAME                                                            READY   STATUS             RESTARTS      AGE     IP             NODE         NOMINATED NODE   READINESS GATES
node-healthcheck-operator-controller-manager-564685c55f-s9v4x   2/2     Running            0             88m     10.128.0.42    master-0-1   <none>           <none>
poison-pill-controller-manager-7cd87f55-8b644                   2/2     Running            0             17m     10.131.1.135   worker-0-2   <none>           <none>
poison-pill-ds-cs96l                                            1/1     Running            0             15m     10.131.1.137   worker-0-2   <none>           <none>
poison-pill-ds-v9gz8                                            0/1     CrashLoopBackOff   4 (91s ago)   5m38s   10.129.2.9     worker-0-0   <none>           <none>
poison-pill-ds-xqz2x                                            0/1     CrashLoopBackOff   3 (32s ago)   3m42s   10.128.2.4     worker-0-1   <none>           <none>

$ oc describe pod poison-pill-ds-v9gz8 -n openshift-operators
Events:
  Type     Reason          Age                     From               Message
  ----     ------          ----                    ----               -------
  Normal   Scheduled       8m5s                    default-scheduler  Successfully assigned openshift-operators/poison-pill-ds-v9gz8 to worker-0-0 by master-0-2
  Normal   AddedInterface  8m4s                    multus             Add eth0 [10.129.2.9/23] from ovn-kubernetes
  Normal   Pulled          8m2s                    kubelet            Successfully pulled image "quay.io/medik8s/poison-pill-operator:0.1.4" in 2.148536905s
  Normal   Pulled          7m29s                   kubelet            Successfully pulled image "quay.io/medik8s/poison-pill-operator:0.1.4" in 1.757707605s
  Normal   Pulled          6m44s                   kubelet            Successfully pulled image "quay.io/medik8s/poison-pill-operator:0.1.4" in 1.746327879s
  Normal   Created         5m50s (x4 over 8m2s)    kubelet            Created container manager
  Normal   Started         5m50s (x4 over 8m2s)    kubelet            Started container manager
  Normal   Pulled          5m50s                   kubelet            Successfully pulled image "quay.io/medik8s/poison-pill-operator:0.1.4" in 1.809316164s
  Normal   Pulling         4m31s (x5 over 8m4s)    kubelet            Pulling image "quay.io/medik8s/poison-pill-operator:0.1.4"
  Normal   Pulled          4m29s                   kubelet            Successfully pulled image "quay.io/medik8s/poison-pill-operator:0.1.4" in 1.868824057s
  Warning  BackOff         2m54s (x13 over 6m59s)  kubelet            Back-off restarting failed container
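Until the fix lands, the workaround noted earlier in this bug (comment#6) still applies: deleting the crash-looping pods lets them be re-created and return to Running:

oc delete pod -n openshift-operators poison-pill-ds-v9gz8 poison-pill-ds-xqz2x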
Please note the bug was *not* back-ported to 4.11 / 4.10 - as per comment#33, we're waiting on feedback to understand whether we should backport or not.

To verify the bug, you'll need to check on 4.12 - the fix landed in 4.12.0-0.nightly-2022-07-22-194831; any build *after that one* will include the fix we want to get feedback on.
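To confirm a cluster under test is running a build at or after that nightly, a quick check:

oc get clusterversion -o jsonpath='{.status.desired.version}{"\n"}'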
After discussion with Michael, we decided that we do want the back-port to 4.11 & 4.10. We still support the NodeDeletion strategy, so this fix would help us. If it's not too difficult for you and carries no risk, we definitely want the back-port. Thanks for your help!

@mduarted
(In reply to Polina Rabinovich from comment #37)
> After discussion with Michael, we decided that we do want the back-port to
> 4.11 & 4.10. We still support the NodeDeletion strategy, so this fix would
> help us. If it's not too difficult for you and carries no risk, we
> definitely want the back-port. Thanks for your help!
>
> @mduarted

Created bugs and PRs to backport this into 4.11 and 4.10, as requested. The bugs are:
- https://bugzilla.redhat.com/show_bug.cgi?id=2113860
- https://bugzilla.redhat.com/show_bug.cgi?id=2113861

Currently they're blocked because they don't have enough priority to be merged into 4.11 after the feature freeze, meaning we need to wait until 4.11 GAs; only then can we finish the back-ports.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399