This BZ is cloned from the Ceph BZ with the following proposal...

Option 1: If the Ceph BZ can be fixed in time for 4.2z1:
- Rook is updated to set the mon location with a new argument at mon startup time.

Option 2: If the Ceph BZ cannot be fixed in time for 4.2z1:
- We disable the mon failover functionality in Rook for stretch cluster scenarios. This seems like a reasonable solution for 4.7 since stretch cluster is in tech preview and we are already starting with 5 mons, which have some redundancy naturally built in.
- Disabling mon failover is a setting on the CephCluster CR that will be set by the OCS operator.

Moving to the Rook component for now, assuming option 1 is still possible, with option 2 as the fallback plan, in which case it would move back to the OCS operator.
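For illustration only, option 2 could look roughly like this on the CephCluster CR. This is a minimal sketch: it assumes Rook's existing healthCheck.daemonHealth.mon setting is the knob the OCS operator would flip, and it uses the default OCS cluster name; the field finally chosen may differ.

```
# Sketch only: assumes healthCheck.daemonHealth.mon is the setting used to stop
# mon failover; the exact field the OCS operator sets may differ.
oc -n openshift-storage patch cephcluster ocs-storagecluster-cephcluster \
  --type merge \
  -p '{"spec":{"healthCheck":{"daemonHealth":{"mon":{"disabled":true}}}}}'
```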
What is expected to happen to a stretch cluster, when one data zone loses connection to both arbiter and other data zone with mon failover disabled?
The decision was to go with option 2 since 4.2z1 does not accept anything new. Will provide a fix soon.
(In reply to Sébastien Han from comment #5) > The decision was to go with option 2 since 4.2z1 does not accept anything > new. Well, kind of right. We have agreed with the RHCS program to have an async update right after the 4.2z1 release in order to facilitate a few critical bug fixes for OCS etc. So if we had a fix in ceph already, we might still get it in. Not sure. > Will provide a fix soon.
(In reply to Martin Bukatovic from comment #2) > What is expected to happen to a stretch cluster, when one data zone loses > connection to both arbiter and other data zone with mon failover disabled? I have created a doc BZ to record the behaviour/workaround in this case. https://bugzilla.redhat.com/show_bug.cgi?id=1941918
(In reply to Martin Bukatovic from comment #2)
> What is expected to happen to a stretch cluster, when one data zone loses
> connection to both arbiter and other data zone with mon failover disabled?

When one data zone loses connection to both of the other zones:
- The data zone that still has a connection to the arbiter zone will be available for reads/writes.
- The data zone without a connection to the arbiter zone will remain down until it can connect to the other zones again.

When mon failover is disabled, no new mons will be created in place of the failed mons. When mon failover is enabled again, new mon(s) could be started up in the failed zone as long as there are nodes available in that zone with connectivity.
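For reference, a quick way to see which mons kept quorum during such a split, run against the toolbox pod (standard ceph commands; the label selector is the OCS default):

```
TOOLS=$(oc -n openshift-storage get pod -l app=rook-ceph-tools -o name | head -n1)
# Which mons are in quorum and who is the leader
oc -n openshift-storage exec "$TOOLS" -- ceph quorum_status --format json-pretty
# One-line summary of the monmap vs. the current quorum
oc -n openshift-storage exec "$TOOLS" -- ceph mon stat
```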
Tested on a vSphere cluster of 6 worker/storage nodes and 3 master nodes with:
- OCP 4.7.0-0.nightly-2021-03-27-082615
- LSO 4.7.0-202103130041.p0
- OCS 4.7.0-324.ci (latest-stable-47)

During verification steps, I:
- created StorageCluster in arbiter mode via the OCP Console
- wrote 4 GB via an fio job on a cephfs PV, and another 4 GB on an rbd volume
- selected a worker node for draining, so that both a ceph mon and an osd pod are running on it: `compute-0` (the 1st worker node)
- marked the node as unschedulable: `oc adm cordon compute-0`
- drained the node: `oc adm drain compute-0 --force --delete-local-data --ignore-daemonsets`
- let it run for more than 15 minutes
- and then finally, uncordoned the node: `oc adm uncordon compute-0`

Observations:
- during the drain process, ceph mon-b was evicted from node/compute-0, and from this moment ceph mon-b is out of quorum
- after the drain, ceph mon-b is moved to node/compute-2, where it remains running (but out of quorum, as noted above)
- after the uncordon, mon-b is not moved out of node/compute-2, but remains there running out of quorum, unlike osd-3, which was down but is now back running on its original node/compute-0
- in the end, no pod is in CLBO state, and the ceph cluster has 4 monitors in quorum, so it remains operational

Detail of the ceph status from the toolbox pod:

```
[root@compute-0 /]# ceph status
  cluster:
    id:     f3fb899a-e6d2-4eee-9984-7d94e1b554c5
    health: HEALTH_WARN
            3496 slow ops, oldest one blocked for 8786 sec, mon.b has slow ops
            1/5 mons down, quorum a,c,d,e

  services:
    mon: 5 daemons, quorum a,c,d,e (age 40m), out of quorum: b
    mgr: a(active, since 6h)
    mds: ocs-storagecluster-cephfilesystem:1 {0=ocs-storagecluster-cephfilesystem-a=up:active} 1 up:standby-replay
    osd: 4 osds: 4 up (since 101m), 4 in (since 101m)
    rgw: 2 daemons active (ocs.storagecluster.cephobjectstore.a, ocs.storagecluster.cephobjectstore.b)

  task status:
    scrub status:
        mds.ocs-storagecluster-cephfilesystem-a: idle
        mds.ocs-storagecluster-cephfilesystem-b: idle

  data:
    pools:   10 pools, 272 pgs
    objects: 2.37k objects, 8.1 GiB
    usage:   36 GiB used, 28 GiB / 64 GiB avail
    pgs:     272 active+clean

  io:
    client:   4.9 KiB/s rd, 4.2 KiB/s wr, 5 op/s rd, 3 op/s wr
```

This means that the OCS cluster can now survive the use case from the bug.

Besides this, I also tried to deploy a simple machineconfig on all worker nodes, as during this process MCO drains all nodes one by one, and I hit this BZ during this use case as noted in comment https://bugzilla.redhat.com/show_bug.cgi?id=1939007#c8

```
$ cat worker-example.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: worker-example
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      files:
        - path: /etc/example
          contents:
            source: data:text/plain;charset=utf-8;base64,SGVsbG8K
          mode: 0444
          user:
            name: root
          group:
            name: root
          overwrite: true
$ oc create -f worker-example.yaml
```

And again, the cluster survived that without losing quorum or any pod reaching CLBO status. Monitor quorum is `2/5 mons down, quorum b,d,e` though. In this state, I was able to run another workload which stored 4 GB on the cluster without any problem.

Based on this, I'm marking this bug as VERIFIED.

That said, while this workaround works, it also makes the cluster vulnerable to further disruptions. In particular, disruptions which a stretch cluster is expected to handle would result in a non-operational cluster which is difficult to recover.
I assume that this is a well-known limitation of the workaround proposed in comment 5.
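For anyone repeating the verification steps above, a small sketch for watching the drain's effect from a second terminal (standard oc/ceph commands; the label selectors are the OCS defaults):

```
# Follow mon/osd pod placement while the drain is in progress
oc -n openshift-storage get pods -l 'app in (rook-ceph-mon,rook-ceph-osd)' -o wide -w &
# Periodically check quorum and overall health from the toolbox pod
TOOLS=$(oc -n openshift-storage get pod -l app=rook-ceph-tools -o name | head -n1)
while true; do
  oc -n openshift-storage exec "$TOOLS" -- ceph status | grep -E 'health:|mon:'
  sleep 60
done
```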
Since there was some discussion about what exactly should be fixed in this bug, and option 2 from comment 1 is not explained here in more detail, I'm asking Travis to check whether comment 11 describes the expected behaviour of the fix.
I'm asking dev team to tell me which ceph version was present in OCS 4.7.0-324.ci image, so that I can answer the question about 4.2z1.
(In reply to Martin Bukatovic from comment #13) > I'm asking dev team to tell me which ceph version was present in OCS > 4.7.0-324.ci image, so that I can answer the question about 4.2z1. quay.io no longer provides an answer: https://quay.io/repository///manifest/sha256:73a4413f49c7cb3ef288313eaed37951e04b9ba57a4ec8bac05004f1f4b97b25 reports not found.
The unexpected behavior from comment 11 is that the mon was moved to another node after the node drain. When running on LSO, the mon should have node affinity and never move nodes. The mon should wait for the node to come back up instead of rescheduling on a different node. When you have a repro please let me know to take a look.
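A quick way to check this on a repro (mon-b is the mon from comment 11; standard oc commands, with a placeholder for the PV name):

```
# Where is each mon scheduled right now?
oc -n openshift-storage get pods -l app=rook-ceph-mon -o wide
# List the mon PVCs; each is bound to a local PV
oc -n openshift-storage get pvc | grep rook-ceph-mon
# The local PV's node affinity is what should pin the mon to its node;
# substitute the PV name bound to the mon's PVC
oc get pv <pv-name> -o jsonpath='{.spec.nodeAffinity}{"\n"}'
```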
Based on Travis's reply in comment 15, moving this back to assigned.
Travis/Sebastien, we have mon failover support in Ceph 4.2z1 now (https://bugzilla.redhat.com/show_bug.cgi?id=1939766). Shall we re-enable it in OCS?
For 4.7.0 I would propose we leave mon failover disabled for arbiter. The changes to support the mon failover require another change in rook besides reverting the disabling, which will take more time to test. I don't see mon failover as critical to the feature while it's in tech preview. The important scenario is that stretch continues to work when a data zone goes down, in which case mon failover isn't even possible. @Mudit How about we move this to 4.7.z or else 4.8?
My bad, changed the wrong bug by mistake. Reverting it to the original state.
After further discussions, let's go ahead and re-enable mon failover for stretch clusters. Since the functionality is available from the latest RHCS 4.2z1 RC, two changes are necessary to re-enable mon failover in stretch clusters:
- Revert the commit that disabled mon failover in stretch clusters
- Implement a new mon parameter that sets the location of the mon in the stretch cluster
While testing the Rook changes to enable mon failover in a stretch cluster, I'm not able to get the mon to join quorum.

The mon log is showing:

debug 2021-04-06T22:11:06.037+0000 7f2d568fd700 10 mon.i@-1(probing) e14 ready to join, but i'm not in the monmap/my addr is blank/location is wrong, trying to join

All the mons and the failed-over mon are using a location argument like so:

--set-crush-location zone=us-east-2c

The original 5 mons start fine with that argument, but the new mon is not joining quorum.

Here is the full mon log for the mon attempting to join quorum:
https://gist.github.com/travisn/5a25603c13f8fcc8c5da638433333a1a#file-mon-h-log

Here is the full mon pod spec:
https://gist.github.com/travisn/5a25603c13f8fcc8c5da638433333a1a#gistcomment-3695810

@gfarnum Is the location incorrect on the mon, or why would it not be joining quorum?
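For completeness, a sketch of how the location argument and the current monmap can be cross-checked while reproducing this (the mon deployment name below is hypothetical; substitute the mon that fails to join):

```
# Hypothetical name; use the deployment of the mon that is stuck probing
MON=rook-ceph-mon-i
# Confirm the --set-crush-location argument passed to the mon container
# (the mon daemon container is named "mon" in Rook's mon deployments)
oc -n openshift-storage get deployment "$MON" \
  -o jsonpath='{.spec.template.spec.containers[?(@.name=="mon")].args}{"\n"}'
# Current monmap as seen by the existing quorum
TOOLS=$(oc -n openshift-storage get pod -l app=rook-ceph-tools -o name | head -n1)
oc -n openshift-storage exec "$TOOLS" -- ceph mon dump
```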
Here are Rook changes to use the new --set-crush-location flag. https://github.com/rook/rook/pull/7535
(In reply to Travis Nielsen from comment #22)
> While testing the Rook changes to enable mon failover in a stretch cluster,
> I'm not able to get the mon to join quorum.
> 
> The mon log is showing:
> 
> debug 2021-04-06T22:11:06.037+0000 7f2d568fd700 10 mon.i@-1(probing) e14
> ready to join, but i'm not in the monmap/my addr is blank/location is wrong,
> trying to join

Hmm, that log message is expected initially, at which point the monitor sends an MMonJoin with the updated data that should go into the MonMap and update it. This works in my tests, but perhaps you're exercising a conditional I'm not? Can you grab the log of the monitors which are in quorum?

> 
> All the mons and the failed-over mon are using a location argument like so:
> 
> --set-crush-location zone=us-east-2c

Yeah, that should be good.
Created attachment 1770800 [details]
Logs for the stretch mons when starting up mon.f to failover from mon.b

mon.b was failed, and mon.f was started up to replace it with location=us-east-2b.

This looks suspicious from one of the other logs, but see attached for the full mon logs.

debug 2021-04-09T21:08:44.391+0000 7f575afc2700 10 mon.a@0(peon).monmap v9 preprocess_join f at [v2:172.30.1.69:3300/0,v1:172.30.1.69:6789/0]
debug 2021-04-09T21:08:44.391+0000 7f575afc2700 20 is_capable service=mon command= write exec addr v2:172.30.1.69:3300/0 on cap allow *
debug 2021-04-09T21:08:44.391+0000 7f575afc2700 20 allow so far , doing grant allow *
debug 2021-04-09T21:08:44.391+0000 7f575afc2700 20 allow all
debug 2021-04-09T21:08:44.391+0000 7f575afc2700 10 mon.a@0(peon) e9 forward_request won't forward (non-local) mon request mon_join(f [v2:172.30.1.69:3300/0,v1:172.30.1.69:6789/0] {zone=us-east-2b}) v3
debug 2021-04-09T21:08:44.411+0000 7f575afc2700 20 mon.a@0(peon) e9 _ms_dispatch existing session 0x55ef64be4000 for mon.?
(In reply to Travis Nielsen from comment #15)
> The unexpected behavior from comment 11 is that the mon was moved to another
> node after the node drain. When running on LSO, the mon should have node
> affinity and never move nodes. The mon should wait for the node to come back
> up instead of rescheduling on a different node. When you have a repro please
> let me know to take a look.

I tried to repeat the procedure from comment 11, but I failed to do so:

- I selected a worker node for draining, so that there is both ceph mon and osd pod running: `compute-0` (the 1st worker node).
- Marked the node as unschedulable: `oc adm cordon compute-0`.
- Drained the node via `oc adm drain compute-0 --force --delete-local-data --ignore-daemonsets`

But the drain got stuck on:

```
evicting pod openshift-storage/rook-ceph-mon-a-889497f48-6zrqx
error when evicting pods/"rook-ceph-mon-a-889497f48-6zrqx" -n "openshift-storage" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
```
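For reference, a sketch of what to capture when the drain gets blocked like this, to tell whether the mon PDB or an unhealthy mon is the cause (standard oc/ceph commands; the toolbox label is the OCS default):

```
# How many disruptions does the mon PDB currently allow, and why?
oc -n openshift-storage get pdb rook-ceph-mon-pdb -o yaml
# Is any mon actually out of quorum at this moment?
TOOLS=$(oc -n openshift-storage get pod -l app=rook-ceph-tools -o name | head -n1)
oc -n openshift-storage exec "$TOOLS" -- ceph status
```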
Based on Mudit's request, a BZ to track reenablement of arbiter mon failover was opened: BZ 1949165
Moving back to ON_QA per separate thread and tracking the re-enabling with the new BZ.

(In reply to Martin Bukatovic from comment #26)
> (In reply to Travis Nielsen from comment #15)
> > The unexpected behavior from comment 11 is that the mon was moved to another
> > node after the node drain. When running on LSO, the mon should have node
> > affinity and never move nodes. The mon should wait for the node to come back
> > up instead of rescheduling on a different node. When you have a repro please
> > let me know to take a look.
> 
> I tried to repeat the procedure from comment 11, but I failed to do so:
> 
> - I selected a worker node for draining, so that there is both ceph mon
> and osd pod running: `compute-0` (the 1st worker node).
> - Marked the node as unschedulable: `oc adm cordon compute-0`.
> - Drained the node via
> `oc adm drain compute-0 --force --delete-local-data --ignore-daemonsets`
> 
> But the drain got stuck on:
> 
> ```
> evicting pod openshift-storage/rook-ceph-mon-a-889497f48-6zrqx
> error when evicting pods/"rook-ceph-mon-a-889497f48-6zrqx" -n
> "openshift-storage" (will retry after 5s): Cannot evict pod as it would
> violate the pod's disruption budget.
> ```

It's expected that a node cannot be drained with a mon if the mons are not fully in quorum. Was there a mon already out of quorum when you tried to drain the node?
(In reply to Travis Nielsen from comment #25)
> Created attachment 1770800 [details]
> Logs for the stretch mons when starting up mon.f to failover from mon.b
> 
> mon.b was failed, and mon.f was started up to replace it with
> location=us-east-2b.
> 
> This looks suspicious from one of the other logs, but see attached for the
> full mon logs.
> 
> debug 2021-04-09T21:08:44.391+0000 7f575afc2700 10 mon.a@0(peon).monmap v9
> preprocess_join f at [v2:172.30.1.69:3300/0,v1:172.30.1.69:6789/0]
> debug 2021-04-09T21:08:44.391+0000 7f575afc2700 20 is_capable service=mon
> command= write exec addr v2:172.30.1.69:3300/0 on cap allow *
> debug 2021-04-09T21:08:44.391+0000 7f575afc2700 20 allow so far , doing
> grant allow *
> debug 2021-04-09T21:08:44.391+0000 7f575afc2700 20 allow all
> debug 2021-04-09T21:08:44.391+0000 7f575afc2700 10 mon.a@0(peon) e9
> forward_request won't forward (non-local) mon request mon_join(f
> [v2:172.30.1.69:3300/0,v1:172.30.1.69:6789/0] {zone=us-east-2b}) v3
> debug 2021-04-09T21:08:44.411+0000 7f575afc2700 20 mon.a@0(peon) e9
> _ms_dispatch existing session 0x55ef64be4000 for mon.?

Yep, that's definitely the issue — good eyes. Since this isn't a 4.7 blocker I think it'll have to wait, though, as just getting through tests takes some time and we have other priorities.
(In reply to Travis Nielsen from comment #28) > It's expected that a node cannot be drained with a mon if the mons are not > fully in quorum. Was there a mon already out of quorum when you tried to > drain the node? There was no problem with mon quorum if I recall right.
OK, so now I'm a little confused.

Could someone from the dev team confirm what is expected to happen when one performs the verification steps (as I did in comment 11):

- create StorageCluster in arbiter mode via the OCP Console
- write some data on a cephfs PV, and another 4 GB on an rbd volume
- select a worker node for draining, so that both a ceph mon and an osd pod are running on it: `compute-0` (the 1st worker node)
- mark the node as unschedulable: `oc adm cordon compute-0`
- drain the node: `oc adm drain compute-0 --force --delete-local-data --ignore-daemonsets`
- let it run for more than 15 minutes
- and then finally, uncordon the node: `oc adm uncordon compute-0`

I would expect that most observations from comment 11 still hold (with the exception of the issue of which node the mon is deployed on again), and that I should still be able to deploy a simple machineconfig on the worker mcp.

If that is not the case, I will move the BZ right back into assigned, and also try to perform the validation steps to gather more data points.
@Martin Correct, the same validation steps should apply. The change with this BZ is that no new mon (such as mon.f) will be created automatically after a mon is down for too long. While the node is down, you would see the mon and osd stay down from that node since they are not portable in stretch clusters built on LSO. Then when the node is brought back up, you should see the mon and osd pods running again on that node.
I'm trying to verify the problem with:
- OCP 4.7.0-0.nightly-2021-04-15-110345
- LSO 4.7.0-202104030128.p0
- OCS 4.7.0-353.ci latest-stable-47 (4.7.0-rc5)

I followed the steps from comment 11, but when I started the drain of a node:

```
$ oc adm drain compute-2 --force --delete-local-data --ignore-daemonsets
```

the process got stuck (in the same way as reported in comment 26):

```
evicting pod openshift-storage/rook-ceph-mon-c-756c8487d6-l7ppf
error when evicting pods/"rook-ceph-mon-c-756c8487d6-l7ppf" -n "openshift-storage" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
evicting pod openshift-storage/rook-ceph-mon-c-756c8487d6-l7ppf
error when evicting pods/"rook-ceph-mon-c-756c8487d6-l7ppf" -n "openshift-storage" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
```

Looking at the PodDisruptionBudget, I see that it prevents the draining:

```
$ oc get PodDisruptionBudget -n openshift-storage
NAME                                              MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
rook-ceph-mds-ocs-storagecluster-cephfilesystem   1               N/A               1                     24h
rook-ceph-mon-pdb                                 N/A             0                 0                     24h
rook-ceph-osd-zone-data-b                         N/A             0                 0                     28m

$ oc get PodDisruptionBudget/rook-ceph-mon-pdb -n openshift-storage -o yaml | tail -11
spec:
  maxUnavailable: 0
  selector:
    matchLabels:
      app: rook-ceph-mon
status:
  currentHealthy: 5
  desiredHealthy: 5
  disruptionsAllowed: 0
  expectedPods: 5
  observedGeneration: 8
```

I don't want to comment on whether this particular configuration of the PodDisruptionBudget is OK, but it prevents me from verifying the bug, because it's not possible to perform the reproducer and check the expected behaviour. Moreover, performing a machine config update would also be blocked by this.

>>> ASSIGNED
@Martin Since https://bugzilla.redhat.com/show_bug.cgi?id=1935065 was fixed, nodes are disallowed from being drained if a mon is down. In this test, were any mons down before you tried to drain the node? Or was everything perfectly healthy before you tried to drain the node? If everything was perfectly healthy and the mon PDB disallows a drain, then there would be an issue on 1935065 to follow up on, rather than this one.

To simulate a mon going down to see if mon failover is triggered, you should be able to set the mon deployment replicas to 0 so the mon pod will stop. Did you try that?
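For example, a rough sketch of the full simulate-and-restore cycle (mon-b is just an example; the ~10 minute wait matches the operator's default mon failover timeout, if I recall the defaults correctly):

```
# Stop one mon without touching the node
oc -n openshift-storage scale deployment/rook-ceph-mon-b --replicas=0
# Wait longer than the mon failover timeout (default is about 10 minutes)
sleep 700
# With failover enabled, a new mon deployment (e.g. rook-ceph-mon-f) should appear
# and the old one should eventually be removed; with failover disabled nothing changes
oc -n openshift-storage get deployments -l app=rook-ceph-mon
# Bring the original mon back when done (only needed if failover did not replace it)
oc -n openshift-storage scale deployment/rook-ceph-mon-b --replicas=1
```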
(In reply to Travis Nielsen from comment #36)
> @Martin Since https://bugzilla.redhat.com/show_bug.cgi?id=1935065 was fixed,
> nodes are disallowed from being drained if a mon is down. In this test, were
> any mons down before you tried to drain the node? Or was everything
> perfectly healthy before you tried to drain the node? If everything was
> perfectly healthy and the mon PDB disallows a drain, then there would be an
> issue on 1935065 to follow up on, rather than this one.

Looking in my log, I see that I:
- installed the storage cluster CR via the OCP Console
- stored 1 GB on a cephfs-based PV, 1 GB on an rbd PV
- ran all the net splits we have in the test plan overnight
- reproduced BZ 1946592

That said, the cluster was healthy when I started retesting the use case from this BZ. I will retest on a fresh cluster. I agree that there is something else going on, which may require a separate bug.

> To simulate a mon going down to see if mon failover is triggered, you should
> be able to set the mon deployment replicas to 0 so the mon pod will stop.
> Did you try that?

No, I haven't. I could retry.
Here is an observation from when I retested this on a fresh cluster, where only the reproducer steps from this bug were performed.

Retested with
=============

OCP 4.7.0-0.nightly-2021-04-21-093400
LSO 4.7.0-202104090228.p0
OCS 4.7.0-353.ci

Observations before the drain
=============================

I selected node compute-2 to be drained.

```
$ oc get pods -n openshift-storage -o wide | grep compute-2 | cut -d' ' -f1 | egrep "(mon|osd)"
rook-ceph-mon-b-5b95bc7c75-zjslp
rook-ceph-osd-3-778966b55f-45rx9
rook-ceph-osd-prepare-ocs-deviceset-arbiter-1-data-08d9jk-hg8qj

$ oc get PodDisruptionBudget -n openshift-storage
NAME                                              MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
rook-ceph-mds-ocs-storagecluster-cephfilesystem   1               N/A               1                     78m
rook-ceph-mon-pdb                                 N/A             1                 1                     75m
rook-ceph-osd                                     N/A             1                 1                     75m

$ oc get PodDisruptionBudget/rook-ceph-mon-pdb -n openshift-storage -o yaml | tail -11
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: rook-ceph-mon
status:
  currentHealthy: 5
  desiredHealthy: 4
  disruptionsAllowed: 1
  expectedPods: 5
  observedGeneration: 1
```

After the drain
===============

I see that the affected mon-b was respawned on another node, while keeping the quorum at 5:

```
$ oc get pods -n openshift-storage -o wide | grep mon
rook-ceph-mon-a-69445d5c76-nzn45   2/2   Running   0   135m   10.128.4.194   compute-0         <none>   <none>
rook-ceph-mon-b-5b95bc7c75-4b8wg   2/2   Running   0   52m    10.131.0.112   compute-1         <none>   <none>
rook-ceph-mon-c-65f9cf6744-k82j5   2/2   Running   0   134m   10.130.2.13    compute-5         <none>   <none>
rook-ceph-mon-d-78b4db8c7c-t4kbv   2/2   Running   0   134m   10.129.2.234   compute-3         <none>   <none>
rook-ceph-mon-e-75f646fdb8-rkxwr   2/2   Running   0   134m   10.128.0.52    control-plane-2   <none>   <none>

$ oc get PodDisruptionBudget -n openshift-storage
NAME                                              MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
rook-ceph-mds-ocs-storagecluster-cephfilesystem   1               N/A               1                     138m
rook-ceph-mon-pdb                                 N/A             1                 1                     136m
rook-ceph-osd-zone-data-b                         N/A             0                 0                     55m
```

After the uncordon
==================

```
$ oc get pods -n openshift-storage -o wide | grep mon
rook-ceph-mon-a-69445d5c76-nzn45   2/2   Running   0   142m   10.128.4.194   compute-0         <none>   <none>
rook-ceph-mon-b-5b95bc7c75-4b8wg   2/2   Running   0   59m    10.131.0.112   compute-1         <none>   <none>
rook-ceph-mon-c-65f9cf6744-k82j5   2/2   Running   0   141m   10.130.2.13    compute-5         <none>   <none>
rook-ceph-mon-d-78b4db8c7c-t4kbv   2/2   Running   0   141m   10.129.2.234   compute-3         <none>   <none>
rook-ceph-mon-e-75f646fdb8-rkxwr   2/2   Running   0   141m   10.128.0.52    control-plane-2   <none>   <none>

$ oc get PodDisruptionBudget -n openshift-storage
NAME                                              MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
rook-ceph-mds-ocs-storagecluster-cephfilesystem   1               N/A               1                     142m
rook-ceph-mon-pdb                                 N/A             1                 1                     140m
rook-ceph-osd                                     N/A             1                 1                     8s
```

Mon status
==========

According to `ceph_mon_quorum_status`, no mon was out of quorum.

This looks like the mon failover works.

Conclusion
==========

The observation conflicts with the description from comment 34.
(In reply to Martin Bukatovic from comment #37)
> > To simulate a mon going down to see if mon failover is triggered, you should
> > be able to set the mon deployment replicas to 0 so the mon pod will stop.
> > Did you try that?
> 
> No, I haven't. I could retry.

When I do that, after retrying the reproducer as noted in comment 38, I see that scaling down a given mon doesn't create a new mon deployment to replace the scaled-down mon.

```
$ oc scale --replicas 0 deployment/rook-ceph-mon-b -n openshift-storage
deployment.apps/rook-ceph-mon-b scaled

$ oc get pods -n openshift-storage | grep mon
rook-ceph-mon-a-69445d5c76-nzn45   2/2   Running   0   3h53m
rook-ceph-mon-c-65f9cf6744-k82j5   2/2   Running   0   3h53m
rook-ceph-mon-d-78b4db8c7c-t4kbv   2/2   Running   0   3h53m
rook-ceph-mon-e-75f646fdb8-rkxwr   2/2   Running   0   3h52m
```
(In reply to Martin Bukatovic from comment #38)
> 
> Mon status
> ==========
> 
> According to `ceph_mon_quorum_status`, no mon was out of quorum.
> 
> This looks like the mon failover works.
> 
> Conclusion
> ==========
> 
> The observation conflicts with the description from comment 34.

Your observations sound consistent with comment 34. To clarify:
- An existing mon may still be moved to another node if another node in the same zone is available during the node drain. This is not "mon failover", but is just the same mon moving to another node.
- "Mon failover" in comment 34 means that a mon with a new name (such as mon-f) will be created, and the down mon (e.g. mon-b) would be destroyed by the operator after it is replaced.

Moving back to on_qa so you can mark as verified if you agree.
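A quick way to tell the two apart (standard oc commands with the default Rook labels):

```
# A mon that merely moved keeps its name: same rook-ceph-mon-b deployment, different node
oc -n openshift-storage get pods -l app=rook-ceph-mon -o wide
# A failed-over mon shows up as a deployment with a new letter (e.g. rook-ceph-mon-f),
# and the replaced mon's deployment is removed by the operator
oc -n openshift-storage get deployments -l app=rook-ceph-mon
```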
Thanks for the clarification. Marking as VERIFIED based on the previous evidence (comment 38) and the clarification from Travis. Issues I ran into (comment 37) will be rechecked and, if necessary, a separate BZ will be reported, since they are not related to this bug.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2041