Bug 2267965
Summary: | [4.16][RDR][Hub Recovery] Failover remains stuck with WaitForReadiness | |
---|---|---|---
Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Karolin Seeger <kseeger>
Component: | odf-dr | Assignee: | Benamar Mekhissi <bmekhiss>
odf-dr sub component: | ramen | QA Contact: | Aman Agrawal <amagrawa>
Status: | CLOSED ERRATA | Docs Contact: |
Severity: | urgent | |
Priority: | unspecified | CC: | amagrawa, bmekhiss, gshanmug, kramdoss, kseeger, muagarwa, rtalur
Version: | 4.15 | |
Target Milestone: | --- | |
Target Release: | ODF 4.16.0 | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | 4.16.0-86 | Doc Type: | No Doc Update
Doc Text: | | Story Points: | ---
Clone Of: | 2264767 | Environment: |
Last Closed: | 2024-07-17 13:14:59 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 2264767 | |
Bug Blocks: | | |
Description
Karolin Seeger
2024-03-05 18:31:41 UTC
Tested with the following versions:

- ceph version 18.2.1-188.el9cp (b1ae9c989e2f41dcfec0e680c11d1d9465b1db0e) reef (stable)
- OCP 4.16.0-0.nightly-2024-05-23-173505
- ACM 2.11.0-DOWNSTREAM-2024-05-23-15-16-26
- MCE 2.6.0-104
- ODF 4.16.0-108.stable
- Gitops v1.12.3
- Platform: VMware

When the steps to reproduce were repeated, failover was successful for all RBD and CephFS workloads, and the VolumeReplicationClass was successfully restored on the surviving managed cluster (which is needed for RBD):

```
oc get volumereplicationclass -A
NAME                                    PROVISIONER
rbd-volumereplicationclass-1625360775   openshift-storage.rbd.csi.ceph.com
rbd-volumereplicationclass-473128587    openshift-storage.rbd.csi.ceph.com
```

DRPC from the new hub:

```
busybox-workloads-101   rbd-sub-busybox101-placement-1-drpc      4h51m   amagrawa-c1-28my   amagrawa-c2-my28   Failover   FailedOver   Cleaning Up   2024-05-30T15:26:02Z   False
busybox-workloads-13    cephfs-sub-busybox13-placement-1-drpc    4h51m   amagrawa-c1-28my   amagrawa-c2-my28   Failover   FailedOver   Cleaning Up   2024-05-30T15:27:33Z   False
busybox-workloads-16    cephfs-sub-busybox16-placement-1-drpc    4h51m   amagrawa-c1-28my   amagrawa-c2-my28   Failover   FailedOver   Cleaning Up   2024-05-30T15:27:26Z   False
busybox-workloads-18    cnv-sub-busybox18-placement-1-drpc       4h51m   amagrawa-c1-28my   amagrawa-c2-my28   Failover   FailedOver   Cleaning Up   2024-05-30T16:52:14Z   False
busybox-workloads-5     rbd-sub-busybox5-placement-1-drpc        4h51m   amagrawa-c1-28my   amagrawa-c2-my28   Failover   FailedOver   Cleaning Up   2024-05-30T15:25:50Z   False
busybox-workloads-6     rbd-sub-busybox6-placement-1-drpc        4h51m   amagrawa-c1-28my   amagrawa-c2-my28   Failover   FailedOver   Cleaning Up   2024-05-30T15:25:56Z   False
busybox-workloads-7     rbd-sub-busybox7-placement-1-drpc        4h51m   amagrawa-c1-28my   amagrawa-c2-my28   Failover   FailedOver   Cleaning Up   2024-05-30T15:25:34Z   False
openshift-gitops        cephfs-appset-busybox12-placement-drpc   4h51m   amagrawa-c1-28my   amagrawa-c2-my28   Failover   FailedOver   Cleaning Up   2024-05-30T15:28:14Z   False
openshift-gitops        cephfs-appset-busybox9-placement-drpc    4h51m   amagrawa-c1-28my   amagrawa-c2-my28   Failover   FailedOver   Cleaning Up   2024-05-30T15:28:19Z   False
openshift-gitops        cnv-appset-busybox17-placement-drpc      4h51m   amagrawa-c1-28my   amagrawa-c2-my28   Failover   FailedOver   Cleaning Up   2024-05-30T16:52:23Z   False
openshift-gitops        rbd-appset-busybox1-placement-drpc       4h51m   amagrawa-c1-28my   amagrawa-c2-my28   Failover   FailedOver   Cleaning Up   2024-05-30T15:26:08Z   False
openshift-gitops        rbd-appset-busybox100-placement-drpc     4h51m   amagrawa-c1-28my   amagrawa-c2-my28   Failover   FailedOver   Cleaning Up   2024-05-30T15:26:14Z   False
openshift-gitops        rbd-appset-busybox2-placement-drpc       4h51m   amagrawa-c1-28my   amagrawa-c2-my28   Failover   FailedOver   Cleaning Up   2024-05-30T15:26:20Z   False
openshift-gitops        rbd-appset-busybox3-placement-drpc       4h51m   amagrawa-c1-28my   amagrawa-c2-my28   Failover   FailedOver   Cleaning Up   2024-05-30T15:26:49Z   False
```

Since the primary managed cluster is still down, PROGRESSION reports "Cleaning Up", which is expected.

Failover was also successful on both CNV (RBD) workloads, cnv-sub-busybox18-placement-1-drpc and cnv-appset-busybox17-placement-drpc, of the subscription and appset (pull model) types respectively, and the data written into the VM was successfully restored after failover completed.

The fix for this BZ LGTM. Therefore I am marking this bug as verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.16.0 security, enhancement & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:4591

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.
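The verification above hinges on the PROGRESSION column: "Cleaning Up" with the primary down is expected, while the original bug left workloads stuck in "WaitForReadiness". A minimal sketch of how such output could be checked mechanically; the sample rows, the `/tmp` path, and the condensed single-token progression values are hypothetical stand-ins for live `oc get drpc -A -o wide` output, not part of this bug's record:

```shell
# Hypothetical sample standing in for `oc get drpc -A -o wide` output.
# Fields (assumed): NAMESPACE NAME AGE PREFERREDCLUSTER FAILOVERCLUSTER
# DESIREDSTATE CURRENTSTATE PROGRESSION. Progression values are condensed
# to single tokens here so awk's default field splitting works.
cat > /tmp/drpc-sample.txt <<'EOF'
busybox-workloads-5 rbd-sub-busybox5-placement-1-drpc 4h51m c1 c2 Failover FailedOver CleaningUp
busybox-workloads-13 cephfs-sub-busybox13-placement-1-drpc 4h51m c1 c2 Failover FailedOver WaitForReadiness
EOF

# Print namespace, name, and progression for any workload still stuck
# in WaitForReadiness (the symptom this bug describes).
awk '$8 == "WaitForReadiness" { print $1, $2, $8 }' /tmp/drpc-sample.txt
```

An empty result from the filter is what a verified run should produce; any surviving row names a workload whose failover has not progressed past readiness.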