Bug 2238682 - [Tracker ACM-7479][RDR] Applications are not getting DR protected
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: odf-dr
Version: 4.13
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ODF 4.14.0
Assignee: Nir Soffer
QA Contact: Sidhant Agrawal
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-09-13 07:22 UTC by Sidhant Agrawal
Modified: 2023-11-08 18:54 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-11-08 18:54:25 UTC
Embargoed:


Links
System                  ID                       Private  Priority  Status  Summary                                     Last Updated
Github                  RamenDR ramen pull 1065  0        None      Draft   Use kind,group,version instead of resource  2023-09-13 12:30:20 UTC
Red Hat Issue Tracker   ACM-7479                 0        None      None    None                                        2023-09-15 08:01:52 UTC
Red Hat Product Errata  RHSA-2023:6832           0        None      None    None                                        2023-11-08 18:54:49 UTC

Description Sidhant Agrawal 2023-09-13 07:22:57 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
On an RDR setup based on OCP 4.14 and ODF 4.13, applications are not getting DR protected.
VR and VRG resources are not created, and DRPC instance creation fails with the error "failed to retrieve VRGs from clusters".

The ramen-hub-operator pod shows the following error messages:

```
2023-09-13T06:54:35.592Z	INFO	controllers.DRPlacementControl	controllers/drplacementcontrol_controller.go:1719	Retrieved ApplicationSets	{"count": 1}
2023-09-13T06:54:35.592Z	INFO	controllers.DRPlacementControl	controllers/drplacementcontrol_controller.go:1738	Placement busybox-sub-placement-1 does not belong to any ApplicationSet. Defaulting the dest namespace to busybox-sub
2023-09-13T06:54:35.592Z	INFO	MCV	util/mcv_util.go:223	MCV Conditions: [{Processing False 0 2023-09-13 06:53:54 +0000 UTC ResourceTypeInvalid failed to get resource with err: the server doesn't have a resource type "VolumeReplicationGroup"}]	{"resourceName": "busybox-sub-placement-1-drpc"}
2023-09-13T06:54:35.593Z	INFO	controllers.DRPlacementControl	controllers/drplacementcontrol_controller.go:1354	failed to retrieve VRG from sagrawal-nc1. err (getManagedClusterResource results: ManagedClusterView is not ready (reason: ResourceTypeInvalid))	{"DRPC": "busybox-sub/busybox-sub-placement-1-drpc", "rid": "14fb7408-7acd-41ef-bc7e-b78528e1170d"}
2023-09-13T06:54:35.593Z	INFO	MCV	util/mcv_util.go:223	MCV Conditions: [{Processing False 0 2023-09-13 06:53:54 +0000 UTC ResourceTypeInvalid failed to get resource with err: the server doesn't have a resource type "VolumeReplicationGroup"}]	{"resourceName": "busybox-sub-placement-1-drpc"}
2023-09-13T06:54:35.593Z	INFO	controllers.DRPlacementControl	controllers/drplacementcontrol_controller.go:1354	failed to retrieve VRG from sagrawal-nc2. err (getManagedClusterResource results: ManagedClusterView is not ready (reason: ResourceTypeInvalid))	{"DRPC": "busybox-sub/busybox-sub-placement-1-drpc", "rid": "14fb7408-7acd-41ef-bc7e-b78528e1170d"}
2023-09-13T06:54:35.593Z	INFO	controllers.DRPlacementControl	controllers/drplacementcontrol_controller.go:1455	Updating DRPC status	{"DRPC": "busybox-sub/busybox-sub-placement-1-drpc", "rid": "14fb7408-7acd-41ef-bc7e-b78528e1170d"}
2023-09-13T06:54:35.593Z	INFO	controllers.DRPlacementControl	controllers/drplacementcontrol_controller.go:1719	Retrieved ApplicationSets	{"count": 1}
2023-09-13T06:54:35.593Z	INFO	controllers.DRPlacementControl	controllers/drplacementcontrol_controller.go:1738	Placement busybox-sub-placement-1 does not belong to any ApplicationSet. Defaulting the dest namespace to busybox-sub
2023-09-13T06:54:35.593Z	INFO	controllers.DRPlacementControl	controllers/drplacementcontrol_controller.go:1588	Found ClusterDecision	{"ClsDedicision": [{"clusterName":"sagrawal-nc1","reason":""}]}
2023-09-13T06:54:35.593Z	INFO	MCV	util/mcv_util.go:223	MCV Conditions: [{Processing False 0 2023-09-13 06:53:54 +0000 UTC ResourceTypeInvalid failed to get resource with err: the server doesn't have a resource type "VolumeReplicationGroup"}]	{"resourceName": "busybox-sub-placement-1-drpc"}
2023-09-13T06:54:35.593Z	INFO	controllers.DRPlacementControl	controllers/drplacementcontrol_controller.go:1504	Failed to get VRG from managed cluster	{"DRPC": "busybox-sub/busybox-sub-placement-1-drpc", "rid": "14fb7408-7acd-41ef-bc7e-b78528e1170d", "errMsg": "getManagedClusterResource results: ManagedClusterView is not ready (reason: ResourceTypeInvalid)", "errMsgVerbose": "ManagedClusterView is not ready (reason: ResourceTypeInvalid)\ngetManagedClusterResource results\ngithub.com/ramendr/ramen/controllers/util.ManagedClusterViewGetterImpl.GetResource\n\t/remote-source/app/controllers/util/mcv_util.go:278\ngithub.com/ramendr/ramen/controllers/util.ManagedClusterViewGetterImpl.getManagedClusterResource\n\t/remote-source/app/controllers/util/mcv_util.go:225\ngithub.com/ramendr/ramen/controllers/util.ManagedClusterViewGetterImpl.GetVRGFromManagedCluster\n\t/remote-source/app/controllers/util/mcv_util.go:81\ngithub.com/ramendr/ramen/controllers.(*DRPlacementControlReconciler).updateResourceCondition\n\t/remote-source/app/controllers/drplacementcontrol_controller.go:1501\ngithub.com/ramendr/ramen/controllers.(*DRPlacementControlReconciler).updateDRPCStatus\n\t/remote-source/app/controllers/drplacementcontrol_controller.go:1464\ngithub.com/ramendr/ramen/controllers.(*DRPlacementControlReconciler).Reconcile\n\t/remote-source/app/controllers/drplacementcontrol_controller.go:645\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.5/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.5/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.5/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.5/pkg/internal/controller/controller.go:235\nruntime.goexit\n\t/usr/lib/golang/src/runtime/asm_amd64.s:1594"}
2023-09-13T06:54:35.593Z	INFO	controllers.DRPlacementControl	controllers/drplacementcontrol_controller.go:1474	No need to update DRPC Status	{"DRPC": "busybox-sub/busybox-sub-placement-1-drpc", "rid": "14fb7408-7acd-41ef-bc7e-b78528e1170d"}
2023-09-13T06:54:35.593Z	INFO	controllers.DRPlacementControl	controllers/drplacementcontrol_controller.go:647	Exiting reconcile loop	{"DRPC": "busybox-sub/busybox-sub-placement-1-drpc", "rid": "14fb7408-7acd-41ef-bc7e-b78528e1170d"}
2023-09-13T06:54:35.593Z	ERROR	controller/controller.go:329	Reconciler error	{"controller": "drplacementcontrol", "controllerGroup": "ramendr.openshift.io", "controllerKind": "DRPlacementControl", "DRPlacementControl": {"name":"busybox-sub-placement-1-drpc","namespace":"busybox-sub"}, "namespace": "busybox-sub", "name": "busybox-sub-placement-1-drpc", "reconcileID": "41c0b440-cc09-4411-baa8-de7b974697b2", "error": "failed to create DRPC instance (failed to retrieve VRGs from clusters) and (<nil>)"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.5/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.5/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.5/pkg/internal/controller/controller.go:235
```
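
The log above repeats the same ManagedClusterView failure for each managed cluster. A minimal shell sketch for summarizing those failures per cluster and reason follows. It assumes the log line format shown above; the function name `summarize_vrg_failures` is hypothetical and not part of any tool.

```shell
# Summarize "failed to retrieve VRG from <cluster>" messages per cluster and reason.
# Assumes the ramen-hub-operator log format shown in this report.
# summarize_vrg_failures is a hypothetical helper name.
# Reads log lines on stdin and prints "<cluster> <reason> <count>" per pair.
summarize_vrg_failures() {
    sed -n 's/.*failed to retrieve VRG from \([^ ]*\)\. err (.*(reason: \([A-Za-z]*\))).*/\1 \2/p' |
        sort | uniq -c | awk '{ print $2, $3, $1 }'
}
```

The function reads from standard input, so the ramen-hub-operator pod log can be piped into it directly. Lines that do not match the pattern are ignored.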


Version of all relevant components (if applicable):
OCP: 4.14.0-ec.4
ODF: 4.13.3-2
(ceph version 17.2.6-100.el9cp (ea4e3ef8df2cf26540aae06479df031dcfc80343) quincy (stable))
ACM: 2.9.0-132 (2.9.0-DOWNSTREAM-2023-09-11-15-47-23)
Submariner: 0.16.0 (brew.registry.redhat.io/rh-osbs/iib:558637)
VolSync: 0.7.4

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Yes

Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
3/3 on same setup

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Deploy an RDR setup using OCP 4.14
2. Deploy a DR workload and assign a DR policy via the UI
3. Observe the DRPC status

NAMESPACE          NAME                           AGE   PREFERREDCLUSTER   FAILOVERCLUSTER   DESIREDSTATE   CURRENTSTATE   PROGRESSION   START TIME   DURATION   PEER READY
busybox-sub        busybox-sub-placement-1-drpc   19m   sagrawal-nc1
openshift-gitops   busybox-1-placement-drpc       25m   sagrawal-nc1
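
Step 3 can also be checked without reading the table by eye. The sketch below lists DRPCs whose current state is empty, as in the output above. It assumes that the CURRENTSTATE column is backed by `.status.phase` of the DRPlacementControl resource; the function name `list_unprotected_drpcs` is hypothetical.

```shell
# Print "namespace/name" for each DRPC whose phase is empty.
# Input: one DRPC per line as "namespace/name<TAB>phase".
# list_unprotected_drpcs is a hypothetical helper name.
list_unprotected_drpcs() {
    awk -F '\t' '$1 != "" && $2 == "" { print $1 }'
}

# Intended usage on the hub cluster (assumes CURRENTSTATE maps to .status.phase):
# oc get drpc -A -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}' | list_unprotected_drpcs
```

An empty result would indicate that every DRPC reports a state; in this report both DRPCs would be listed.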

Actual results:
Applications are not getting DR protected

Expected results:
Applications should get DR protected

Additional info:

From hub cluster:
----------------
$ oc get csv -n openshift-operators
NAME                                          DISPLAY                         VERSION        REPLACES                                      PHASE
odf-multicluster-orchestrator.v4.13.3-rhodf   ODF Multicluster Orchestrator   4.13.3-rhodf   odf-multicluster-orchestrator.v4.13.2-rhodf   Succeeded
odr-hub-operator.v4.13.3-rhodf                Openshift DR Hub Operator       4.13.3-rhodf   odr-hub-operator.v4.13.2-rhodf                Succeeded
openshift-gitops-operator.v1.9.1              Red Hat OpenShift GitOps        1.9.1          openshift-gitops-operator.v1.9.0              Succeeded

$ oc get drpolicy
NAME             AGE
odr-policy-10m   21m
odr-policy-5m    122m

$ oc get drpolicy odr-policy-5m -o jsonpath='{.status.conditions[].reason}{"\n"}'
Succeeded

$ oc get drpolicy odr-policy-10m -o jsonpath='{.status.conditions[].reason}{"\n"}'
Succeeded

$ oc get drclusters
NAME           AGE
sagrawal-nc1   112m
sagrawal-nc2   112m

$ oc get drpc -A -o wide
NAMESPACE          NAME                           AGE   PREFERREDCLUSTER   FAILOVERCLUSTER   DESIREDSTATE   CURRENTSTATE   PROGRESSION   START TIME   DURATION   PEER READY
busybox-sub        busybox-sub-placement-1-drpc   22m   sagrawal-nc1
openshift-gitops   busybox-1-placement-drpc       28m   sagrawal-nc1


C1:
$ oc get csv -n openshift-storage
NAME                                    DISPLAY                         VERSION        REPLACES                                PHASE
mcg-operator.v4.13.3-rhodf              NooBaa Operator                 4.13.3-rhodf   mcg-operator.v4.13.2-rhodf              Succeeded
ocs-operator.v4.13.3-rhodf              OpenShift Container Storage     4.13.3-rhodf   ocs-operator.v4.13.2-rhodf              Succeeded
odf-csi-addons-operator.v4.13.3-rhodf   CSI Addons                      4.13.3-rhodf   odf-csi-addons-operator.v4.13.2-rhodf   Succeeded
odf-operator.v4.13.3-rhodf              OpenShift Data Foundation       4.13.3-rhodf   odf-operator.v4.13.2-rhodf              Succeeded
odr-cluster-operator.v4.13.3-rhodf      Openshift DR Cluster Operator   4.13.3-rhodf   odr-cluster-operator.v4.13.2-rhodf      Succeeded
volsync-product.v0.7.4                  VolSync                         0.7.4          volsync-product.v0.7.3                  Succeeded

C2:
$ oc get csv -n openshift-storage
NAME                                    DISPLAY                         VERSION        REPLACES                                PHASE
mcg-operator.v4.13.3-rhodf              NooBaa Operator                 4.13.3-rhodf   mcg-operator.v4.13.2-rhodf              Succeeded
ocs-operator.v4.13.3-rhodf              OpenShift Container Storage     4.13.3-rhodf   ocs-operator.v4.13.2-rhodf              Succeeded
odf-csi-addons-operator.v4.13.3-rhodf   CSI Addons                      4.13.3-rhodf   odf-csi-addons-operator.v4.13.2-rhodf   Succeeded
odf-operator.v4.13.3-rhodf              OpenShift Data Foundation       4.13.3-rhodf   odf-operator.v4.13.2-rhodf              Succeeded
odr-cluster-operator.v4.13.3-rhodf      Openshift DR Cluster Operator   4.13.3-rhodf   odr-cluster-operator.v4.13.2-rhodf      Succeeded
volsync-product.v0.7.4                  VolSync                         0.7.4          volsync-product.v0.7.3                  Succeeded

Mirroring status on both managed clusters:
$ oc get cephblockpool ocs-storagecluster-cephblockpool -n openshift-storage -o jsonpath='{.status.mirroringStatus.summary}{"\n"}'
{"daemon_health":"OK","health":"OK","image_health":"OK","states":{}}


Resources on managed cluster C1 (sagrawal-nc1):
```
$ oc get pvc,vr,vrg -n busybox-1
NAME                                   STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
persistentvolumeclaim/busybox-pvc-1    Bound    pvc-49ee2f9e-09b1-4006-848b-ce1b47a08771   94Gi       RWO            ocs-storagecluster-ceph-rbd   32m
persistentvolumeclaim/busybox-pvc-10   Bound    pvc-dad322bc-723a-4d14-ac75-9832b4e7cc32   87Gi       RWO            ocs-storagecluster-ceph-rbd   32m
persistentvolumeclaim/busybox-pvc-11   Bound    pvc-dd0b16a5-e201-4378-96a2-733a5f9e688c   33Gi       RWO            ocs-storagecluster-ceph-rbd   32m
persistentvolumeclaim/busybox-pvc-12   Bound    pvc-04bf1eae-85ab-42d4-9428-7348dc281ef6   147Gi      RWO            ocs-storagecluster-ceph-rbd   32m
persistentvolumeclaim/busybox-pvc-13   Bound    pvc-b22fca85-c1c3-4edd-ab0f-1b7c8bb60497   77Gi       RWO            ocs-storagecluster-ceph-rbd   32m
persistentvolumeclaim/busybox-pvc-14   Bound    pvc-8b1d7875-f110-4f74-ace9-7fa5cdb33e6a   70Gi       RWO            ocs-storagecluster-ceph-rbd   32m
persistentvolumeclaim/busybox-pvc-15   Bound    pvc-0f6be1b2-df9b-4eb2-90d5-515545685fbe   131Gi      RWO            ocs-storagecluster-ceph-rbd   32m
persistentvolumeclaim/busybox-pvc-16   Bound    pvc-54e8af82-d41e-4b65-b536-be6bc4c89d5d   127Gi      RWO            ocs-storagecluster-ceph-rbd   32m
persistentvolumeclaim/busybox-pvc-17   Bound    pvc-a9e1390a-0197-43d1-bef9-0f0d3aec885f   58Gi       RWO            ocs-storagecluster-ceph-rbd   32m
persistentvolumeclaim/busybox-pvc-18   Bound    pvc-b785bc8d-aee1-4ead-a268-639b14042516   123Gi      RWO            ocs-storagecluster-ceph-rbd   32m
persistentvolumeclaim/busybox-pvc-19   Bound    pvc-1a9ed09a-27d1-4adc-8d2b-8a4eed8aa839   61Gi       RWO            ocs-storagecluster-ceph-rbd   32m
persistentvolumeclaim/busybox-pvc-2    Bound    pvc-230148b5-ef10-4009-b5bc-2785e294777d   44Gi       RWO            ocs-storagecluster-ceph-rbd   32m
persistentvolumeclaim/busybox-pvc-20   Bound    pvc-e455ffa9-4c60-4552-86cf-2f1a94a6a3cf   33Gi       RWO            ocs-storagecluster-ceph-rbd   32m
persistentvolumeclaim/busybox-pvc-3    Bound    pvc-207fb442-86f8-4632-b427-9ceb4811e30d   76Gi       RWO            ocs-storagecluster-ceph-rbd   32m
persistentvolumeclaim/busybox-pvc-4    Bound    pvc-14bae48f-57e7-4c95-bd36-c7d8f0555471   144Gi      RWO            ocs-storagecluster-ceph-rbd   32m
persistentvolumeclaim/busybox-pvc-5    Bound    pvc-e686c152-44da-4087-9592-09e1e4063be9   107Gi      RWO            ocs-storagecluster-ceph-rbd   32m
persistentvolumeclaim/busybox-pvc-6    Bound    pvc-556586d6-e1ca-4008-adab-d7b010a48608   123Gi      RWO            ocs-storagecluster-ceph-rbd   32m
persistentvolumeclaim/busybox-pvc-7    Bound    pvc-c073de3c-e888-4ca1-a070-6015de226478   90Gi       RWO            ocs-storagecluster-ceph-rbd   32m
persistentvolumeclaim/busybox-pvc-8    Bound    pvc-ba31f88b-4441-4198-a067-34153bcc9664   91Gi       RWO            ocs-storagecluster-ceph-rbd   32m
persistentvolumeclaim/busybox-pvc-9    Bound    pvc-1246d233-01d8-4df4-bf1c-cf45baed3ba9   111Gi      RWO            ocs-storagecluster-ceph-rbd   32m


$ oc get pvc,vr,vrg -n busybox-sub
NAME                                   STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
persistentvolumeclaim/busybox-pvc-41   Bound    pvc-1bd9ea3b-300d-4cac-aa4b-262de57079e2   42Gi       RWO            ocs-storagecluster-ceph-rbd   25m
persistentvolumeclaim/busybox-pvc-42   Bound    pvc-8faab08b-445f-459e-8408-ad304653bd84   81Gi       RWO            ocs-storagecluster-ceph-rbd   25m
persistentvolumeclaim/busybox-pvc-43   Bound    pvc-a3adda2b-853d-45f2-840e-e08ab2e405a4   28Gi       RWO            ocs-storagecluster-ceph-rbd   25m
persistentvolumeclaim/busybox-pvc-44   Bound    pvc-a92b05a9-bcde-4744-95bb-0ed175fb9234   118Gi      RWO            ocs-storagecluster-ceph-rbd   25m
persistentvolumeclaim/busybox-pvc-45   Bound    pvc-d2dff2ea-aaed-405c-858d-a9850128b560   19Gi       RWO            ocs-storagecluster-ceph-rbd   25m
persistentvolumeclaim/busybox-pvc-46   Bound    pvc-e71de134-5d2a-4165-9813-f82c05b4e8a6   129Gi      RWO            ocs-storagecluster-ceph-rbd   25m
persistentvolumeclaim/busybox-pvc-47   Bound    pvc-60c4f90c-8887-46c1-af1a-a949690f9d29   43Gi       RWO            ocs-storagecluster-ceph-rbd   25m
persistentvolumeclaim/busybox-pvc-48   Bound    pvc-2d1ca40f-6450-4e2c-b8c5-a3556061a10d   57Gi       RWO            ocs-storagecluster-ceph-rbd   25m
persistentvolumeclaim/busybox-pvc-49   Bound    pvc-65a8b6b5-e8bd-4acb-b555-4909566ec1d8   89Gi       RWO            ocs-storagecluster-ceph-rbd   25m
persistentvolumeclaim/busybox-pvc-50   Bound    pvc-35a53ce4-79bf-4cd3-b85a-4181c68f8293   124Gi      RWO            ocs-storagecluster-ceph-rbd   25m
persistentvolumeclaim/busybox-pvc-51   Bound    pvc-b7c7a885-e77f-4af8-b820-a387478c76ef   95Gi       RWO            ocs-storagecluster-ceph-rbd   25m
persistentvolumeclaim/busybox-pvc-52   Bound    pvc-2ab50848-b7ff-4322-84dd-7478c2dadd39   129Gi      RWO            ocs-storagecluster-ceph-rbd   25m
persistentvolumeclaim/busybox-pvc-53   Bound    pvc-456bec5f-e5f5-45ad-b887-dcc2b9b5779e   51Gi       RWO            ocs-storagecluster-ceph-rbd   25m
persistentvolumeclaim/busybox-pvc-54   Bound    pvc-4ffe97c8-4f9e-4ed2-b227-c5c88a9afe32   30Gi       RWO            ocs-storagecluster-ceph-rbd   25m
persistentvolumeclaim/busybox-pvc-55   Bound    pvc-5c6362b5-f659-46ad-b95a-791498a2b600   102Gi      RWO            ocs-storagecluster-ceph-rbd   25m
persistentvolumeclaim/busybox-pvc-56   Bound    pvc-31a80187-6bd5-4b2e-8c6f-b1e5b456a341   40Gi       RWO            ocs-storagecluster-ceph-rbd   25m
persistentvolumeclaim/busybox-pvc-57   Bound    pvc-93742198-4387-4413-b115-ef678afaa4df   146Gi      RWO            ocs-storagecluster-ceph-rbd   25m
persistentvolumeclaim/busybox-pvc-58   Bound    pvc-84aefbf6-d9e7-4f89-a933-9dfbc8ef3cfb   63Gi       RWO            ocs-storagecluster-ceph-rbd   25m
persistentvolumeclaim/busybox-pvc-59   Bound    pvc-62c32376-e6f5-4442-96e5-3002f95a44c2   118Gi      RWO            ocs-storagecluster-ceph-rbd   25m
persistentvolumeclaim/busybox-pvc-60   Bound    pvc-fa370964-eecd-4abb-80a5-c5e18c9ea5e0   25Gi       RWO            ocs-storagecluster-ceph-rbd   25m
```

Comment 8 Nir Soffer 2023-09-13 13:14:04 UTC
Looks like this is a known ACM issue, tracked in:
https://issues.redhat.com/browse/ACM-7479
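
For illustration, the linked Ramen pull request is titled "Use kind,group,version instead of resource". The sketch below shows what a ManagedClusterView that identifies the VRG by kind, API group, and version could look like. It is not taken from the pull request or from ACM-7479: the helper name `mcv_manifest`, the `-vrg-mcv` name suffix, and the `v1alpha1` version are assumptions.

```shell
# Print a ManagedClusterView manifest that identifies the VRG by
# kind, apiGroup and version rather than by resource name.
# mcv_manifest is a hypothetical helper:
#   $1 = managed cluster namespace on the hub
#   $2 = VRG name
#   $3 = VRG namespace on the managed cluster
mcv_manifest() {
    cat <<EOF
apiVersion: view.open-cluster-management.io/v1beta1
kind: ManagedClusterView
metadata:
  name: $2-vrg-mcv
  namespace: $1
spec:
  scope:
    apiGroup: ramendr.openshift.io
    version: v1alpha1
    kind: VolumeReplicationGroup
    name: $2
    namespace: $3
EOF
}
```

For example, `mcv_manifest sagrawal-nc1 busybox-sub-placement-1-drpc busybox-sub` prints a manifest for the VRG named in the log above, which could then be reviewed before being applied on the hub.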

Comment 19 errata-xmlrpc 2023-11-08 18:54:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.14.0 security, enhancement & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6832

