Description of problem (please be detailed as possible and provide log snippets):
Relocation does not complete after a failover: the DRPC progression does not move past WaitingForResourceRestore. This might be because enough time is not allowed for all the data to be transferred after the relocate is initiated.

Version of all relevant components (if applicable):
ODF 4.14

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?

Is there any workaround available to the best of your knowledge?

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?

Can this issue be reproducible?
Yes

Can this issue be reproduced from the UI?

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Deploy an application based on an Appset, perform failover
2. Perform Relocate
3. DRPC progression is stuck at WaitingForResourceRestore
The application has multiple PVCs/PVs.

Actual results:
Output of DRPC status:

NAME                            AGE   PREFERREDCLUSTER   FAILOVERCLUSTER   DESIREDSTATE   CURRENTSTATE   PROGRESSION                 START TIME             DURATION   PEER READY
dev-qe-volsync-placement-drpc   21h   prsurve-dev-1      prsurve-dev-2     Relocate       Relocating     WaitingForResourceRestore   2023-09-27T08:08:56Z              False

Expected results:
Relocate should complete.

Additional info:
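For reference, a minimal sketch of how the relocate in step 2 can be triggered and the stuck progression observed from the hub cluster. The DRPC name is taken from the output above; the openshift-gitops namespace is an assumption and should be adjusted to the environment:

# Trigger the relocate by setting the DRPC action (assumed namespace: openshift-gitops)
$ oc patch drpc dev-qe-volsync-placement-drpc -n openshift-gitops \
    --type merge -p '{"spec":{"action":"Relocate"}}'

# Watch the DRPC; in the failing case PROGRESSION stays at WaitingForResourceRestore
$ oc get drpc dev-qe-volsync-placement-drpc -n openshift-gitops -w

# Or inspect the progression field directly
$ oc get drpc dev-qe-volsync-placement-drpc -n openshift-gitops \
    -o jsonpath='{.status.progression}{"\n"}'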
Marking this bug as a blocker as this is a basic positive workflow.
Details are in the PR: https://github.com/RamenDR/ramen/pull/1087
VERIFICATION COMMENTS
=====================

Steps to Reproduce:
-------------------
1. Deploy application based on Appset, perform failover
2. Perform Relocate
3. DRPC progression stuck at WaitingForResourceRestore

Verification O/P after performing relocate:
-------------------------------------------

O/P on the new primary :-

$ pods
NAME                       READY   STATUS    RESTARTS   AGE
dd-io-1-5dbcfccf76-rcvfb   1/1     Running   0          65m
dd-io-2-684fc84b64-m7clh   1/1     Running   0          65m
dd-io-3-68bf99586d-vpfjs   1/1     Running   0          65m
dd-io-4-757c8d8b7b-45rt9   1/1     Running   0          65m
dd-io-5-74768ccf84-9lqg5   1/1     Running   0          65m
dd-io-6-68d5769c76-cjrcd   1/1     Running   0          65m
dd-io-7-67d87688b4-r7wfv   1/1     Running   0          65m

$ pvc
NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                AGE
dd-io-pvc-1   Bound    pvc-9d8128d6-cbb5-41f3-84df-4e1559db5036   117Gi      RWO            ocs-storagecluster-cephfs   9h
dd-io-pvc-2   Bound    pvc-6078b90b-170b-4e0b-8985-2f5edff84b4b   143Gi      RWO            ocs-storagecluster-cephfs   9h
dd-io-pvc-3   Bound    pvc-c1134ea7-3ee0-4924-a08e-d1dd14a52932   134Gi      RWO            ocs-storagecluster-cephfs   9h
dd-io-pvc-4   Bound    pvc-b9370af1-bac4-4990-90fd-54fda2ee56d2   106Gi      RWO            ocs-storagecluster-cephfs   9h
dd-io-pvc-5   Bound    pvc-3a7e387f-71c4-4543-971d-6e59d4e837b8   115Gi      RWO            ocs-storagecluster-cephfs   9h
dd-io-pvc-6   Bound    pvc-eac2029d-93c4-483f-877d-1f2d282c5a9a   129Gi      RWO            ocs-storagecluster-cephfs   9h
dd-io-pvc-7   Bound    pvc-f0bfbdbd-09e8-428a-bda2-2bdc12825965   149Gi      RWO            ocs-storagecluster-cephfs   9h

$ oc get vrg
NAME                                 DESIREDSTATE   CURRENTSTATE
busybox-1-cephfs-c1-placement-drpc   primary        Primary

$ oc get replicationsources.volsync.backube
NAME          SOURCE        LAST SYNC              DURATION          NEXT SYNC
dd-io-pvc-1   dd-io-pvc-1   2023-11-02T14:41:25Z   1m25.883277818s   2023-11-02T14:50:00Z
dd-io-pvc-2   dd-io-pvc-2   2023-11-02T14:41:12Z   1m12.190563634s   2023-11-02T14:50:00Z
dd-io-pvc-3   dd-io-pvc-3   2023-11-02T14:41:08Z   1m8.781483412s    2023-11-02T14:50:00Z
dd-io-pvc-4   dd-io-pvc-4   2023-11-02T14:41:09Z   1m9.538156757s    2023-11-02T14:50:00Z
dd-io-pvc-5   dd-io-pvc-5   2023-11-02T14:41:16Z   1m16.026772249s   2023-11-02T14:50:00Z
dd-io-pvc-6   dd-io-pvc-6   2023-11-02T14:41:15Z   1m15.76896785s    2023-11-02T14:50:00Z
dd-io-pvc-7   dd-io-pvc-7   2023-11-02T14:41:05Z   1m5.365572687s    2023-11-02T14:50:00Z

On HUB :-

$ oc get drpc busybox-1-cephfs-c1-placement-drpc -o yaml
apiVersion: ramendr.openshift.io/v1alpha1
kind: DRPlacementControl
metadata:
  annotations:
    drplacementcontrol.ramendr.openshift.io/app-namespace: appset-busybox-1-cephfs-c1
    drplacementcontrol.ramendr.openshift.io/last-app-deployment-cluster: kmanohar-clu2
  creationTimestamp: "2023-11-02T04:43:51Z"
  finalizers:
  - drpc.ramendr.openshift.io/finalizer
  generation: 2
  labels:
    cluster.open-cluster-management.io/backup: resource
  name: busybox-1-cephfs-c1-placement-drpc
  namespace: openshift-gitops
  ownerReferences:
  - apiVersion: cluster.open-cluster-management.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Placement
    name: busybox-1-cephfs-c1-placement
    uid: 8064c466-f386-4bfb-b339-f12cded188a1
  resourceVersion: "22589985"
  uid: 0fbb0832-f2de-4dc4-b7bd-802d3f6b9113
spec:
  action: Relocate
  drPolicyRef:
    apiVersion: ramendr.openshift.io/v1alpha1
    kind: DRPolicy
    name: dr-policy-10m
  placementRef:
    apiVersion: cluster.open-cluster-management.io/v1beta1
    kind: Placement
    name: busybox-1-cephfs-c1-placement
    namespace: openshift-gitops
  preferredCluster: kmanohar-clu2
  pvcSelector:
    matchLabels:
      appname: busybox-cephfs
status:
  actionDuration: 3m27.162568163s
  actionStartTime: "2023-11-02T13:33:07Z"
  conditions:
  - lastTransitionTime: "2023-11-02T13:36:04Z"
    message: Completed
    observedGeneration: 2
    reason: Relocated
    status: "True"
    type: Available
  - lastTransitionTime: "2023-11-02T13:36:34Z"
    message: Ready
    observedGeneration: 2
    reason: Success
    status: "True"
    type: PeerReady
  lastGroupSyncDuration: 1m25.883277818s
  lastGroupSyncTime: "2023-11-02T14:41:05Z"
  lastUpdateTime: "2023-11-02T14:41:48Z"
  phase: Relocated
  preferredDecision:
    clusterName: kmanohar-clu1
    clusterNamespace: kmanohar-clu1
  progression: Completed
  resourceConditions:
    conditions:
    - lastTransitionTime: "2023-11-02T13:36:04Z"
      message: All VolSync PVCs are ready
      observedGeneration: 4
      reason: Ready
      status: "True"
      type: DataReady
    - lastTransitionTime: "2023-11-02T13:37:47Z"
      message: All VolSync PVCs are protected
      observedGeneration: 4
      reason: DataProtected
      status: "True"
      type: DataProtected
    - lastTransitionTime: "2023-11-02T13:36:04Z"
      message: Restored cluster data
      observedGeneration: 4
      reason: Restored
      status: "True"
      type: ClusterDataReady
    - lastTransitionTime: "2023-11-02T13:37:47Z"
      message: All VolSync PVCs are protected
      observedGeneration: 4
      reason: DataProtected
      status: "True"
      type: ClusterDataProtected
    resourceMeta:
      generation: 4
      kind: VolumeReplicationGroup
      name: busybox-1-cephfs-c1-placement-drpc
      namespace: appset-busybox-1-cephfs-c1
      protectedpvcs:
      - dd-io-pvc-4
      - dd-io-pvc-1
      - dd-io-pvc-5
      - dd-io-pvc-7
      - dd-io-pvc-3
      - dd-io-pvc-2
      - dd-io-pvc-6

$ oc get drpc busybox-1-cephfs-c1-placement-drpc -o yaml | grep lastGroupSyncTime
  lastGroupSyncTime: "2023-11-02T14:41:05Z"

$ oc get drpc
NAME                                          AGE    PREFERREDCLUSTER   FAILOVERCLUSTER   DESIREDSTATE   CURRENTSTATE
busybox-1-c1-placement-drpc                   16d    kmanohar-clu1                                       Deployed
busybox-1-cephfs-c1-placement-drpc            12h    kmanohar-clu1                        Relocate       Relocated
busybox-1-cephfs-c2-placement-drpc            5h4m   kmanohar-clu2                                       Deployed
busybox-2-cephfs-c1-creation-placement-drpc   12h    kmanohar-clu1                                       Deployed

Verified On
-----------
ODF Version - 4.14.0-150
OCP - 4.14.0-0.nightly-2023-10-15-164249
Submariner - 0.16.0(594788)
ACM - 2.9.0(2.9.0-DOWNSTREAM-2023-10-03-20-08-35)
Ceph version - ceph version 17.2.6-146.el9cp (1d01c2b30b5fd39787bb8804707c4b2e52e30137) quincy (stable)

Must gather for verification
----------------------------
C1  - http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/keerthana/bz-v/bz-CephFS/c1/
C2  - http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/keerthana/bz-v/bz-CephFS/c2/
HUB - http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/keerthana/bz-v/bz-CephFS/hub/
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.14.0 security, enhancement & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6832