Description of problem (please be detailed as possible and provide log snippets):

After shutting down the zone hosting c1 and h1 and performing hub recovery to h2, the Peer Ready status of the subscription and appset apps in their DRPCs is "Unknown".

NAMESPACE          NAME                             AGE   PREFERREDCLUSTER   FAILOVERCLUSTER   DESIREDSTATE   CURRENTSTATE   PROGRESSION   START TIME   DURATION   PEER READY
b-sub-1            b-sub-1-placement-1-drpc         59m   pbyregow-c1        pbyregow-c2       Relocate                                                            Unknown
b-sub-2            b-sub-2-placement-1-drpc         59m   pbyregow-c1        pbyregow-c2       Relocate                                                            Unknown
cronjob-sub-1      cronjob-sub-1-placement-1-drpc   59m   pbyregow-c1        pbyregow-c2       Relocate                                                            Unknown
openshift-gitops   b-app-1-placement-drpc           59m   pbyregow-c1        pbyregow-c2       Relocate                                                            Unknown
openshift-gitops   b-app-2-placement-drpc           59m   pbyregow-c1        pbyregow-c2       Relocate                                                            Unknown
openshift-gitops   cronjob-app-1-placement-drpc     59m   pbyregow-c1        pbyregow-c2       Relocate                                                            Unknown
openshift-gitops   job-app-1-placement-drpc         59m   pbyregow-c1        pbyregow-c2       Relocate                                                            Unknown

Version of all relevant components (if applicable):
OCP: 4.13.0-0.nightly-2023-05-30-074322
ODF/MCO: 4.13.0-207
ACM: 2.7.4

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
No

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
4

Is this issue reproducible?

Can this issue be reproduced from the UI?

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Create 4 OCP clusters: 2 hubs and 2 managed clusters, plus one stretched RHCS cluster. Deploy the clusters so that
   zone a: arbiter ceph node
   zone b: c1, h1, 3 ceph nodes
   zone c: c2, h2, 3 ceph nodes
2. Configure MDR and deploy applications (appset and subscription) on each managed cluster. Apply the DRPolicy to all apps.
3. Initiate a backup so that the active and passive hubs are in sync.
4. Bring zone b down, i.e. c1, h1, and the 3 ceph nodes.
5. Initiate the restore process on h2.
6. Restore succeeds on the new hub; the DRPolicy on h2 is in Validated state.
7. Check the DRPC of the c1 apps (see the command sketch at the end of this description).

Actual results:
Peer Ready status of the c1 apps is Unknown.

Expected results:
Peer Ready status of the c1 apps should be True.

Additional info:
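For reference, a minimal sketch of how the PEER READY column can be checked directly from the DRPC status conditions on the hub (the namespace/DRPC names below are just the ones from this setup; substitute as needed):

$ oc get drpc -A -o wide
$ oc get drpc b-sub-1-placement-1-drpc -n b-sub-1 \
    -o jsonpath='{.status.conditions[?(@.type=="PeerReady")]}{"\n"}'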
Not a 4.13 blocker, moving it out
PR is here: https://github.com/RamenDR/ramen/pull/920
PR is merged, moving it to ON_QA.
Tested versions:
----------------
OCP - 4.14.0-0.nightly-2023-10-08-220853
ODF - 4.14.0-146.stable
ACM - 2.9.0-180

Test Steps:
------------
1. Create 4 OCP clusters: 2 hubs and 2 managed clusters, plus one stretched RHCS cluster. Deploy the clusters so that
   zone a: arbiter ceph node
   zone b: c1, h1, 3 ceph nodes
   zone c: c2, h2, 3 ceph nodes
2. Configure MDR and deploy applications (appset and subscription) on each managed cluster. Apply the DRPolicy to all apps.
3. Initiate a backup so that the active and passive hubs are in sync.
4. Bring zone b down, i.e. c1, h1, and the 3 ceph nodes.
5. Initiate the restore process on h2.
6. Restore succeeds on the new hub; the DRPolicy on h2 is in Validated state.
7. Check the DRPC of the c1 apps.

Validation:
------------
After hub recovery, the Peer Ready status of the apps is displayed as True/False, not Unknown (a jsonpath sweep for this check is sketched after the DRPC output below).

DRPC O/P:
---------
sraghave:~$ oc get drpc -A -o wide
NAMESPACE          NAME                             AGE   PREFERREDCLUSTER   FAILOVERCLUSTER    DESIREDSTATE   CURRENTSTATE   PROGRESSION   START TIME             DURATION            PEER READY
cephfs1            cephfs1-placement-3-drpc         18h   sraghave-c1-oct    sraghave-c2-oct    Failover       FailedOver     Completed     2023-10-19T19:00:01Z   24m4.988864425s     True
cephfs2            cephfs2-placement-3-drpc         18h   sraghave-c2-oct                                      Deployed       Completed                                                True
daemonset1         daemonset1-placement-3-drpc      18h   sraghave-c1-oct    sraghave-c2-oct    Failover       FailedOver     Completed     2023-10-19T19:00:24Z   22m42.043971686s    True
deployment1        deployment1-placement-3-drpc     16h   sraghave-c1-oct                                      Deployed       Completed                                                True
openshift-gitops   cephfs-appset1-placement-drpc    18h   sraghave-c1-oct    sraghave-c2-oct    Failover       FailedOver     Completed                                                True
openshift-gitops   cephfs-placement-drpc            18h   sraghave-c2-oct                                      Deployed       Completed                                                True
openshift-gitops   cephfs1-app-placement-drpc       18h   sraghave-c1-oct                                      Deployed       Completed                                                True
openshift-gitops   cephfs2-app-placement-drpc       18h   sraghave-c2-oct                                      Deployed       Completed                                                True
openshift-gitops   deployment1-app-placement-drpc   18h   sraghave-c1-oct                                      Deployed       Completed                                                True
openshift-gitops   deployment2-app-placement-drpc   18h   sraghave-c2-oct                                      Deployed       Completed                                                True
openshift-gitops   hello-appsets1-placement-drpc    18h   sraghave-c1-oct    sraghave-c2-oct    Failover       FailedOver     Completed                                                True
openshift-gitops   hello1-app-placement-drpc        18h   sraghave-c1-oct                                      Deployed       Completed                                                True
openshift-gitops   hello2-app-placement-drpc        18h   sraghave-c2-oct                                      Deployed       Completed                                                True
openshift-gitops   helloworld-placement-drpc        18h   sraghave-c2-oct                                      Deployed       Completed                                                True
openshift-gitops   rbd-appset1-placement-drpc       18h   sraghave-c1-oct    sraghave-c2-oct    Failover       FailedOver     Completed                                                True
openshift-gitops   rbd-placement-drpc               18h   sraghave-c2-oct                                      Deployed       Completed                                                True
openshift-gitops   rbd-sample-placement-drpc        18h   sraghave-c1-oct    sraghave-c2-oct    Failover       FailedOver     Cleaning Up   2023-10-20T08:31:45Z                       False
openshift-gitops   rbd2-app-placement-drpc          18h   sraghave-c2-oct                                      Deployed       Completed                                                True

With the above observations, moving the BZ to Verified.
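Side note: rather than scanning the wide output by eye, the PeerReady condition can be swept across all DRPCs on the hub. A minimal sketch, assuming the standard PeerReady condition type in DRPC status:

$ oc get drpc -A -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{" PeerReady="}{.status.conditions[?(@.type=="PeerReady")].status}{"\n"}{end}'

Any DRPC printing an empty or Unknown PeerReady value here would indicate the pre-fix behaviour.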
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days