Bug 2276204 - [RDR] [Hub recovery] [Co-situated] CURRENTSTATE and PROGRESSION takes longer to retain their status post hub recovery
Keywords:
Status: POST
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: odf-dr
Version: 4.15
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Benamar Mekhissi
QA Contact: krishnaram Karthick
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2024-04-20 13:32 UTC by Aman Agrawal
Modified: 2024-09-27 14:32 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:



Links
System: Github
ID: RamenDR ramen pull 1578
Private: 0
Priority: None
Status: open
Summary: Refactor and Enhance DRPC Reconciliation with DRPolicy Watching
Last Updated: 2024-09-26 19:39:06 UTC

Description Aman Agrawal 2024-04-20 13:32:41 UTC
Description of problem (please be as detailed as possible and provide log
snippets):


Version of all relevant components (if applicable):

ACM 2.10.1 GA'ed
MCE 2.5.2
ODF 4.15.1-1
ceph version 17.2.6-196.el9cp (cbbf2cfb549196ca18c0c9caff9124d83ed681a4) quincy (stable)
OCP 4.15.0-0.nightly-2024-04-07-120427
Submariner 0.17.0 GA'ed
VolSync 0.9.1

Platform: VMware


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?


Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
*****Active hub co-situated with primary managed cluster*****

1. When the primary managed cluster C1, which runs multiple workloads (RBD and CephFS) of both subscription and appset (pull model) types in different states (Deployed, FailedOver, Relocated), goes down along with the active hub during a site failure at site-1, perform hub recovery and move to the passive hub at site-2 (which is co-situated with the secondary managed cluster C2).
2. Ensure the available managed cluster C2 is successfully imported on the RHACM console of the passive hub and that DRPolicy gets validated. After DRPC is restored, fail over all the workloads to the available managed cluster C2.
3. When failover is successful, recover the down managed cluster C1 and ensure it is successfully cleaned up.
4. Let IOs continue for some time and configure another hub cluster at site-1 to perform hub recovery one more time.
5. Now relocate all the workloads to the managed cluster C1 (which was recovered post disaster).
6. Perform hub recovery by bringing down the current active hub at site-2 and the C1 cluster at site-1.
7. When moved to the new hub at site-1, ensure the available managed cluster C2 is successfully imported on the RHACM console of the passive hub and that DRPolicy gets validated.
8. When DRPC is restored, check the CURRENTSTATE and PROGRESSION of the workloads that were running on the down cluster C1 and monitor the time it takes to rebuild the state.
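
The monitoring in step 8 could be scripted roughly as follows. This is only a sketch under assumptions: the helper name `wait_for_drpc_state` is hypothetical, and it assumes the table layout shown in this report, where a row with both CURRENTSTATE and PROGRESSION populated has at least 9 whitespace-separated fields and a row with empty columns has fewer.

```shell
# Hypothetical helper: poll a table-printing command until every DRPC row
# shows both CURRENTSTATE and PROGRESSION, then report the elapsed seconds.
# Assumption: rows with empty state columns have fewer than 9 fields.
wait_for_drpc_state() {
  list_cmd=$1            # command or function that prints the DRPC table
  interval=${2:-5}       # seconds between polls
  timeout=${3:-1800}     # give up after this many seconds
  start=$(date +%s)
  while :; do
    # Count non-header, non-blank rows that are missing state columns.
    pending=$($list_cmd | awk '$1 != "NAMESPACE" && NF > 0 && NF < 9 { n++ } END { print n + 0 }')
    elapsed=$(( $(date +%s) - start ))
    if [ "$pending" -eq 0 ]; then
      echo "all states populated after ${elapsed}s"
      return 0
    fi
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out after ${elapsed}s with ${pending} DRPC(s) still empty"
      return 1
    fi
    sleep "$interval"
  done
}
```

Usage might look like `wait_for_drpc_state 'oc get drpc -A -o wide' 5`, assuming that is what the `drpc` alias used in this report expands to (shell aliases are not expanded inside the function, so the full command has to be passed).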


Actual results: CURRENTSTATE and PROGRESSION take a long time to regain their status after hub recovery


For step 7, DRPolicy was validated on the new hub at site-1 at around:
amanagrawal@Amans-MacBook-Pro ~ % date -u
Sat Apr 20 12:48:46 UTC 2024


This is the drpc status after that:


amanagrawal@Amans-MacBook-Pro ~ % while true; date -u; do drpc; echo "*****************************************"; sleep 5; done 
Sat Apr 20 12:50:18 UTC 2024
NAMESPACE              NAME                                     AGE     PREFERREDCLUSTER    FAILOVERCLUSTER     DESIREDSTATE   CURRENTSTATE   PROGRESSION   START TIME   DURATION   PEER READY
busybox-workloads-10   rbd-sub-busybox10-placement-1-drpc       4m44s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-11   rbd-sub-busybox11-placement-1-drpc       4m44s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-12   rbd-sub-busybox12-placement-1-drpc       4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-13   cephfs-sub-busybox13-placement-1-drpc    4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-14   cephfs-sub-busybox14-placement-1-drpc    4m42s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-15   cephfs-sub-busybox15-placement-1-drpc    4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-16   cephfs-sub-busybox16-placement-1-drpc    4m44s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-23   cephfs-sub-busybox23-placement-1-drpc    4m42s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-24   rbd-sub-busybox24-placement-1-drpc       4m42s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-27   cephfs-sub-busybox27-placement-1-drpc    4m43s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
busybox-workloads-28   rbd-sub-busybox28-placement-1-drpc       4m43s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
busybox-workloads-9    rbd-sub-busybox9-placement-1-drpc        4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       cephfs-appset-busybox21-placement-drpc   4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       cephfs-appset-busybox25-placement-drpc   4m43s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
openshift-gitops       cephfs-appset-busybox5-placement-drpc    4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       cephfs-appset-busybox6-placement-drpc    4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       cephfs-appset-busybox8-placement-drpc    4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       rbd-appset-busybox1-placement-drpc       4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       rbd-appset-busybox2-placement-drpc       4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       rbd-appset-busybox22-placement-drpc      4m43s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       rbd-appset-busybox26-placement-drpc      4m42s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
openshift-gitops       rbd-appset-busybox3-placement-drpc       4m42s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       rbd-appset-busybox4-placement-drpc       4m42s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
*****************************************
Sat Apr 20 12:50:25 UTC 2024
NAMESPACE              NAME                                     AGE     PREFERREDCLUSTER    FAILOVERCLUSTER     DESIREDSTATE   CURRENTSTATE   PROGRESSION   START TIME   DURATION   PEER READY
busybox-workloads-10   rbd-sub-busybox10-placement-1-drpc       4m51s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-11   rbd-sub-busybox11-placement-1-drpc       4m51s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-12   rbd-sub-busybox12-placement-1-drpc       4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-13   cephfs-sub-busybox13-placement-1-drpc    4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-14   cephfs-sub-busybox14-placement-1-drpc    4m49s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-15   cephfs-sub-busybox15-placement-1-drpc    4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-16   cephfs-sub-busybox16-placement-1-drpc    4m51s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-23   cephfs-sub-busybox23-placement-1-drpc    4m49s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-24   rbd-sub-busybox24-placement-1-drpc       4m49s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-27   cephfs-sub-busybox27-placement-1-drpc    4m50s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
busybox-workloads-28   rbd-sub-busybox28-placement-1-drpc       4m50s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
busybox-workloads-9    rbd-sub-busybox9-placement-1-drpc        4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       cephfs-appset-busybox21-placement-drpc   4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       cephfs-appset-busybox25-placement-drpc   4m50s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
openshift-gitops       cephfs-appset-busybox5-placement-drpc    4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       cephfs-appset-busybox6-placement-drpc    4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       cephfs-appset-busybox8-placement-drpc    4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       rbd-appset-busybox1-placement-drpc       4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       rbd-appset-busybox2-placement-drpc       4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       rbd-appset-busybox22-placement-drpc      4m50s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       rbd-appset-busybox26-placement-drpc      4m49s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
openshift-gitops       rbd-appset-busybox3-placement-drpc       4m49s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       rbd-appset-busybox4-placement-drpc       4m49s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
*****************************************




(intermediate samples omitted)



*****************************************
Sat Apr 20 12:55:02 UTC 2024
NAMESPACE              NAME                                     AGE     PREFERREDCLUSTER    FAILOVERCLUSTER     DESIREDSTATE   CURRENTSTATE   PROGRESSION   START TIME   DURATION   PEER READY
busybox-workloads-10   rbd-sub-busybox10-placement-1-drpc       9m28s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate       WaitForUser    Paused                                True
busybox-workloads-11   rbd-sub-busybox11-placement-1-drpc       9m28s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate       WaitForUser    Paused                                True
busybox-workloads-12   rbd-sub-busybox12-placement-1-drpc       9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate       WaitForUser    Paused                                True
busybox-workloads-13   cephfs-sub-busybox13-placement-1-drpc    9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-14   cephfs-sub-busybox14-placement-1-drpc    9m26s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-15   cephfs-sub-busybox15-placement-1-drpc    9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-16   cephfs-sub-busybox16-placement-1-drpc    9m28s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-23   cephfs-sub-busybox23-placement-1-drpc    9m26s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-24   rbd-sub-busybox24-placement-1-drpc       9m26s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
busybox-workloads-27   cephfs-sub-busybox27-placement-1-drpc    9m27s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
busybox-workloads-28   rbd-sub-busybox28-placement-1-drpc       9m27s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
busybox-workloads-9    rbd-sub-busybox9-placement-1-drpc        9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Failover       WaitForUser    Paused                                True
openshift-gitops       cephfs-appset-busybox21-placement-drpc   9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       cephfs-appset-busybox25-placement-drpc   9m27s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
openshift-gitops       cephfs-appset-busybox5-placement-drpc    9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       cephfs-appset-busybox6-placement-drpc    9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       cephfs-appset-busybox8-placement-drpc    9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       rbd-appset-busybox1-placement-drpc       9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       rbd-appset-busybox2-placement-drpc       9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       rbd-appset-busybox22-placement-drpc      9m27s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       rbd-appset-busybox26-placement-drpc      9m26s   amagrawa-c2-13apr   amagrawa-c1-13apr   Failover       WaitForUser    Paused                                True
openshift-gitops       rbd-appset-busybox3-placement-drpc       9m26s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
openshift-gitops       rbd-appset-busybox4-placement-drpc       9m26s   amagrawa-c1-13apr   amagrawa-c2-13apr   Relocate                                                            True
*****************************************


Even after more than 5 minutes, CURRENTSTATE and PROGRESSION are still empty for most of the workloads (15 of 23 in the 12:55:02 UTC sample). However, they can still be failed over via the UI to the secondary managed cluster C2, so functionality is not impacted.
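
For reference, the elapsed time between the `date -u` output taken around DRPolicy validation (12:48:46 UTC) and the last sample above (12:55:02 UTC) can be worked out as below. This is a small illustrative calculation only; the helper name `to_seconds` is hypothetical, and it assumes both timestamps fall on the same day.

```shell
# Convert an HH:MM:SS string to seconds since midnight.
to_seconds() {
  echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

validated=$(to_seconds "12:48:46")    # date -u output near DRPolicy validation
last_sample=$(to_seconds "12:55:02")  # timestamp of the last table above
elapsed=$(( last_sample - validated ))
echo "elapsed: ${elapsed}s ($(( elapsed / 60 ))m$(( elapsed % 60 ))s)"
# prints "elapsed: 376s (6m16s)"
```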



Expected results: CURRENTSTATE and PROGRESSION for inaccessible workloads should be rebuilt faster after hub recovery


Additional info:

