Bug 2211883 - [MDR] After zone failure(c1+h1 cluster) and hub recovery, c1 apps peer ready status is in "Unknown" state
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: odf-dr
Version: 4.13
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Benamar Mekhissi
QA Contact: krishnaram Karthick
URL:
Whiteboard:
Depends On:
Blocks: 2154341
 
Reported: 2023-06-02 13:19 UTC by Parikshith
Modified: 2023-08-09 17:00 UTC

Fixed In Version:
Doc Type: Known Issue
Doc Text:
After zone failure and hub recovery, the Peer Ready status of the subscription and appset applications in their disaster recovery placement control (DRPC) is occasionally shown as `Unknown`. This is a cosmetic issue: it does not impact the regular functionality of Ramen and is limited to the visual appearance of the DRPC output when viewed using the `oc` command.

Workaround: Use the YAML output to see the correct status:

----
$ oc get drpc -o yaml
----
Clone Of:
Environment:
Last Closed:
Embargoed:


Attachments (Terms of Use)

Description Parikshith 2023-06-02 13:19:44 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
After shutting down the zone hosting the c1 and h1 clusters and performing hub recovery to h2, the Peer Ready status of subscription and appset apps in their DRPC is "Unknown".

NAMESPACE          NAME                             AGE   PREFERREDCLUSTER   FAILOVERCLUSTER   DESIREDSTATE   CURRENTSTATE   PROGRESSION      START TIME             DURATION       PEER READY
b-sub-1            b-sub-1-placement-1-drpc         59m   pbyregow-c1        pbyregow-c2       Relocate                                                                             Unknown
b-sub-2            b-sub-2-placement-1-drpc         59m   pbyregow-c1        pbyregow-c2       Relocate                                                                             Unknown
cronjob-sub-1      cronjob-sub-1-placement-1-drpc   59m   pbyregow-c1        pbyregow-c2       Relocate                                                                             Unknown
openshift-gitops   b-app-1-placement-drpc           59m   pbyregow-c1        pbyregow-c2       Relocate                                                                             Unknown
openshift-gitops   b-app-2-placement-drpc           59m   pbyregow-c1        pbyregow-c2       Relocate                                                                             Unknown
openshift-gitops   cronjob-app-1-placement-drpc     59m   pbyregow-c1        pbyregow-c2       Relocate                                                                             Unknown
openshift-gitops   job-app-1-placement-drpc         59m   pbyregow-c1        pbyregow-c2       Relocate                                                                             Unknown
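The tabular output above can show `Unknown` even though the authoritative value is the `PeerReady` condition in the DRPC status, which the YAML output exposes. A minimal sketch of extracting that condition; the sample document below is hypothetical and its layout is assumed from the DRPC CRD's `status.conditions` list:

```shell
# On a live hub the status would come from:
#   oc get drpc -A -o yaml
# or, assuming kubectl JSONPath filter support:
#   oc get drpc -A -o jsonpath='{.items[*].status.conditions[?(@.type=="PeerReady")].status}'
# Here we use a hypothetical saved sample so the extraction can be shown standalone.
cat > /tmp/drpc-sample.yaml <<'EOF'
status:
  conditions:
  - lastTransitionTime: "2023-06-02T13:10:00Z"
    reason: Success
    status: "True"
    type: Available
  - lastTransitionTime: "2023-06-02T13:10:00Z"
    reason: Success
    status: "True"
    type: PeerReady
EOF

# Print the status of the PeerReady condition (keys are serialized
# alphabetically, so each status line precedes its type line).
awk '/^ *status:/{s=$2} /type: PeerReady/{print s; exit}' /tmp/drpc-sample.yaml
```

If the YAML shows `status: "True"` for the `PeerReady` condition while the table column reads `Unknown`, that confirms the issue is limited to the tabular display.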

Version of all relevant components (if applicable):
ocp: 4.13.0-0.nightly-2023-05-30-074322
odf/mco: 4.13.0-207
ACM: 2.7.4

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
no

Is there any workaround available to the best of your knowledge?
no

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
4

Is this issue reproducible?


Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Create 4 OCP clusters: 2 hubs and 2 managed clusters, plus one stretched RHCS cluster.
   Deploy the clusters such that:
	zone a: arbiter ceph node
	zone b: c1, h1, 3 ceph nodes
	zone c: c2, h2, 3 ceph nodes
2. Configure MDR and deploy applications (appset and subscription) on each managed cluster. Apply the DRPolicy to all apps.
3. Initiate a backup process so that the active and passive hubs are in sync.
4. Bring zone b down, i.e. c1, h1, and the 3 ceph nodes.
5. Initiate the restore process on h2.
6. Verify the restore succeeded on the new hub and that the DRPolicy on h2 is in Validated state.
7. Check the DRPC of the c1 apps.


Actual results:
Peer Ready status of c1 apps is Unknown

Expected results:
Peer Ready status of c1 apps should be True

Additional info:

Comment 4 Mudit Agarwal 2023-06-05 11:47:49 UTC
Not a 4.13 blocker, moving it out

Comment 5 Benamar Mekhissi 2023-06-12 01:31:44 UTC
PR is here: https://github.com/RamenDR/ramen/pull/920

