Bug 2224325

Summary: [MDR] Not able to relocate STS based applications
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation Reporter: Parikshith <pbyregow>
Component: odf-dr Assignee: Shyamsundar <srangana>
odf-dr sub component: ramen QA Contact: avdhoot <asagare>
Status: ON_QA --- Docs Contact:
Severity: high    
Priority: high CC: hnallurv, muagarwa, odf-bz-bot, srangana
Version: 4.13 Flags: hnallurv: needinfo? (srangana)
Target Milestone: ---   
Target Release: ODF 4.14.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 4.14.0-96 Doc Type: Known Issue
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Parikshith 2023-07-20 12:23:37 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

I am facing an issue while relocating the logwriter (StatefulSet) app from the c2 to the c1 managed cluster on an MDR 4.13 setup. After initiating the relocate, I applied the workaround of manually deleting the terminating logwriter PVCs, as described here: https://bugzilla.redhat.com/show_bug.cgi?id=2118270#c27. However, the PVCs are still stuck in the Terminating state (the oc delete pvc command hangs).

oc get drpc -n logwritter-sub-1 -owide
NAME                                AGE    PREFERREDCLUSTER   FAILOVERCLUSTER   DESIREDSTATE   CURRENTSTATE   PROGRESSION                 START TIME             DURATION   PEER READY
logwritter-sub-1-placement-1-drpc   178m   pbyregow-clu1      pbyregow-clu2     Relocate       Relocating     WaitingForResourceRestore   2023-07-20T09:05:17Z


oc get pvc -n logwritter-sub-1
NAME                            STATUS        VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                           AGE
logwriter-cephfs-many           Terminating   pvc-04a6351c-7d9e-4849-9340-0145de801349   10Gi       RWX            ocs-external-storagecluster-cephfs     146m
logwriter-rbd-logwriter-rbd-0   Terminating   pvc-13323f74-5b0e-415e-b11a-6b1d42cbdf45   10Gi       RWO            ocs-external-storagecluster-ceph-rbd   146m
logwriter-rbd-logwriter-rbd-1   Terminating   pvc-a3bd7e24-b21e-42dc-9e40-236818c6ed7f   10Gi       RWO            ocs-external-storagecluster-ceph-rbd   146m
logwriter-rbd-logwriter-rbd-2   Terminating   pvc-69ba12c8-5292-4524-849c-e3a32715d905   10Gi       RWO            ocs-external-storagecluster-ceph-rbd   146m

oc get vrg logwritter-sub-1-placement-1-drpc -n logwritter-sub-1
NAME                                DESIREDSTATE   CURRENTSTATE
logwritter-sub-1-placement-1-drpc   secondary      Secondary
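
To spot the stuck PVCs quickly, the `oc get pvc` table above can be filtered on its STATUS column. The following is a minimal Python sketch and is not part of the original report. The function name `terminating_pvcs` is hypothetical, and the code assumes the default whitespace-separated table layout shown above, with NAME and STATUS as the first two columns. The sample rows are copied from the output in this report.

```python
def terminating_pvcs(table_text):
    """Return names of PVCs whose STATUS column reads 'Terminating'.

    Assumes the default `oc get pvc` table layout: a header line starting
    with NAME, then one row per PVC with NAME and STATUS as the first two
    whitespace-separated columns.
    """
    names = []
    for line in table_text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and anything too short to parse.
        if len(fields) < 2 or fields[0] == "NAME":
            continue
        if fields[1] == "Terminating":
            names.append(fields[0])
    return names


# Two rows taken from the `oc get pvc -n logwritter-sub-1` output above.
sample = """\
NAME                            STATUS        VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                           AGE
logwriter-cephfs-many           Terminating   pvc-04a6351c-7d9e-4849-9340-0145de801349   10Gi       RWX            ocs-external-storagecluster-cephfs     146m
logwriter-rbd-logwriter-rbd-0   Terminating   pvc-13323f74-5b0e-415e-b11a-6b1d42cbdf45   10Gi       RWO            ocs-external-storagecluster-ceph-rbd   146m
"""

print(terminating_pvcs(sample))
```

In practice the text would come from the saved output of `oc get pvc -n <namespace>`. Parsing the table is only a convenience for triaging logs like the ones pasted here.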

Version of all relevant components (if applicable):
ODF/MCO: 4.13.0
ACM: 2.8

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
no

Is there any workaround available to the best of your knowledge?
will be updated

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
3

Is this issue reproducible?


Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:
yes

Steps to Reproduce:
1. Configure an MDR cluster
2. Create a StatefulSet-based app (subscription/application based)
3. Fail over the app from c1 to c2
4. Initiate relocate and apply the known workaround: https://bugzilla.redhat.com/show_bug.cgi?id=2118270#c27
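
A PVC that stays in Terminating generally still carries one or more finalizers, so when the delete in step 4 hangs it can help to record which ones are set. The sketch below is an illustration only and is not taken from this report or from the linked workaround. It reads the JSON printed by `oc get pvc <name> -n <namespace> -o json` and reports `metadata.deletionTimestamp` and `metadata.finalizers`. The helper name `deletion_blockers` and the sample object are hypothetical.

```python
import json


def deletion_blockers(pvc_json):
    """Summarize why a PVC object may still be present after a delete.

    `pvc_json` is the text of a single PVC as printed by
    `oc get pvc <name> -o json`. Returns a dict with the deletion
    timestamp (None if no delete was requested) and the finalizers
    that are still set.
    """
    metadata = json.loads(pvc_json).get("metadata", {})
    return {
        "name": metadata.get("name"),
        "deletion_requested_at": metadata.get("deletionTimestamp"),
        "finalizers": metadata.get("finalizers", []),
    }


# Hypothetical sample object: the timestamp and finalizer are illustrative,
# not values captured from the cluster in this report.
sample = json.dumps({
    "metadata": {
        "name": "logwriter-rbd-logwriter-rbd-0",
        "namespace": "logwritter-sub-1",
        "deletionTimestamp": "2023-07-20T10:00:00Z",
        "finalizers": ["kubernetes.io/pvc-protection"],
    }
})

print(deletion_blockers(sample))
```

A non-empty finalizer list together with a deletion timestamp indicates that the delete was accepted but is waiting on whichever controller owns those finalizers. The sketch does not say which component that is in this bug.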


Actual results:
The STS application gets stuck in the Relocating state

Expected results:
STS application should be relocated

Additional info:

Comment 5 Shyamsundar 2023-07-20 12:36:32 UTC
WIP upstream PR: https://github.com/RamenDR/ramen/pull/995