Bug 2120760

Summary: [RDR] Relocate operation takes avg. ~20mins just for the Pods to show up in the ContainerCreating state on the desired cluster
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation Reporter: Aman Agrawal <amagrawa>
Component: odf-dr Assignee: rakesh <rgowdege>
odf-dr sub component: ramen QA Contact: krishnaram Karthick <kramdoss>
Status: ASSIGNED --- Docs Contact:
Severity: high    
Priority: unspecified CC: bmekhiss, ebenahar, kramdoss, kseeger, muagarwa, odf-bz-bot, rtalur, sheggodu, srangana
Version: 4.11   
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Comment 4 Elad 2022-11-13 12:26:14 UTC
Proposing as a blocker for 4.12.0, as the current behavior leads to application downtime and a very high RTO, which isn't acceptable.

Comment 12 Mudit Agarwal 2022-11-22 07:42:16 UTC
Given that it is a relocate operation and not failover, according to engineering this does not qualify as a blocker.
Also, in order to identify the actual bottleneck we need more enhancements, which are not possible for 4.12.

Shyam,

Do we need to add it as a known issue in 4.12?

Comment 13 Shyamsundar 2022-11-23 12:29:28 UTC
(In reply to Mudit Agarwal from comment #12)
> Given that it is a relocate operation and not failover, according to
> engineering this does not qualify as a blocker.
> Also, in order to identify the actual bottleneck we need more enhancements
> which are not possible for 4.12
> 
> Shyam,
> 
> Do we need to add it as a known issue in 4.12?

I do not think so, as the time taken would depend on the amount of data etc. left to transfer. In other words, apart from the lack of a metric showing how much data is left to transfer, there is no known issue here.

I have requested that a note be added to the relocate prerequisites in the DR document: before relocating, check that the last sync time is close to the current time, to speed up the process: https://docs.google.com/document/d/1N6IbYv6SGkmAsj6UHGrZWNnUD3qHq3nsPFH4obCBf6M/edit?disco=AAAAkUOiR84
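
The pre-relocation check on the last sync time could be scripted roughly as follows. This is only a sketch: the function name sync_is_recent, the 10-minute threshold, and the assumption that the last sync time is available as an RFC 3339 timestamp string are illustrative and are not taken from Ramen or the DR document.

```python
from datetime import datetime, timedelta, timezone


def sync_is_recent(last_sync: str, now: datetime, max_lag: timedelta) -> bool:
    """Return True if last_sync is no older than max_lag relative to now.

    last_sync is assumed to be an RFC 3339 UTC timestamp such as
    '2022-11-23T12:25:00Z' (hypothetical input format).
    """
    # Older Python versions do not accept a trailing 'Z' in fromisoformat,
    # so rewrite it as an explicit UTC offset first.
    synced_at = datetime.fromisoformat(last_sync.replace("Z", "+00:00"))
    return now - synced_at <= max_lag


if __name__ == "__main__":
    # Fixed "current time" so the example is reproducible.
    now = datetime(2022, 11, 23, 12, 30, tzinfo=timezone.utc)
    threshold = timedelta(minutes=10)  # illustrative threshold, not a documented value

    # Synced 5 minutes ago: little data should be left to transfer.
    print(sync_is_recent("2022-11-23T12:25:00Z", now, threshold))
    # Synced 90 minutes ago: relocation may have to wait for a large final sync.
    print(sync_is_recent("2022-11-23T11:00:00Z", now, threshold))
```

If the check returns False, waiting for the next sync to complete before starting the relocate should reduce the amount of data left to transfer.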