Bug 2097511 - [RDR] When sequential failover is performed, primary cluster took ~1hr:18mins to cleanup from the time Nodes were powered ON, health & image_health in mirroring status summary on both clusters remain in Warning and never recover [NEEDINFO]
Keywords:
Status: VERIFIED
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ceph
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Adam Kupczyk
QA Contact: Aman Agrawal
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-06-15 19:50 UTC by Aman Agrawal
Modified: 2023-08-09 16:37 UTC
CC List: 13 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
: 2116605 (view as bug list)
Environment:
Last Closed:
Embargoed:
srangana: needinfo-
sheggodu: needinfo? (akupczyk)
sheggodu: needinfo? (akupczyk)



Comment 6 Benamar Mekhissi 2022-06-16 15:00:35 UTC
@Madhu Rajanna, can you please take a look? While C1 was trying to become primary for 100 PVCs, ForcePromote was failing. It took 13 minutes, from ~15T17:50 to ~15T18:03, for all 100 to be marked as Primary. However, C2, the old primary, took much longer (more than an hour) to transition to secondary and then to delete all 100. During that time, all VRs were failing to resync.

Comment 22 Mudit Agarwal 2022-07-12 13:37:42 UTC
Aman, do we have an update on this?

Comment 24 Mudit Agarwal 2022-07-25 07:06:41 UTC
Hi Aman, any news on this? Is this still a blocker (specifically, a TP blocker)?

Comment 66 Elad 2023-06-19 06:05:45 UTC
Moving to 4.13.z for verification purposes

