2103653 – managed cluster is in "unknown" state for 120 mins after OADP restore

Summary: managed cluster is in "unknown" state for 120 mins after OADP restore

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Advanced Cluster Management for Kubernetes
Classification:	Red Hat
Component:	Cluster Lifecycle
Sub Component:
Version:	rhacm-2.5
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	rhacm-2.5.2
Assignee:	Le Yang
QA Contact:	Hui Chen
Docs Contact:	Christopher Dawson
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2022-07-04 11:33 UTC by Alexander Daimon
Modified:	2022-09-13 20:06 UTC
CC List:	5 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2022-09-13 20:06:28 UTC
Target Upstream Version:
Embargoed:
Flags:	bot-tracker-sync: rhacm-2.5.z+

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	stolostron backlog issues 24017	0	None	None	None	2022-07-04 18:23:49 UTC
Red Hat Product Errata	RHSA-2022:6507	0	None	None	None	2022-09-13 20:06:41 UTC

Description Alexander Daimon 2022-07-04 11:33:05 UTC

Description of the problem: After a successful Hub cluster recovery using OADP, a restored managed cluster falls into the "unknown" state and does not return to "ready" until about 120 minutes after the restore.

Release version: ACM 2.5.0
OCP version: 4.10.18

Steps to reproduce:
1. Deploy a Hub cluster.
2. Deploy a managed cluster (Hive API).
3. Install the OADP operator as described in the docs, not manually.
4. Create the DataProtectionApplication and BackupStorageLocation resources.
5. Create a Schedule resource and observe that Backup objects are created periodically.
6. Destroy the Hub cluster.
7. Deploy a new Hub cluster, giving it the same base URL (DNS) as the destroyed one.
8. Install the OADP operator as described in the docs, not manually.
9. Create the DataProtectionApplication and BackupStorageLocation resources.
10. Create a Restore object and observe that a restore operation starts.
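
For readers reproducing steps 5 and 10, the following is a minimal sketch of what the Schedule and Restore objects might look like, built as Python dictionaries and printed as JSON. It assumes the plain Velero kinds (velero.io/v1 Schedule and Restore); the report does not say whether those or the ACM cluster backup operator's own kinds were used. The resource names, the "openshift-adp" namespace, the cron expression, the TTL, and the backup name are all placeholders, not values taken from this bug.

```python
import json


def velero_schedule(name, namespace, cron, ttl="72h0m0s"):
    """Build a Velero Schedule manifest (assumed kind; see the note above).

    `cron` is a standard cron expression; `ttl` is how long each Backup
    created by the schedule is retained.
    """
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Schedule",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"schedule": cron, "template": {"ttl": ttl}},
    }


def velero_restore(name, namespace, backup_name):
    """Build a Velero Restore manifest that restores one named Backup."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Restore",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"backupName": backup_name},
    }


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    schedule = velero_schedule("hub-backup", "openshift-adp", "0 */2 * * *")
    restore = velero_restore("hub-restore", "openshift-adp", "hub-backup-example")
    print(json.dumps(schedule, indent=2))
    print(json.dumps(restore, indent=2))
```

The JSON output can be saved to a file and applied with `oc apply -f <file>`, since the API server accepts JSON manifests as well as YAML.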

Actual results:
1. The restored managed cluster is initially in the "ready" state.
2. After 6 minutes, the managed cluster's state changes to "unknown".
3. After an additional 114 minutes, the managed cluster's state changes back to "ready".
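
The "120 mins" in the summary is the time from the restore until the cluster is back in "ready"; the cluster is actually "unknown" for 114 of those minutes. A small sketch of that arithmetic follows, assuming minute 0 is the moment the restored cluster first shows "ready" (the report does not give exact timestamps). The helper name `time_in_states` is hypothetical.

```python
def time_in_states(transitions):
    """Sum the minutes spent in each state.

    `transitions` is a list of (minute, state) pairs sorted by minute.
    The final state is open-ended, so its duration is not counted.
    """
    totals = {}
    for (start, state), (end, _next_state) in zip(transitions, transitions[1:]):
        totals[state] = totals.get(state, 0) + (end - start)
    return totals


# Timeline taken from the "Actual results" list above.
reported = [(0, "ready"), (6, "unknown"), (6 + 114, "ready")]

totals = time_in_states(reported)
recovered_at = reported[-1][0]

print(totals)        # {'ready': 6, 'unknown': 114}
print(recovered_at)  # 120
```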

Expected results:
1. The restored managed cluster remains in the "ready" state after the restore, without falling back to the "unknown" state.

Additional info: 
"The same URL" means that the source Hub cluster, before it was torn down, had the same URL as the target Hub cluster on which the recovery is performed. This is a disaster recovery (DR) scenario.
All managed clusters are installed using the Hive API.

Comment 1 bot-tracker-sync 2022-09-06 23:27:56 UTC

G2Bsync 1238616821 comment 
 thuyn-581 Tue, 06 Sep 2022 20:27:32 UTC 
 G2BSync -
Validated on 2.5.2-FC3.

Comment 6 errata-xmlrpc 2022-09-13 20:06:28 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Critical: Red Hat Advanced Cluster Management 2.5.2 security fixes and bug fixes), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:6507
