Bug 2063596

Summary:	claim clusters from clusterpool throws errors
Product:	Red Hat Advanced Cluster Management for Kubernetes
Reporter:	Hui Chen <huichen>
Component:	Console
Assignee:	Kevin Cormier <kcormier>
Status:	CLOSED ERRATA
QA Contact:	Xiang Yin <xiyin>
Severity:	high
Docs Contact:	Christopher Dawson <cdawson>
Priority:	high
Version:	rhacm-2.5
CC:	daliu, dho, dhuynh, efried, huichen, kcormier, yuhe
Target Milestone:	---
Keywords:	Reopened
Target Release:	rhacm-2.5
Flags:	huichen: qe_test_coverage+, bot-tracker-sync: rhacm-2.5+, kcormier: needinfo-, efried: needinfo-, kcormier: needinfo-
Hardware:	All
OS:	All
Last Closed:	2022-06-09 02:09:30 UTC
Type:	Bug

Comment 1 daliu 2022-03-14 07:32:32 UTC

@efried 

It looks like the clusterpool had already assigned one clusterdeployment, and the clusterclaim status and clusterdeployment status changed to running after a few minutes.

However, the clusterclaim status shown before it reaches running is not right. Could you take a look?

Comment 2 Eric Fried 2022-03-14 14:12:10 UTC

The timeout is coming from the ACM side, not Hive. The ClusterDeployments aren't ready to be claimed yet, so the claim stays pending. This is working as designed from the Hive side. Details follow.

We recently changed the claim logic to only fulfill claims with *running* clusters [1] to mitigate problems with clusters failing to resume after they were claimed. I discussed this with @kcormier and @gbuchana last week, recommending that ACM take this into account, and take advantage of the other knobs we've put into place to improve the UX in this area, including:
- ClusterPool.Spec.RunningCount to maintain a subset of clusters Running so they can be claimed immediately
- ClusterPool.Spec.HibernationConfig.ResumeTimeout to make sure clusters that get stuck resuming are thrown out and replaced

[1] https://issues.redhat.com/browse/HIVE-1679
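
To illustrate the behavior described above, here is a minimal Python sketch of a pool that only hands *running* clusters to a claim. It is a toy model, not Hive's code: the `Cluster` and `Pool` classes, the state strings, the function names, and the timeout value are all assumptions made for illustration. `running_count` and `resume_timeout_s` only stand in for ClusterPool.Spec.RunningCount and ClusterPool.Spec.HibernationConfig.ResumeTimeout. The sketch drops a stuck cluster but does not model provisioning its replacement.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cluster:
    name: str
    state: str               # hypothetical states: "Running", "Resuming", "Hibernating"
    resuming_for_s: int = 0  # seconds spent in "Resuming" so far


@dataclass
class Pool:
    clusters: List[Cluster]
    running_count: int = 0        # stands in for ClusterPool.Spec.RunningCount
    resume_timeout_s: int = 1200  # stands in for ResumeTimeout; the value is arbitrary


def fulfill_claim(pool: Pool) -> Optional[Cluster]:
    """Hand out a Running cluster, or return None so the claim stays pending."""
    for cluster in pool.clusters:
        if cluster.state == "Running":
            pool.clusters.remove(cluster)
            return cluster
    return None


def reconcile(pool: Pool) -> List[str]:
    """Drop clusters stuck resuming, then wake clusters up to running_count.

    Returns the names of the clusters that were dropped.
    """
    stuck = [c for c in pool.clusters
             if c.state == "Resuming" and c.resuming_for_s > pool.resume_timeout_s]
    pool.clusters = [c for c in pool.clusters if c not in stuck]

    # Count clusters that are already awake or on their way there.
    awake = sum(1 for c in pool.clusters if c.state in ("Running", "Resuming"))
    for cluster in pool.clusters:
        if awake >= pool.running_count:
            break
        if cluster.state == "Hibernating":
            cluster.state = "Resuming"
            cluster.resuming_for_s = 0
            awake += 1
    return [c.name for c in stuck]
```

In this model, a pool whose clusters are all hibernating or resuming makes `fulfill_claim` return `None`, which corresponds to the pending claim seen in this bug. A consumer such as the ACM console would need to treat that state as "waiting" rather than as an error.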

Comment 3 daliu 2022-03-15 02:23:45 UTC

Thanks @efried 

@kcormier Please help enhance the UI logic here.

Comment 4 bot-tracker-sync 2022-03-21 22:21:44 UTC

G2Bsync 1073963158 comment 
 KevinFCormier Mon, 21 Mar 2022 14:24:24 UTC 
 G2Bsync This has been addressed through UI logic updates.

Comment 5 Hui Chen 2022-04-06 14:43:36 UTC

I think we can close this issue now, since the UI logic has already been changed.

Comment 9 errata-xmlrpc 2022-06-09 02:09:30 UTC

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: Red Hat Advanced Cluster Management 2.5 security updates, images, and bug fixes), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:4956