Bug 2063596 - claim clusters from clusterpool throws errors
Summary: claim clusters from clusterpool throws errors
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Advanced Cluster Management for Kubernetes
Classification: Red Hat
Component: Console
Version: rhacm-2.5
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: ---
Target Release: rhacm-2.5
Assignee: Kevin Cormier
QA Contact: Xiang Yin
Docs Contact: Christopher Dawson
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-03-14 01:39 UTC by Hui Chen
Modified: 2022-06-09 02:09 UTC
CC List: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-06-09 02:09:30 UTC
Target Upstream Version:
Embargoed:
huichen: qe_test_coverage+
bot-tracker-sync: rhacm-2.5+
kcormier: needinfo-
efried: needinfo-
kcormier: needinfo-




Links
Github stolostron backlog: issues 20668 (last updated 2022-03-15 05:01:40 UTC)
Red Hat Product Errata: RHSA-2022:4956 (last updated 2022-06-09 02:09:41 UTC)

Comment 1 daliu 2022-03-14 07:32:32 UTC
@efried 

It looks like the ClusterPool has already assigned a ClusterDeployment, and both the ClusterClaim status and the ClusterDeployment status changed to running after a few minutes.

However, the ClusterClaim status shown before the cluster is running is not right. Could you help take a look?

Comment 2 Eric Fried 2022-03-14 14:12:10 UTC
The timeout is coming from the ACM side, not Hive. The ClusterDeployments aren't ready to be claimed yet, so the claim stays pending. This is working as designed from the Hive side.
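
(For illustration, a minimal sketch of the kind of ClusterClaim under discussion, assuming the Hive hive.openshift.io/v1 API; the names here are hypothetical. A claim like this stays pending until Hive assigns it a ClusterDeployment that is ready to be claimed.)

  # Minimal ClusterClaim sketch; name, namespace, and pool name are hypothetical.
  # The claim must be created in the same namespace as the ClusterPool it targets.
  apiVersion: hive.openshift.io/v1
  kind: ClusterClaim
  metadata:
    name: my-claim
    namespace: my-pool-namespace
  spec:
    clusterPoolName: my-pool   # the pool the claim draws a cluster from
    lifetime: 8h               # optional: release the cluster after this duration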

We recently changed the claim logic to only fulfill claims with *running* clusters [1] to mitigate problems with clusters failing to resume after they were claimed. I discussed this with @kcormier and @gbuchana last week, recommending that ACM take this into account and take advantage of the other knobs we've put in place to improve the UX in this area (see the sketch after footnote [1] below), including:
- ClusterPool.Spec.RunningCount, to maintain a subset of clusters Running so they can be claimed immediately
- ClusterPool.Spec.HibernationConfig.ResumeTimeout, to make sure clusters that get stuck resuming are thrown out and replaced

[1] https://issues.redhat.com/browse/HIVE-1679
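
(For reference, a sketch of how those two knobs appear on a ClusterPool manifest, assuming the hive.openshift.io/v1 API; names and values are hypothetical, and required fields such as baseDomain, imageSetRef, and platform are elided.)

  # ClusterPool sketch showing the two knobs named above; names and values
  # are hypothetical, and required fields are omitted for brevity.
  apiVersion: hive.openshift.io/v1
  kind: ClusterPool
  metadata:
    name: my-pool
    namespace: my-pool-namespace
  spec:
    size: 4              # total clusters the pool keeps on hand
    runningCount: 2      # keep 2 clusters Running so claims are fulfilled immediately
    hibernationConfig:
      resumeTimeout: 20m # discard and replace clusters stuck resuming past this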

Comment 3 daliu 2022-03-15 02:23:45 UTC
Thanks @efried 

@kcormier Please help enhance the UI logic here.

Comment 4 bot-tracker-sync 2022-03-21 22:21:44 UTC
G2Bsync comment 1073963158, from KevinFCormier, Mon, 21 Mar 2022 14:24:24 UTC:
This has been addressed through UI logic updates.

Comment 5 Hui Chen 2022-04-06 14:43:36 UTC
I think we can close this issue now, since the UI logic has already been changed.

Comment 9 errata-xmlrpc 2022-06-09 02:09:30 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: Red Hat Advanced Cluster Management 2.5 security updates, images, and bug fixes), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:4956

