1860233 – After installing RHACM 2.0, managed cluster created through ACM is in Pending Import state not in Ready State

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Red Hat Advanced Cluster Management for Kubernetes
Classification:	Red Hat
Component:	Cluster Lifecycle
Sub Component:
Version:	rhacm-2.0
Hardware:	All
OS:	All
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	rhacm-2.0.4
Assignee:	Hao Liu
QA Contact:	magchen@redthat.com
Docs Contact:	Christopher Dawson
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2020-07-24 04:50 UTC by Neha Chugh
Modified:	2023-09-15 00:34 UTC
CC List:	4 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2020-10-21 23:26:44 UTC
Target Upstream Version:
Embargoed:
Flags:	ming: rhacm-2.0.z+

Attachments
hive logs of google managed cluster showing installation as success. (127.08 KB, application/octet-stream) 2020-07-24 04:52 UTC, Neha Chugh	no flags
latest hive logs which is showing successful installation (115.24 KB, application/octet-stream) 2020-08-10 17:04 UTC, Neha Chugh	no flags
Showing pending import state though the installation is success (21.77 KB, image/png) 2020-08-10 17:08 UTC, Neha Chugh	no flags

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	open-cluster-management backlog issues 3837	0	None	None	None	2020-09-22 02:46:59 UTC

Description Neha Chugh 2020-07-24 04:50:57 UTC

Description of problem:
After installing RHACM 2.0, managed cluster created through ACM is in Pending Import state not in Ready State

Version-Release number of selected component (if applicable):
2.0

How reproducible:
In my test environment

Steps to Reproduce:

After installing RHACM 2.0 on my bare metal setup, cluster creation shows as successful in the hive logs, but the managed cluster status shows Pending Import rather than Ready.

The RHACM 2.0 installation itself succeeded, and all the pods in the open-cluster-management namespace are in Running state.


Actual results:

The managed cluster shows the Pending Import state.

Expected results:

It should show in Ready State.

Additional info:

Attaching my hive logs and test environment details for reference.

Comment 1 Neha Chugh 2020-07-24 04:52:12 UTC

Created attachment 1702296 [details]
hive logs of google managed cluster showing installation as success.

Comment 3 Mike Ng 2020-07-24 20:44:20 UTC

G2Bsync 663726067 comment 
 hanqiuzh Fri, 24 Jul 2020 20:43:12 UTC 
 G2Bsync  Used the kubeconfig in `/var/www/html/43/deploy02/auth/kubeconfig`. The cluster given is not a hub, but a managed cluster.
The registration-agent log shows a missing managedcluster permission:
```
E0724 20:34:23.229043       1 reflector.go:178] k8s.io/client-go.3/tools/cache/reflector.go:125: Failed to list *v1.ManagedCluster: managedclusters.cluster.open-cluster-management.io "nchugh-gc" is forbidden: User "system:open-cluster-management:nchugh-import:b2vtk" cannot list resource "managedclusters" in API group "cluster.open-cluster-management.io" at the cluster scope
E0724 20:35:00.600143       1 reflector.go:178] k8s.io/client-go.3/tools/cache/reflector.go:125: Failed to list *v1.ManagedCluster: managedclusters.cluster.open-cluster-management.io "nchugh-gc" is forbidden: User "system:open-cluster-management:nchugh-import:b2vtk" cannot list resource "managedclusters" in API group "cluster.open-cluster-management.io" at the cluster scope
```
Judging from the log, this may be an old build of RHACM 2.0. @qiujian16 can you please take a look? Thanks.
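
To make the denial in the log above easier to act on, the following is a minimal Python sketch that extracts the user, verb, resource, and API group from a "forbidden" line and builds a matching `kubectl auth can-i` command. The function names `parse_forbidden` and `can_i_command` are hypothetical helpers, not part of RHACM. Running the generated command assumes the caller is allowed to impersonate users with `--as`, and it does not carry over any group memberships the real client certificate may have, so its answer is only an approximation.

```python
import re
from typing import Optional

# Matches the Kubernetes "forbidden" message shape seen in the log above.
FORBIDDEN_RE = re.compile(
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)"'
)


def parse_forbidden(line: str) -> Optional[dict]:
    """Return user, verb, resource, and group from a forbidden log line, or None."""
    match = FORBIDDEN_RE.search(line)
    return match.groupdict() if match else None


def can_i_command(denial: dict) -> str:
    """Build a `kubectl auth can-i` command that re-checks the denied access."""
    resource = denial["resource"]
    if denial["group"]:
        # kubectl accepts the fully qualified form "resource.group".
        resource = f'{resource}.{denial["group"]}'
    return f'kubectl auth can-i {denial["verb"]} {resource} --as={denial["user"]}'
```

Feeding one of the log lines above to `parse_forbidden` and passing the result to `can_i_command` yields a command to run against the hub cluster after the permissions are expected to be in place.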

Comment 4 Mike Ng 2020-07-27 13:37:30 UTC

G2Bsync 664078477 comment 
 skeeey Mon, 27 Jul 2020 01:54:52 UTC 
 G2Bsync Tried to connect to the cluster, but the token has expired: `error: You must be logged in to the server (Unauthorized)`

Comment 5 Mike Ng 2020-07-27 13:37:31 UTC

G2Bsync 664391776 comment 
 juliana-hsu Mon, 27 Jul 2020 13:19:43 UTC 
 G2Bsync What build snapshot was being used?

Comment 7 Neha Chugh 2020-08-10 17:04:04 UTC

Created attachment 1710991 [details]
latest hive logs which is showing successful installation

Comment 8 Neha Chugh 2020-08-10 17:08:57 UTC

Created attachment 1710994 [details]
Showing pending import state though the installation is success

Comment 9 Neha Chugh 2020-08-12 16:45:26 UTC

Hello Team, 

After checking the pod status on the GKE cluster that was created via the ACM console, I found that the two pods below are in CrashLoopBackOff state, which could be the reason for the Pending Import status of the GKE cluster:

Pod:        klusterlet-work-agent-674dd7f9f8-4x659
Namespace:  open-cluster-management-agent
ReplicaSet: klusterlet-work-agent-674dd7f9f8
Node:       nchugh-jkdk8-w-b-6tc7v.c.openshift-gce-devel-ci.internal
Status:     CrashLoopBackOff (ContainersNotReady)

Pod:        klusterlet-work-agent-674dd7f9f8-hrd64
Namespace:  open-cluster-management-agent
ReplicaSet: klusterlet-work-agent-674dd7f9f8
Node:       nchugh-jkdk8-w-a-7nzr8.c.openshift-gce-devel-ci.internal
Status:     CrashLoopBackOff (ContainersNotReady)

The logs of these pods show the following error:

W0812 16:38:34.654493       1 builder.go:94] graceful termination failed, controllers failed with error: stat /spoke/hub-kubeconfig/kubeconfig: no such file or directory
I0812 16:38:34.654459       1 event.go:278] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"open-cluster-management-agent", Name:"work-agent-lock", UID:"9bf57c74-e4a0-4889-a41e-0c35d739aeb6", APIVersion:"v1", ResourceVersion:"846153", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' e3d89866-6a83-4c6f-9c02-1ee661c4c2d3 became leader


It seems the agent cannot find the hub kubeconfig file (/spoke/hub-kubeconfig/kubeconfig), which leads to the CrashLoopBackOff state. I am not sure how to rectify this.

I understand the issue is specific to my test environment, but it would be great if you could suggest a solution.

Regards,
Neha Chugh

Comment 11 Bradley Scalio 2020-09-03 13:33:56 UTC

Note KB article referencing this issue:  https://access.redhat.com/solutions/5355821

A record needs to be added to the public DNS to point the Hub Cluster's kube-apiserver address (in this example, api.ocp.example.com) to the LoadBalancer of the cluster.
After this, the klusterlet-registration pod should be able to resolve the host and continue the cluster import.
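
As a rough way to confirm the DNS record described above, the following Python sketch checks whether a hostname resolves and returns the addresses found. The helper name `resolve_host` is hypothetical, and the default port 6443 is an assumption about where the hub's kube-apiserver listens. The check only reflects the DNS view of the machine it runs on, so it is meaningful only when run from a network location equivalent to the managed cluster's.

```python
import socket
from typing import Callable, List


def resolve_host(hostname: str, port: int = 6443,
                 resolver: Callable = socket.getaddrinfo) -> List[str]:
    """Return the sorted unique addresses for hostname, or [] if it does not resolve.

    The resolver parameter exists only so the lookup can be replaced in tests.
    """
    try:
        infos = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Name resolution failed, e.g. the DNS record is missing.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

For example, `resolve_host("api.ocp.example.com")` returning an empty list would be consistent with the missing public DNS record described in the KB article, while a non-empty list shows only that the name resolves, not that the LoadBalancer is reachable.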

Comment 13 Mike Ng 2020-10-15 17:34:41 UTC

G2Bsync 708869131 comment 
 juliana-hsu Thu, 15 Oct 2020 03:15:58 UTC 
 G2Bsync nchugh is this still an issue or can it be closed?

Comment 14 Red Hat Bugzilla 2023-09-15 00:34:37 UTC

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days
