Bug 2134395

Summary: Node scaling is stuck in installing state on ACM UI for a cluster deployed using CIM
Product: Red Hat Advanced Cluster Management for Kubernetes Reporter: Mihir Lele <mlele>
Component: Cluster Lifecycle Assignee: Le Yang <leyan>
Status: CLOSED DUPLICATE QA Contact: Hui Chen <huichen>
Severity: high Docs Contact:
Priority: high    
Version: rhacm-2.6.z CC: dhuynh, tjelinek
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-10-13 11:32:31 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
screenshot1 none

Description Mihir Lele 2022-10-13 10:45:35 UTC
Created attachment 1917812
screenshot1

*Description of the problem:*

A customer is trying to add new nodes to an OCP cluster that was successfully deployed using CIM.
- The node was bootstrapped into the OCP cluster and is visible in "oc get nodes".
- The customer also tried deploying a payload on the newly added node, and it was successful.
- However, the ACM UI cluster page for the OpenShift cluster is stuck in the "installing" state with no progress being made.
- The OpenShift web console for the managed cluster also shows an error when trying to view the details of the newly added node.
- Attaching screenshots for clarity.
- The cluster events page on the ACM UI shows no status updates after it says "rebooting".
- I checked the assisted-service logs on the ACM hub, and they do not show any progress updates either.
- The customer discovered the BM host by booting it manually, and discovery was successful.
- We saw no errors on the UI during discovery or during installation.
- Maybe ACM was waiting for something from the managed cluster, and the node or managed cluster was unable to send it or respond?
 

*How reproducible:* We have not tried to reproduce it yet.

*Steps to reproduce:*
NA

*Actual results:*
The node installation is stuck in the "installing" state.

*Expected results:*
The node installation should complete.

Comment 2 Tomas Jelinek 2022-10-13 11:32:31 UTC
There are two bugs here:
1: The ACM console is showing "progressing" even though the hosts finished installation successfully. It is tracked in https://bugzilla.redhat.com/show_bug.cgi?id=2118545 and is targeted to 2.6.2.
2: The crash on the OCP console's node screen. It is tracked in https://bugzilla.redhat.com/show_bug.cgi?id=2090993 (for 4.11), and the 4.10 backport is tracked in https://issues.redhat.com/browse/OCPBUGS-1696 (the 4.10 fix is not yet merged, but is fully acked).

Hence, closing this ticket since both issues are already tracked elsewhere.

*** This bug has been marked as a duplicate of bug 2090993 ***