Bug 1937078 - Trying to create a new cluster on vSphere and no feedback, stuck in "creating"
Summary: Trying to create a new cluster on vSphere and no feedback, stuck in "creating"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Advanced Cluster Management for Kubernetes
Classification: Red Hat
Component: Cluster Lifecycle
Version: rhacm-2.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: rhacm-2.4.5
Assignee: daliu
QA Contact: Napoco Agbetra
Docs Contact: Christopher Dawson
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-03-09 18:38 UTC by Stu Lipshires
Modified: 2022-06-27 17:04 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-06-27 17:04:01 UTC
Target Upstream Version:
Embargoed:
juhsu: rhacm-2.4.z+
juhsu: rhacm-2.5+


Attachments
ScreenShot of UI once create button is pressed (254.09 KB, image/png)
2021-03-10 01:21 UTC, Benjamin Schmaus


Links
Github open-cluster-management backlog issues 10327 (last updated 2021-03-10 14:49:24 UTC)
Red Hat Product Errata RHSA-2022:5201 (last updated 2022-06-27 17:04:31 UTC)

Description Stu Lipshires 2021-03-09 18:38:02 UTC
I am trying to create a new cluster in a vSphere environment. I created the vSphere connection, then tried to create a cluster using it. The RHACM screen just shows "Creating" under status. I suspect there is something wrong with my connection string, but I don't see a "Test Connection" link to test the connection. When I click on the "Creating" status, a little pop-up offers to "view logs", but nothing happens when I select it. If I select the cluster to look at its detailed status, there is a "Cluster creation in progress" message at the top of the screen along with another "View Logs" link. That link does nothing either.

Given that the "view log" links don't work, where can I find out why the cluster creation is hanging?

Comment 1 Benjamin Schmaus 2021-03-10 01:21:57 UTC
Created attachment 1762130 [details]
ScreenShot of UI once create button is pressed

Comment 2 Benjamin Schmaus 2021-03-10 01:45:58 UTC
I also experience a hang when trying to create a baremetal IPI cluster. Once I hit create I see the green "Creating cluster..." but it never gets to the point where I have the option to view logs. I can also see that the namespace for my cluster name is created, but no Hive pod ever instantiates, so there are no logs to view.

Comment 3 Benjamin Schmaus 2021-03-10 19:50:10 UTC
If I deploy the cluster.yaml from the cli it deploys without issue - the Hive pod spins up.  I hit a different issue outlined here : https://access.redhat.com/solutions/5709711

Comment 4 Stu Lipshires 2021-03-12 15:36:03 UTC
I had used the IP address of the vCenter server in the connection instead of the name in the VMware cert, and this was what prevented anything from happening. There is still an issue with the logs not being displayed when this error occurs so early in the create cluster process, but at least I got past the initial problem. After I corrected this error, I was able to view logs and identify any other configuration issues.

Comment 5 Simon Krenger 2021-03-16 13:54:45 UTC
I hit the same issue today in ACM 2.2 when the vCenter location was specified with "https://" (e.g. "https://my-vcenter.example.com/") instead of just the name ("my-vcenter.example.com").
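
The two failure cases reported so far (an IP address in Comment 4, a full URL in this comment) could be caught before the connection is saved. A minimal Python sketch, assuming the field should hold only the host name that appears in the vCenter certificate; `check_vcenter_server` is a hypothetical helper and not part of RHACM or Hive:

```python
import ipaddress
from typing import List
from urllib.parse import urlsplit


def check_vcenter_server(value: str) -> List[str]:
    """Return a list of problems found in a vCenter server field value."""
    problems = []
    host = value.strip()

    if "://" in host:
        # e.g. "https://my-vcenter.example.com/" instead of the bare name
        problems.append("remove the URL scheme; enter only the host name")
        host = urlsplit(host).hostname or ""
    elif "/" in host:
        problems.append("remove the path; enter only the host name")
        host = host.split("/", 1)[0]

    try:
        ipaddress.ip_address(host)
        # Assumption: the certificate names the server by host name, as in Comment 4.
        problems.append("IP address given; use the host name from the vCenter certificate")
    except ValueError:
        pass  # not an IP literal, which is what we want here

    if not host:
        problems.append("host name is empty")
    return problems


print(check_vcenter_server("https://my-vcenter.example.com/"))
print(check_vcenter_server("my-vcenter.example.com"))
```

Whether an IP address is acceptable depends on the names listed in the certificate, so that check is only a warning heuristic.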

Comment 6 Stu Lipshires 2021-03-16 14:22:07 UTC
It would be really good to have a "validate connection" button that would check the connection and validate info at the time the connection is created.

Comment 8 cahl 2021-03-17 17:28:40 UTC
Sorry for the delay in responding. When you create the VMware cluster and the view logs option appears, it might take a bit of time for the backend hive to start the provision pod (in the namespace you chose on create cluster) and the OpenShift installer. So one place you could look to see if there is a problem is the hive controller pod log in the hive namespace. You can also look for a ClusterDeployment in the namespace you created the cluster in.
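
Once a ClusterDeployment exists, its status conditions are usually the quickest thing to read. A rough Python sketch, assuming the resource was exported as JSON (for example with `oc get clusterdeployment <name> -n <namespace> -o json`) and follows the usual Kubernetes `status.conditions` layout of `type`, `status`, and `message`. The helper name, the sample data, and the condition type `ExampleAuthFailure` are placeholders for illustration, not actual Hive output:

```python
import json
from typing import Dict, List


def active_conditions(resource: Dict) -> List[str]:
    """List conditions whose status is "True" and that carry a message."""
    lines = []
    for cond in resource.get("status", {}).get("conditions", []):
        if cond.get("status") == "True" and cond.get("message"):
            lines.append(f'{cond.get("type")}: {cond.get("message")}')
    return lines


# Hypothetical sample; the condition type name is a placeholder.
sample = json.loads("""
{
  "kind": "ClusterDeployment",
  "status": {
    "conditions": [
      {"type": "ExampleAuthFailure", "status": "True",
       "message": "Platform credentials failed authentication check"},
      {"type": "ExampleOther", "status": "False", "message": ""}
    ]
  }
}
""")

for line in active_conditions(sample):
    print(line)
```

Not every condition that is "True" indicates a failure, so the output still needs to be read with the condition type in mind.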

For the VMware provider connection, I assume you are using the same IP or fully qualified host name, userid and password that you use to log in to vCenter?

While we are debugging VMware, I have to ask whether you followed the OpenShift installer prereqs for VMware?

The idea for a Test Connection is a good one. I will see if we already have an item for that for possible future consideration.

Comment 9 cahl 2021-03-22 17:37:29 UTC
I was also wondering if you are doing any editing on the YAML editor or are you just using the UI to fill in the fields?

Comment 10 Stu Lipshires 2021-03-22 17:43:07 UTC
No editing of YAML (I try to do as little as possible of that :-) ). Just the UI. It turned out the issue was that the IP address I put on the form for the vCenter name didn't match the name on the cert. Once I used the hostname in the cert, it worked OK. The real issue is that I got no feedback at all about the problem.

Comment 11 Mike Ng 2021-03-25 20:04:40 UTC
G2Bsync 807389298 comment 
 chrisahl Thu, 25 Mar 2021 20:04:10 UTC 
 G2Bsync
I have opened an issue for hive - https://issues.redhat.com/browse/HIVE-1465 so they can surface the error in a way that is easy for RHACM to read and correlate.

Comment 15 dhuynh 2022-02-21 14:53:58 UTC
Reopened on the GitHub side as we still see a validation issue; not a stop-ship.

Comment 16 bot-tracker-sync 2022-02-21 17:02:22 UTC
G2Bsync 1041081599 comment 
 dtthuynh Wed, 16 Feb 2022 04:07:04 UTC 
 G2Bsync @ldpliu When we set the vCenter server incorrectly, e.g. 
```
vCenter server
    https://acmcicd-vcsa-01.cicd.red-chesterfield.com
```
As the customer did once in the original issue, the cluster creation fails as expected.
<img width="298" alt="image" src="https://user-images.githubusercontent.com/53154476/154194339-70b58f91-b54d-4fc9-9333-14dafd65fe2d.png">

However, if I use an incorrect certificate in the vSphere credential, I get the no-feedback issue that they reported (cluster stuck in "Creating," no pods or deployments at all, so no logs).
```
vCenter root CA certificate
    -----BEGIN CERTIFICATE-----
    MIIESz...
```

This was tested against `ACM 2.4.2 RC2`. Do we know what kind of errors will actually surface? I thought the fix from hive was to validate against this cert and raise an error.

Comment 17 bot-tracker-sync 2022-02-21 17:02:24 UTC
G2Bsync 1046991932 comment 
 dtthuynh Mon, 21 Feb 2022 15:24:14 UTC 
 G2Bsync @KevinFCormier @crizzo71 should we move this out of 2.4.2 then?

Comment 19 bot-tracker-sync 2022-06-13 21:45:23 UTC
G2Bsync 1154220073 comment 
 Napoco Mon, 13 Jun 2022 18:02:58 UTC 
 G2Bsync Verified that the cluster creation provides feedback and does not get stuck in the creating phase
Created VMware cluster with bad certificate and got an error as expected:
Platform credentials failed authentication check: invalid certificate '/tmp/rootcacerts1742202225', cannot be used as a trusted CA certificate

Created VMware cluster with wrong server url and got an error as expected:
Platform credentials failed authentication check: Post "https://fake-vcsa-01.cicd.red-chesterfield.com/sdk": dial tcp: lookup fake-vcsa-01.cicd.red-chesterfield.com on 172.30.0.10:53: no such host
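
The two failures verified above (a bad CA certificate and an unresolvable server name) are the kind of thing a "validate connection" step, as requested in Comment 6, could flag early. A minimal Python sketch under stated assumptions: `preflight` and `pem_framing_ok` are hypothetical helpers, the PEM check only looks at framing and base64 encoding (it does not verify that the content is a real or trusted certificate), and this is not how Hive performs its authentication check:

```python
import socket
import ssl
from typing import Callable, List


def pem_framing_ok(pem: str) -> bool:
    """True if the text has BEGIN/END CERTIFICATE markers around decodable base64."""
    try:
        ssl.PEM_cert_to_DER_cert(pem.strip())
        return True
    except ValueError:  # also covers base64 decoding errors
        return False


def preflight(server: str, ca_pem: str,
              resolve: Callable = socket.getaddrinfo) -> List[str]:
    """Return a list of problems; an empty list means no problem was detected."""
    problems = []
    try:
        resolve(server, 443)  # DNS lookup only, no connection is made
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        problems.append(f"cannot resolve {server}: {exc}")
    if not pem_framing_ok(ca_pem):
        problems.append("CA certificate is not well-formed PEM")
    return problems


# A truncated certificate like the one in Comment 16 fails the framing check.
print(pem_framing_ok("-----BEGIN CERTIFICATE-----\nMIIESz..."))
```

The resolver is passed in as a parameter so the check can be exercised without network access.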

Comment 25 errata-xmlrpc 2022-06-27 17:04:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Advanced Cluster Management 2.4.5 security updates and bug fixes), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5201

