I am trying to create a new cluster in a vSphere environment. I created the vSphere Connection, then tried to create a cluster using that Connection. The RHACM screen just shows "Creating" under status. I suspect there is something wrong with my connection string, but I don't see a "Test Connection" link to test the connection. When I click on the "Creating" status, a little pop-up offers "View logs", but nothing happens when I select it. If I select the cluster to look at its detailed status, there is a "Cluster creation in progress" message at the top of the screen along with another "View logs" link. That link does nothing either. Given that the "View logs" links don't work, where can I find out why the cluster creation is hanging?
Created attachment 1762130: screenshot of the UI once the Create button is pressed
I also experience a hang when trying to create a baremetal IPI cluster. Once I hit create I see the green "Creating cluster..." message, but it never gets to the point where I have the option to view logs. I can also see that the namespace for my cluster name is created, but no Hive pod ever instantiates, so there are no logs to view.
If I deploy the cluster.yaml from the cli it deploys without issue - the Hive pod spins up. I hit a different issue outlined here : https://access.redhat.com/solutions/5709711
I had used the IP address of the vCenter server in the connection instead of the name in the VMware certificate, and this was what prevented anything from happening. There is still an issue with the logs not being displayed when this error occurs so early in the cluster creation process, but at least I got past the initial problem. After I corrected this error, I was able to view logs and determine any other configuration issues.
I hit the same issue today in ACM 2.2 when the vCenter location was specified with "https://" (e.g. "https://my-vcenter.example.com/") instead of just the name ("my-vcenter.example.com").
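For anyone scripting the credential setup, here is a minimal sketch of stripping the scheme and trailing slash from a vCenter value before it goes into the connection. The function name `normalize_vcenter_server` is hypothetical and not part of ACM or Hive. It assumes a bare host name is what the form expects, as described above, and it also drops any port or path.

```python
from urllib.parse import urlsplit


def normalize_vcenter_server(value: str) -> str:
    """Return only the host part of a vCenter server value.

    Hypothetical helper: accepts "https://host/", "host/", or "host"
    and returns "host". Any port or path is discarded.
    """
    value = value.strip()
    # urlsplit only recognises the host when a "//" authority marker is present,
    # so add one for bare host names.
    if "://" not in value:
        value = "//" + value
    return urlsplit(value).hostname or ""


def looks_like_bare_hostname(value: str) -> bool:
    """True when the value is already in the bare host form."""
    return value.strip() == normalize_vcenter_server(value)
```

With this, `normalize_vcenter_server("https://my-vcenter.example.com/")` yields `my-vcenter.example.com`, and `looks_like_bare_hostname` can be used to warn before the credential is saved.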
It would be really good to have a "validate connection" button that would check the connection and validate info at the time the connection is created.
Sorry for the delay in responding. When you create the VMware cluster and the view logs option appears, it might take a bit of time for the backend Hive to start the provision pod (in the namespace you chose on create cluster) and the OpenShift installer. So one place you could look to see if there is a problem is the hive controller pod log in the hive namespace. You can also look for a ClusterDeployment in the namespace you created the cluster in. For the VMware provider connection, I assume you are using the same IP or fully qualified host name, user ID, and password that you use to log in to vCenter? While we are debugging VMware, I have to ask whether you followed the OpenShift installer prerequisites for VMware. The idea for a Test Connection is a good one. I will see if we already have an item for that for possible future consideration.
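The checks suggested above can be sketched as a short list of `oc` invocations. This is only an illustration: the helper name `hive_debug_commands` is hypothetical, and it assumes the hive controller runs as a deployment named `hive-controllers` in the `hive` namespace, which may differ in your installation.

```python
import shlex
from typing import List


def hive_debug_commands(cluster_namespace: str) -> List[List[str]]:
    """Build the oc commands for a first look at a stuck cluster creation.

    Assumption: the hive controller is the deployment "hive-controllers"
    in the "hive" namespace. Adjust the names if your setup differs.
    """
    return [
        # Hive controller log, where early validation failures may show up.
        ["oc", "logs", "-n", "hive", "deployment/hive-controllers", "--tail=200"],
        # The ClusterDeployment created for the cluster, including its status.
        ["oc", "get", "clusterdeployment", "-n", cluster_namespace, "-o", "yaml"],
        # Whether a provision pod was ever started in the cluster namespace.
        ["oc", "get", "pods", "-n", cluster_namespace],
    ]


def as_shell_lines(commands: List[List[str]]) -> List[str]:
    """Render the commands as copy-and-paste shell lines."""
    return [shlex.join(cmd) for cmd in commands]
```

Calling `as_shell_lines(hive_debug_commands("my-cluster"))` gives three lines to paste into a terminal that is logged in to the hub cluster. `my-cluster` is a placeholder for the namespace chosen at cluster creation.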
I was also wondering if you are doing any editing on the YAML editor or are you just using the UI to fill in the fields?
No editing of YAML (I try to do as little as possible of that :-) ). Just the UI. It turned out the issue was that the IP address I put on the form for the vCenter name didn't match the name on the cert. Once I used the hostname in the cert, it worked OK. The real issue is that I got no feedback at all as to what the problem was.
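To illustrate why an IP address fails against a certificate issued for a host name, here is a simplified sketch of the name check a TLS client performs. It is not the code Hive or the installer uses. The function `server_matches_cert` and its parameters are hypothetical, and the names in the certificate are passed in as plain lists instead of being read from a real certificate.

```python
import ipaddress
from typing import Sequence


def _is_ip(value: str) -> bool:
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def server_matches_cert(server: str,
                        dns_names: Sequence[str],
                        ip_sans: Sequence[str] = ()) -> bool:
    """Simplified check of a server value against certificate names.

    An IP address only matches IP entries in the certificate, never DNS names.
    A wildcard such as "*.example.com" matches exactly one leftmost label.
    """
    if _is_ip(server):
        wanted = ipaddress.ip_address(server)
        return any(wanted == ipaddress.ip_address(ip) for ip in ip_sans)

    host = server.lower().rstrip(".")
    for name in dns_names:
        name = name.lower().rstrip(".")
        if name == host:
            return True
        if name.startswith("*."):
            suffix = name[1:]  # e.g. ".example.com"
            label = host[: -len(suffix)] if host.endswith(suffix) else ""
            if label and "." not in label:
                return True
    return False
```

For a certificate that only lists `my-vcenter.example.com`, the host name matches but an address such as `192.0.2.10` does not, which is the situation described in this comment.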
G2Bsync 807389298 comment chrisahl Thu, 25 Mar 2021 20:04:10 UTC G2Bsync I have opened an issue for hive - https://issues.redhat.com/browse/HIVE-1465 so they can surface the error in a way that is easy for RHACM to read and correlate.
Reopened on the GitHub side because we still see a validation issue. Not a stop-ship.
G2Bsync 1041081599 comment dtthuynh Wed, 16 Feb 2022 04:07:04 UTC G2Bsync @ldpliu When we set the vCenter server incorrectly, e.g. ``` vCenter server https://acmcicd-vcsa-01.cicd.red-chesterfield.com ``` as the customer did in the original issue, the cluster creation fails as expected. <img width="298" alt="image" src="https://user-images.githubusercontent.com/53154476/154194339-70b58f91-b54d-4fc9-9333-14dafd65fe2d.png"> However, if I use an incorrect certificate in the vSphere credential, I get the no-feedback issue that they reported (cluster stuck in "Creating", with no pods or deployments at all, so no logs): ``` vCenter root CA certificate -----BEGIN CERTIFICATE----- MIIESz... ``` This was tested against `ACM 2.4.2 RC2`. Do we know what kind of errors will actually surface? I thought the fix from hive was to validate against this cert and raise the error.
G2Bsync 1046991932 comment dtthuynh Mon, 21 Feb 2022 15:24:14 UTC G2Bsync @KevinFCormier @crizzo71 should we move this out of 2.4.2 then?
G2Bsync 1154220073 comment Napoco Mon, 13 Jun 2022 18:02:58 UTC G2Bsync Verified that the cluster creation provides feedback and does not get stuck in the creating phase. Created a VMware cluster with a bad certificate and got an error as expected: Platform credentials failed authentication check: invalid certificate '/tmp/rootcacerts1742202225', cannot be used as a trusted CA certificate. Created a VMware cluster with the wrong server URL and got an error as expected: Platform credentials failed authentication check: Post "https://fake-vcsa-01.cicd.red-chesterfield.com/sdk": dial tcp: lookup fake-vcsa-01.cicd.red-chesterfield.com on 172.30.0.10:53: no such host
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat Advanced Cluster Management 2.4.5 security updates and bug fixes), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5201