This seems like an RFE, so it is better handled in Jira by creating an issue: https://issues.redhat.com/secure/RapidBoard.jspa?rapidView=5496&view=detail
Abhinav Dahiya, hello! I don't think this qualifies as an RFE: with only the OS_* variables set, the OpenStack API is available for things like listing the available flavours and the floating IPs currently provisioned (note that openshift-install create install-config does display the proper lists of flavours, floating IPs, networks, etc., which OpenStack clients cannot retrieve without authenticating). Because of that, and because it fails obscurely without logging anything useful, I think this is a bug.
While I agree the installer could fail more explicitly when there is no clouds.yaml, this requirement is actually very well documented: https://github.com/openshift/installer/tree/master/docs/user/openstack#openstack-credentials The installer currently requires a valid clouds.yaml file because this is what the cloud provider expects for the OpenShift component to talk to OpenStack. We do have an epic for pre-flight validations (https://issues.redhat.com/browse/OSASINFRA-1183) that should cover this.
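Until such pre-flight validation exists, a check along the following lines could be run before the installer. This is a minimal sketch, not the installer's own logic: the function name `find_clouds_yaml` is hypothetical, and the search order (the file named by OS_CLIENT_CONFIG_FILE, then the current directory, `~/.config/openstack`, and `/etc/openstack`) is assumed from the usual OpenStack client convention.

```python
import os
from pathlib import Path


def find_clouds_yaml(env=None, cwd=None, home=None, site_dir="/etc/openstack"):
    """Return the first existing clouds.yaml path, or None if none is found.

    Hypothetical helper: the search order below is an assumption based on
    the usual OpenStack client convention, not taken from the installer.
    """
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else Path(cwd)
    home = Path.home() if home is None else Path(home)

    candidates = []
    explicit = env.get("OS_CLIENT_CONFIG_FILE")
    if explicit:
        # An explicitly configured file takes precedence over the defaults.
        candidates.append(Path(explicit))
    candidates += [
        cwd / "clouds.yaml",
        home / ".config" / "openstack" / "clouds.yaml",
        Path(site_dir) / "clouds.yaml",
    ]

    for path in candidates:
        if path.is_file():
            return path
    return None
```

Calling `find_clouds_yaml()` with no arguments checks the real environment; a `None` result means there is no file for the installer to pick up, which is the situation reported in this bug.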
Martin,

> because this is what the cloud provider expects for the OpenShift component to talk to OpenStack.

The behaviour when clouds.yaml is incorrect or missing differs from that, and it also depends on whether the OS_* variables are set.

Incorrect/missing clouds.yaml with NO OS_* defined:
[root@host-10-0-140-249 new-tenant]# ../openshift-install create cluster
FATAL failed to fetch Terraform Variables: failed to load asset "Install Config": invalid "install-config.yaml" file: [platform.openstack.externalNetwork: Internal error: could not retrieve valid networks, platform.openstack.computeFlavor: Internal error: could not retrieve valid flavors, platform.openstack.trunkSupport: Internal error: could not retrieve networking extension aliases, platform.openstack.octaviaSupport: Internal error: could not retrieve service catalog]

Incorrect/missing clouds.yaml with OS_* defined and correct:
[root@host-10-0-140-249 new-tenant]# ../openshift-install create cluster
FATAL failed to fetch Terraform Variables: failed to load asset "Install Config": invalid "install-config.yaml" file: platform.openstack.octaviaSupport: Internal error: could not retrieve service catalog

Correct clouds.yaml:
[root@host-10-0-140-249 new-tenant]# ../openshift-install create cluster
<no error here, proceeding to deploy cluster>

There are 2 points here.
1) There is no clouds.yaml validation. In the first two cases I'd expect openshift-install to say, for example, "Incorrect password or username" or "cannot connect to API endpoint".
2) The issue is masked by the differences between option A (completely missing auth) and option B (valid auth supplied via OS_* env variables).
Because the behaviour differs, and in the second case openshift-install create install-config claims that it cannot load some "services" (before that it asks about flavours, floating IPs and other things, so evidently the OpenStack API *works*), I assumed that this was not an issue with openshift-install but with our new OpenStack tenant, which led to several people losing several workdays trying to debug the situation. To combat 2), at least the logging should be improved. I also still think that this should be "high", as we (the MW QE teams) are expected to deploy our own clusters every OCP release, and we stumble on these issues and lose a whole lot of time trying to fix them.
We currently don't support environment variables for setting the configuration. However, we want to be clearer about the installer behaviour in that respect. The fix here will be about making sure there is consistent behaviour and documentation. A change in this behaviour qualifies as an RFE; please file it in Jira.
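To illustrate the kind of message that would have separated the three cases above, here is a rough sketch. It is not installer code: `describe_credentials` is a hypothetical name, and the wording of the messages is only a suggestion. It reports the names of any OS_* variables that are set, never their values.

```python
def describe_credentials(clouds_yaml_found, env):
    """Explain which credential source is in play (hypothetical helper).

    clouds_yaml_found: whether a clouds.yaml file was located.
    env: mapping of environment variables, e.g. os.environ.
    """
    if clouds_yaml_found:
        return "clouds.yaml found; credentials will be read from it"

    # Report only variable names so that no secret ends up in a log.
    os_vars = sorted(name for name in env if name.startswith("OS_"))
    if os_vars:
        return (
            "no clouds.yaml found; OS_* variables are set ("
            + ", ".join(os_vars)
            + ") but they do not replace clouds.yaml"
        )
    return "no clouds.yaml found and no OS_* variables set; no credentials available"
```

For example, `describe_credentials(False, os.environ)` (after `import os`) would name the OS_* variables present in the shell, which makes option B distinguishable from option A at a glance.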
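For users who only have the OS_* variables from an RC file, one possible workaround is to write an equivalent clouds.yaml entry. The following is a sketch under assumptions, not a supported tool: `render_clouds_yaml` and the default cloud name `openstack` are hypothetical, and the mapping covers only the common password-authentication variables.

```python
import json

# Common OS_* variables and the clouds.yaml "auth" keys they correspond to.
AUTH_KEYS = {
    "OS_AUTH_URL": "auth_url",
    "OS_USERNAME": "username",
    "OS_PASSWORD": "password",
    "OS_PROJECT_NAME": "project_name",
    "OS_USER_DOMAIN_NAME": "user_domain_name",
    "OS_PROJECT_DOMAIN_NAME": "project_domain_name",
}


def render_clouds_yaml(env, cloud_name="openstack"):
    """Build clouds.yaml text from OS_* variables (hypothetical helper)."""
    auth = {key: env[var] for var, key in AUTH_KEYS.items() if env.get(var)}
    if "auth_url" not in auth:
        raise ValueError("OS_AUTH_URL is not set; cannot build a clouds.yaml entry")

    lines = ["clouds:", f"  {cloud_name}:", "    auth:"]
    for key, value in auth.items():
        # json.dumps yields a double-quoted scalar, which is also valid YAML.
        lines.append(f"      {key}: {json.dumps(value)}")
    region = env.get("OS_REGION_NAME")
    if region:
        lines.append(f"    region_name: {json.dumps(region)}")
    return "\n".join(lines) + "\n"
```

The result of `render_clouds_yaml(os.environ)` can be saved as clouds.yaml in the working directory; the cloud name chosen here is the value that platform.openstack.cloud in install-config.yaml would then need to match.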
Considering the priority assigned to this bug and our team capacity, we are deferring this bug to an upcoming sprint. Please let us know if there are reasons for us to reprioritize.
Deferring to an upcoming sprint.
Verified with 4.7.0-0.nightly-2021-01-18-214951.
# When no clouds.yaml exists, the installer raises a message for the user.
$ ./openshift-install-4.7 create cluster --dir /tmp/tmp.xxIXordN3R
FATAL failed to fetch Metadata: failed to load asset "Install Config": failed to create a network client: unable to load clouds.yaml: no clouds.yaml file found
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633
For the record, linked bugs:
Bug 1876815 - Installer uses the environment variable OS_CLOUD for manifest generation despite explicit prompt
Bug 2015837 - OS_CLOUD overwrites install-config's platform.openstack.cloud