Description of problem:

All Azure jobs in CI and periodics appear to be failing since 5/23.

Run template e2e-azure - e2e-azure container setup (1s)
Installing from release registry.svc.ci.openshift.org/ci-op-f0y2nnjm/release@sha256:cb584c61ad66cf4643c4cb37bf7efac69a2ccc4d93db971c74a2d8e41f71112c
Azure region: centralus
level=info msg="Credentials loaded from file \"/etc/openshift-installer/osServicePrincipal.json\""
level=fatal msg="failed to fetch Master Machines: failed to load asset \"Install Config\": platform.azure.region: Invalid value: \"centralus\": failed to retrieve available regions"

https://search.apps.build01.ci.devcluster.openshift.com/chart?search=failed+to+retrieve+available+regions&maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job

https://testgrid.k8s.io/redhat-openshift-ocp-release-4.5-informing#release-openshift-ocp-installer-e2e-azure-4.5
*** Bug 1840852 has been marked as a duplicate of this bug. ***
The issue ended up being expired credentials and it has been addressed. I'm changing the target release and priority of this bug and will use it to track progress on updating the error message to actually print out the problem.
Since this is low priority, I didn't get to it this sprint. I'll update when there is time to work on this.
Verified with:

./openshift-install 4.6.0-0.nightly-2020-08-01-172303
built from commit 7a5af8cddbd04a7c6af6006696141d8afe2fb027
release image registry.svc.ci.openshift.org/ocp/release@sha256:6d4b31af9959b02b8589bb4b804812c436f38a9726827fa5e5a0ea66d6d79cf4

Reproduction steps:
1) Generate a working install-config.yaml from a current Service Principal
2) Configure your osServicePrincipal.json to use an EXPIRED Service Principal
3) Try to install a cluster using the install-config.yaml generated in step 1

~~~
./openshift-install create cluster --dir ./install_config_folder
INFO Credentials loaded from file "/home/openshift-qe/.azure/osServicePrincipal.json"
FATAL failed to fetch Metadata: failed to load asset "Install Config": platform.azure.region: Internal error: failed to retrieve available regions: failed to list locations: azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/$SUBID/locations?api-version=2019-06-01: StatusCode=401 -- Original Error: adal: Refresh request failed. Status Code = '401'. Response body: {"error":"invalid_client","error_description":"AADSTS7000222: The provided client secret keys are expired. Visit the Azure Portal to create new keys for your app, or consider using certificate credentials for added security: https://docs.microsoft.com/azure/active-directory/develop/active-directory-certificate-credentials\r\nTrace ID: 3416e35d-5586-4646-ae28-d8a5c8ee3e00\r\nCorrelation ID: 4b8affac-99de-4536-8505-0f269b69d15f\r\nTimestamp: 2020-08-04 20:50:05Z","error_codes":[7000222],"timestamp":"2020-08-04 20:50:05Z","trace_id":"3416e35d-5586-4646-ae28-d8a5c8ee3e00","correlation_id":"4b8affac-99de-4536-8505-0f269b69d15f","error_uri":"https://login.microsoftonline.com/error?code=7000222"}
~~~
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196