On May 22, OpenStack CI jobs started to fail installations:

https://testgrid.k8s.io/redhat-openshift-ocp-release-4.8-informing#release-openshift-ocp-installer-e2e-openstack-serial-4.8
https://testgrid.k8s.io/redhat-openshift-ocp-release-4.8-informing#release-openshift-ocp-installer-e2e-openstack-4.8
https://search.ci.openshift.org/?search=failed+to+lookup+masters%3A+resource+not+found+&maxAge=48h&context=0&type=build-log&name=&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job

Example jobs:

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-openstack-4.8/1396768057287774208
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-openstack-4.8/1396647260615348224

INFO[2021-05-24T02:46:06Z] level=error msg=Attempted to gather ClusterOperator status after installation failure: listing ClusterOperator objects: Get "https://api.nhmqbc00-d8ea2.shiftstack.devcluster.openshift.com:6443/apis/config.openshift.io/v1/clusteroperators": dial tcp 38.102.83.104:6443: connect: no route to host
INFO[2021-05-24T02:46:06Z] level=debug msg=Fetching Bootstrap SSH Key Pair...
INFO[2021-05-24T02:46:06Z] level=debug msg=Loading Bootstrap SSH Key Pair...
INFO[2021-05-24T02:46:06Z] level=debug msg=Using Bootstrap SSH Key Pair loaded from state file
INFO[2021-05-24T02:46:06Z] level=debug msg=Reusing previously-fetched Bootstrap SSH Key Pair
INFO[2021-05-24T02:46:06Z] level=debug msg=Fetching Install Config...
INFO[2021-05-24T02:46:06Z] level=debug msg= Loading Platform...
INFO[2021-05-24T02:46:06Z] level=debug msg= Loading Pull Secret...
INFO[2021-05-24T02:46:06Z] level=debug msg= Loading Platform...
INFO[2021-05-24T02:46:06Z] level=debug msg=Using Install Config loaded from state file
INFO[2021-05-24T02:46:06Z] level=debug msg=Reusing previously-fetched Install Config
INFO[2021-05-24T02:46:06Z] level=error msg=failed to lookup masters: resource not found
INFO[2021-05-24T02:46:06Z] level=info msg=Pulling debug logs from the bootstrap machine
INFO[2021-05-24T02:46:06Z] level=debug msg=Added /tmp/bootstrap-ssh437206137 to installer's internal agent
INFO[2021-05-24T02:46:06Z] level=debug msg=Added /tmp/.ssh/id_rsa to installer's internal agent
INFO[2021-05-24T02:46:06Z] level=error msg=Attempted to gather debug logs after installation failure: failed to create SSH client: dial tcp 38.102.83.11:22: connect: connection timed out
INFO[2021-05-24T02:46:06Z] level=error msg=Bootstrap failed to complete: Get "https://api.nhmqbc00-d8ea2.shiftstack.devcluster.openshift.com:6443/version?timeout=32s": dial tcp 38.102.83.104:6443: connect: no route to host
INFO[2021-05-24T02:46:06Z] level=error msg=Failed waiting for Kubernetes API. This error usually happens when there is a problem on the bootstrap host that prevents creating a temporary control plane.
INFO[2021-05-24T02:46:06Z] level=error msg=Attempted to analyze the debug logs after installation failure: could not open the gather bundle: open : no such file or directory
INFO[2021-05-24T02:46:06Z] level=fatal msg=Bootstrap failed to complete
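For triaging similar failures, the error and fatal entries can be pulled out of a saved build log with a few lines of Python. This is only a rough sketch: it assumes the log is stored with one entry per line in the `INFO[<timestamp>] level=<level> msg=<message>` shape quoted above, and the names `messages_at_level` and `SAMPLE` are made up for illustration, not part of the installer or CI tooling.

```python
import re

# Assumed entry shape, taken from the log quoted in this report:
#   INFO[<timestamp>] level=<level> msg=<message>
LINE_RE = re.compile(
    r"^INFO\[(?P<ts>[^\]]+)\] level=(?P<level>\w+) msg=(?P<msg>.*)$"
)


def messages_at_level(log_text, levels=("error", "fatal")):
    """Return (timestamp, level, message) tuples for entries at the given levels."""
    found = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group("level") in levels:
            found.append(
                (match.group("ts"), match.group("level"), match.group("msg").strip())
            )
    return found


# Three entries copied from the log above, used as a small sample.
SAMPLE = """\
INFO[2021-05-24T02:46:06Z] level=debug msg=Fetching Install Config...
INFO[2021-05-24T02:46:06Z] level=error msg=failed to lookup masters: resource not found
INFO[2021-05-24T02:46:06Z] level=fatal msg=Bootstrap failed to complete
"""

if __name__ == "__main__":
    for ts, level, msg in messages_at_level(SAMPLE):
        print(f"{ts} [{level}] {msg}")
```

Lines that do not match the assumed shape are skipped silently, so the sketch will miss multi-line messages or entries from other loggers.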
VMs can no longer reach the metadata service. When booting a cirros VM, the console log shows:

Starting network: udhcpc: started, v1.29.3
udhcpc: sending discover
udhcpc: sending select for 172.16.0.125
udhcpc: lease of 172.16.0.125 obtained, lease time 86400
route: SIOCADDRT: File exists
WARN: failed: route add -net "0.0.0.0/0" gw "172.16.0.1"
OK
checking http://169.254.169.254/2009-04-04/instance-id
failed 1/20: up 1.59. request failed
failed 2/20: up 14.74. request failed
failed 3/20: up 27.83. request failed
[snip]
failed 19/20: up 237.21. request failed
failed 20/20: up 250.30. request failed
failed to read iid from metadata. tried 20
failed to get instance-id of datasource

I'm unable to SSH to the VM, but I can connect to it via the noVNC client from the web interface. A request to the Nova metadata service returns a 500 error.

We've opened a support ticket with our cloud provider.
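To reproduce the check from inside a guest without waiting on cirros boot output, something like the following Python sketch could be used. It polls the same URL that appears in the console log and records the HTTP status of each attempt. The function names, the 1-second delay, and the 5-second timeout are assumptions chosen for illustration; only the URL and the 20-attempt count come from the log above.

```python
import time
import urllib.error
import urllib.request

# URL taken from the cirros console log in this report.
METADATA_URL = "http://169.254.169.254/2009-04-04/instance-id"


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 500.
        return err.code
    except (urllib.error.URLError, OSError):
        # No route, connection refused, timeout, and similar failures.
        return None


def probe(url=METADATA_URL, attempts=20, delay=1.0,
          fetch=fetch_status, sleep=time.sleep):
    """Poll url until it answers 200 or attempts run out; return all statuses seen."""
    statuses = []
    for attempt in range(1, attempts + 1):
        status = fetch(url)
        statuses.append(status)
        if status == 200:
            break
        if attempt < attempts:
            sleep(delay)  # assumed delay; cirros uses its own retry timing
    return statuses


if __name__ == "__main__":
    seen = probe()
    print("statuses:", seen)
    print("metadata reachable" if seen[-1] == 200 else "metadata NOT reachable")
```

This has to run from inside a VM on the affected network, since 169.254.169.254 is only reachable from the guest. A run of `None` values suggests the request never reached the service, while repeated 500s would match the error reported here.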
Vexxhost has fixed their issue with the metadata service, and jobs are now passing again:

https://prow.ci.openshift.org/job-history/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-openstack-4.8
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438