Description of problem:
OCP 4.4 on OSP16 fails with error:

INFO Cluster operator monitoring Progressing is True with RollOutInProgress: Rolling out the stack.
ERROR Cluster operator monitoring Degraded is True with UpdatingAlertmanagerFailed: Failed to rollout the stack. Error: running task Updating Alertmanager failed: waiting for Alertmanager Route to become ready failed: waiting for RouteReady of alertmanager-main: no status available for alertmanager-main
FATAL failed to initialize the cluster: Working towards 4.4.0-0.nightly-2020-03-01-212047: 99% complete

$ oc logs -n openshift-machine-api machine-api-controllers-8d874cb86-z8xp6 -c machine-controller
E0303 11:41:28.528641 1 controller.go:279] Failed to check if machine "ocpra-vbmm9-worker-2lbhd" exists: Error checking if instance exists (machine/actuator.go 346): Error getting a new instance service from the machine (machine/actuator.go 467): Failed to authenticate provider client: Get https://openstack.home.lab:13000/: dial tcp: lookup openstack.home.lab on 172.30.0.10:53: no such host

Config file DNS section:
platform:
  openstack:
    cloud: openstack
    computeFlavor: m1.large
    externalDNS: ['172.31.8.1','8.8.8.8']

Version-Release number of the following components:
OCP 4.4.0-0.nightly-2020-03-01-212047

According to the CoreDNS forward plugin documentation (https://coredns.io/plugins/forward/), the default policy is random. If CoreDNS picks the 8.8.8.8 resolver, that resolver cannot resolve the cluster hostname (here, openstack.home.lab). We should change the default so that the first resolver is used and the second one is tried only if the first one fails.

How reproducible:
Every time

Steps to Reproduce:
1. Deploy OCP 4.4 with two DNS entries, only one of which resolves the overcloud domain name.

Actual results:
The deployment fails.

Expected results:
Only the first DNS server is used unless it is not accessible.

Additional info:
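The requested behavior appears to match the forward plugin's "policy sequential" option. The stanza below is a minimal sketch only, assuming the two resolvers from the externalDNS list in this report. It is not copied from the cluster's actual Corefile and is not necessarily what the eventual fix ships.

```
# Hypothetical Corefile stanza, for illustration only.
. {
    forward . 172.31.8.1 8.8.8.8 {
        # Try upstreams in the order listed instead of the default random choice.
        policy sequential
    }
}
```

With this policy the second resolver should be consulted only when the first is considered unhealthy or unreachable. A negative answer (for example NXDOMAIN) from the first resolver is still a valid response and would not trigger a fallback.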
Verified on 4.5.0-0.nightly-2020-04-29-111042
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.5 image release advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409