+++ This bug was initially created as a clone of Bug #1976769 +++

Description of problem:

Link to the cluster - https://cloud.redhat.com/openshift/assisted-installer/clusters/28c7e3d1-90ae-47bc-9c59-ad9dc1260160

The assisted installer controller fails to resolve the cluster API in SNO installations. The only DNS entries configured in dnsmasq are:
1. api-int...
2. *.apps...

This causes two different issues:

1. The controller fails to apply custom manifests (OLM manifests):

time="2021-06-27T20:30:03Z" level=error msg="Failed to apply manifest file." error="failed executing bash [-c oc --kubeconfig=/tmp/controller-custom-manifests-114984170/kubeconfig-noingress apply -f /tmp/controller-custom-manifests-114984170/custom_manifests.yaml], Error exit status 1, LastOutput \"... -114984170/custom_manifests.yaml\": Get \"https://api.pc-openshift.hokd.pro-crafting.com:6443/api?timeout=32s\": dial tcp: lookup api.pc-openshift.hokd.pro-crafting.com on 136.243.34.170:53: no such host\""

This issue causes an installation failure.

2. The controller fails to run must-gather:

time="2021-06-27T20:46:50Z" level=info msg="failed executing bash [-c cd /tmp/controller-must-gather-logs-680691528 && oc --kubeconfig=/tmp/controller-must-gather-logs-680691528/kubeconfig-noingress adm must-gather --image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8f4cc4b4c95cfebdb701f8de519a0d5ac38111b4f173913fcf61956655072d65] ... (lots of forbidden errors) Unable to connect to the server: dial tcp: lookup api.pc-openshift.hokd.pro-crafting.com on 136.243.34.170:53: no such host\""

Version-Release number of selected component (if applicable):

How reproducible:
100% if you enable LSO or CNV when installing SNO

Steps to Reproduce:
1. Install SNO from here https://cloud.redhat.com/openshift/assisted-installer/clusters
2. Enable CNV and LSO
3.
Actual results:
Although the CVO status is available and the OCP installation completed successfully, the failure to apply the OLM manifests led to a timeout that failed the installation.

Events:
6/27/2021, 11:38:09 PM error Host static.170.34.243.136.clients.your-server.de: updated status from "installed" to "error" (Host is part of a cluster that failed to install)
6/27/2021, 11:38:03 PM critical Failed installing cluster pc-openshift. Reason: timed out
6/27/2021, 11:38:03 PM Updated status of cluster pc-openshift to error
6/27/2021, 11:24:02 PM Cluster version status: available message: Done applying 4.8.0-rc.0

Expected results:
Installation success

Additional info:

--- Additional comment from lgamliel on 20210629T10:16:01

When did we allow CNV/LSO on SNO?
We exposed it in the UI on 23/06/2021.

When did we deploy the release with this change: https://github.com/openshift/assisted-installer/pull/271
v1.0.21.3

What is our success rate (should be 0) when installing SNO with CNV/LSO since the above release?
In /etc/dnsmasq.d/single-node.conf I see:

address=/apps.titan...redhat.com/192.168.123.119
address=/api-int.titan...redhat.com/192.168.123.119
address=/api.titan...redhat.com/192.168.123.119

Shouldn't the first line start with a *? For example:

address=/*.apps.titan...redhat.com/192.168.123.119
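On the wildcard question: as far as I understand dnsmasq, an address=/domain/ip entry answers for the domain itself and for every name beneath it, so a leading * should not be needed for the apps entry. The sketch below is only a Python model of that suffix rule, not dnsmasq code. The domain example.com is a hypothetical stand-in for the elided cluster domain, and parse_address_lines and lookup are illustrative helper names. It also shows why the api name failed to resolve before an api entry was present.

```python
def parse_address_lines(conf_text):
    """Parse dnsmasq-style 'address=/domain/ip' lines into (domain, ip) pairs."""
    entries = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line.startswith("address=/"):
            continue
        # Expected shape: address=/<domain>/<ip>
        _, domain, ip = line.split("/", 2)
        entries.append((domain.lower().rstrip("."), ip))
    return entries


def lookup(name, entries):
    """Return the IP of the most specific entry whose domain is a suffix of name.

    This models the assumed rule: an entry matches the domain itself and any
    name under it. Returns None when no entry matches.
    """
    name = name.lower().rstrip(".")
    best = None
    for domain, ip in entries:
        if name == domain or name.endswith("." + domain):
            if best is None or len(domain) > len(best[0]):
                best = (domain, ip)
    return best[1] if best else None


# Hypothetical config resembling the state described in the bug (no api entry).
BEFORE_FIX = """
address=/apps.example.com/192.168.123.119
address=/api-int.example.com/192.168.123.119
"""

# Hypothetical config resembling the file quoted above (api entry added).
AFTER_FIX = BEFORE_FIX + "address=/api.example.com/192.168.123.119\n"

if __name__ == "__main__":
    for label, conf in (("before", BEFORE_FIX), ("after", AFTER_FIX)):
        entries = parse_address_lines(conf)
        for host in ("api.example.com",
                     "api-int.example.com",
                     "console-openshift-console.apps.example.com"):
            print(label, host, lookup(host, entries))
```

With the "before" config, the api name has no matching entry, which corresponds to the "no such host" errors in the controller logs. A route under apps matches in both configs without any * in the entry.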
Verified that the exposed routes under *.apps are reachable.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days