Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1977292

Summary: [4.8.0] [SNO] No DNS to cluster API from assisted-installer-controller
Product: OpenShift Container Platform
Reporter: Ronnie Lazar <alazar>
Component: assisted-installer
Assignee: Igal Tsoiref <itsoiref>
assisted-installer sub component: Installer
QA Contact: Udi Kalifon <ukalifon>
Status: CLOSED DUPLICATE
Docs Contact:
Severity: medium    
Priority: high CC: alazar, aos-bugs, asegurap, ercohen, itsoiref, lgamliel, mko
Version: 4.8   
Target Milestone: ---   
Target Release: 4.8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1976769
Environment:
Last Closed: 2021-06-29 15:44:09 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1977352    
Bug Blocks:    

Description Ronnie Lazar 2021-06-29 12:12:51 UTC
+++ This bug was initially created as a clone of Bug #1976769 +++

Description of problem:
Link to the cluster - https://cloud.redhat.com/openshift/assisted-installer/clusters/28c7e3d1-90ae-47bc-9c59-ad9dc1260160

The assisted installer controller is failing to resolve the cluster API in case of SNO installation.
The only DNS entries configured in dnsmasq are:
1.  api-int...
2.  *.apps...
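
With only those two entries, any lookup of api.&lt;cluster-domain&gt; gets "no such host". A minimal Python sketch of the mismatch (this treats each dnsmasq address=/domain/ip entry as a suffix match, which is an assumption of the illustration; the hostnames are taken from the logs below):

```python
# Sketch of the DNS gap described above. Assumption: each dnsmasq
# address=/domain/ip entry answers for the domain itself and any subdomain.
DOMAIN = "pc-openshift.hokd.pro-crafting.com"  # cluster domain from the logs

# The only entries the report says dnsmasq was configured with:
ENTRIES = [f"api-int.{DOMAIN}",  # 1. api-int...
           f"apps.{DOMAIN}"]     # 2. *.apps... (wildcard = any subdomain)

def resolves(name: str) -> bool:
    """Would this dnsmasq configuration answer for `name`?"""
    return any(name == e or name.endswith("." + e) for e in ENTRIES)

print(resolves(f"api-int.{DOMAIN}"))   # True
print(resolves(f"foo.apps.{DOMAIN}"))  # True
print(resolves(f"api.{DOMAIN}"))       # False -> "no such host" in the logs
```

The controller reaches the API server via api.&lt;cluster-domain&gt;, which matches neither entry; that lines up with the "dial tcp: lookup api.pc-openshift.hokd.pro-crafting.com ... no such host" errors below.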

This causes 2 different issues:
1. The controller fails to apply custom manifests (OLM manifests):

time="2021-06-27T20:30:03Z" level=error msg="Failed to apply manifest file." error="failed executing bash [-c oc --kubeconfig=/tmp/controller-custom-manifests-114984170/kubeconfig-noingress apply -f /tmp/controller-custom-manifests-114984170/custom_manifests.yaml], Error exit status 1, LastOutput \"... -114984170/custom_manifests.yaml\": Get \"https://api.pc-openshift.hokd.pro-crafting.com:6443/api?timeout=32s\": dial tcp: lookup api.pc-openshift.hokd.pro-crafting.com on 136.243.34.170:53: no such host\""

This issue causes an installation failure.

2. The controller fails to run must-gather:
 
time="2021-06-27T20:46:50Z" level=info msg="failed executing bash [-c cd /tmp/controller-must-gather-logs-680691528 && oc --kubeconfig=/tmp/controller-must-gather-logs-680691528/kubeconfig-noingress adm must-gather --image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8f4cc4b4c95cfebdb701f8de519a0d5ac38111b4f173913fcf61956655072d65]
... (lots of forbidden errors)
Unable to connect to the server: dial tcp: lookup api.pc-openshift.hokd.pro-crafting.com on 136.243.34.170:53: no such host\""
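
For reference, a hedged sketch of what the missing record could look like in dnsmasq terms; the hostnames come from the logs above, but the file path, the node IP placeholder, and the idea that a single address line is the right fix are assumptions, not a verified change:

```
# /etc/dnsmasq.d/single-node.conf  (path is an assumption)
address=/api-int.pc-openshift.hokd.pro-crafting.com/<node-ip>   # present today
address=/apps.pc-openshift.hokd.pro-crafting.com/<node-ip>      # present today, covers *.apps
address=/api.pc-openshift.hokd.pro-crafting.com/<node-ip>       # missing entry behind both failures
```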

Version-Release number of selected component (if applicable):


How reproducible:
100% if you enable LSO or CNV when installing SNO

Steps to Reproduce:
1. Install SNO from here https://cloud.redhat.com/openshift/assisted-installer/clusters
2. Enable CNV and LSO
3.

Actual results:
Although the CVO status is Available and the OCP installation completed successfully, the failure to apply the OLM manifests led to a timeout that failed the installation.

Events:
6/27/2021, 11:38:09 PM	
error
 Host static.170.34.243.136.clients.your-server.de: updated status from "installed" to "error" (Host is part of a cluster that failed to install)
6/27/2021, 11:38:03 PM	
critical
 Failed installing cluster pc-openshift. Reason: timed out
6/27/2021, 11:38:03 PM	Updated status of cluster pc-openshift to error
6/27/2021, 11:24:02 PM	Cluster version status: available message: Done applying 4.8.0-rc.0

Expected results:

Installation success

Additional info:

--- Additional comment from lgamliel on 20210629T10:16:01

When did we allow CNV/LSO on SNO?
We exposed it in the UI on 23/06/2021.

When did we deploy the release with this change: https://github.com/openshift/assisted-installer/pull/271
v1.0.21.3

What is our success rate (should be 0) when installing SNO with CNV/LSO since the above release?

Comment 1 Ronnie Lazar 2021-06-29 15:44:09 UTC

*** This bug has been marked as a duplicate of bug 1970567 ***