Summary: ipv6 disconnected SNO CRD deployment: Hive reports success status but the ClusterDeployment keeps reporting installed: false
Product: OpenShift Container Platform
Reporter: bjacot
Component: assisted-installer
assisted-installer sub component: Deployment Operator
Assignee: Daniel Erez <derez>
QA Contact: bjacot
Status: VERIFIED
Priority: urgent
CC: alazar, aos-bugs, keyoung, mfilanov, sasha
Fixed In Version: OCP-Metal-v188.8.131.52
Doc Type: If docs needed, set a value
Description bjacot 2021-05-04 18:26:13 UTC
I have completed an IPv6 disconnected SNO deployment via CRDs. I am not doing ZTP; I am manually mounting and booting the ISO. The SNO deployment succeeded according to Hive's condition message, but the ClusterDeployment status is still showing installed: false. Also, the kubeconfig of the SNO deployment is not showing up in the secrets in the assisted-installer namespace.

[kni@provisionhost-0-0 ~]$ oc get clusterdeployments.hive.openshift.io -n assisted-installer -o=custom-columns='STATUS:status.conditions[-1].message'
STATUS
The installation has completed: Cluster is installed

[kni@provisionhost-0-0 ~]$ oc get cd sno-cluster-deployment -o json | jq -r '.spec.installed'
false

##### from the SNO vm #####
[root@sno ~]# export KUBECONFIG=/sysroot/ostree/deploy/rhcos/var/lib/kubelet/kubeconfig
[root@sno ~]# oc get nodes
NAME   STATUS   ROLES           AGE   VERSION
sno    Ready    master,worker   59m   v1.21.0-rc.0+6825c59

[root@sno ~]# oc get pods -n assisted-installer
NAME                                  READY   STATUS      RESTARTS   AGE
assisted-installer-controller-n4spz   0/1     Completed   0          69m

#### see these pods in Error ####
[root@sno ~]# oc get pods -A | grep -v Run | grep -v Compl
NAMESPACE                           NAME              READY   STATUS   RESTARTS   AGE
openshift-kube-controller-manager   installer-6-sno   0/1     Error    0          50m
openshift-kube-scheduler            installer-7-sno   0/1     Error    0          50m
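For reference, the two conflicting signals above can be pulled in a single query; a minimal sketch, assuming the ClusterDeployment name sno-cluster-deployment and the assisted-installer namespace from the output above:

# Compare Hive's last condition message with spec.installed on the same ClusterDeployment.
oc get clusterdeployment sno-cluster-deployment -n assisted-installer -o json \
  | jq '{installed: .spec.installed, lastCondition: .status.conditions[-1].message}'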
Comment 1 Michael Filanov 2021-05-05 13:54:54 UTC
The root cause we found is an extra space in the ssh-key. This space is trimmed in the backend, so the controller sees the trimmed value as a change and tries to update the backend again, causing a reconcile loop. There is a good reason for the trim: it resolves a bug where a newline in the ssh key makes the boot fail. So the solution should be for the controller to match the backend behavior and trim the ssh key before comparing it with the backend. In addition, we could try to avoid calling updates in specific cases or specific states.
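A minimal shell sketch of the loop described above (the key values are placeholders, not taken from the report):

# The CR carries the key with a trailing space; the backend stores it trimmed.
spec_key='ssh-rsa AAAA...example user@host '
backend_key='ssh-rsa AAAA...example user@host'

# A naive comparison always reports a difference, so the controller keeps issuing updates.
[ "$spec_key" = "$backend_key" ] || echo "diff detected -> update backend -> reconcile again"

# Trimming before comparing (the proposed controller-side fix) breaks the loop.
trimmed_key="$(printf '%s' "$spec_key" | sed -e 's/[[:space:]]*$//')"
[ "$trimmed_key" = "$backend_key" ] && echo "in sync after trim"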
Comment 2 bjacot 2021-06-11 18:32:47 UTC
I was able to perform an SNO deployment after updating my automation to trim the blank space from the ssh key.
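For illustration, a minimal sketch of that kind of automation-side trim (the key file path is an assumption, not from the report):

# Strip stray newlines and leading/trailing whitespace from the public key
# before templating it into the install CRs.
SSH_PUB_KEY="$(tr -d '\n' < ~/.ssh/id_rsa.pub | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')"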