Description of problem: Now that node resolution is not required for bootstrapping, we no longer need to use mDNS to build the list of nodes. We can simply wait until the API is up and retrieve the node list from it. We need to remove the mDNS CoreDNS plugin from on-prem platforms in the MCO, and remove the mDNS security group rules in the installer.
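A quick way to sanity-check the removal is to grep the rendered CoreDNS config for any remaining mdns plugin stanza. The snippet below is a minimal, self-contained sketch that runs against an illustrative sample Corefile; on a live node, the same grep would target /etc/kubernetes/static-pod-resources/coredns/Corefile.tmpl instead.

```shell
# Minimal sketch: check a Corefile for a leftover mdns plugin stanza.
# The sample content below is illustrative only; on a real node, grep
# /etc/kubernetes/static-pod-resources/coredns/Corefile.tmpl directly.
corefile=$(mktemp)
cat > "$corefile" <<'EOF'
. {
    errors
    health :18080
    forward . 10.0.0.2 {
        policy sequential
    }
    cache 30
    reload
}
EOF
if grep -qi 'mdns' "$corefile"; then
    echo "mdns plugin still referenced"
else
    echo "mdns plugin removed"
fi
rm -f "$corefile"
```

Since the sample config contains no mdns stanza, the check reports it as removed; pointing it at a pre-fix Corefile would flag the stanza instead.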
Verified from the MCO side that the coredns files were changed and removed.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-06-13-101614   True        False         77m     Cluster version is 4.8.0-0.nightly-2021-06-13-101614

$ oc get nodes
NAME                               STATUS   ROLES    AGE   VERSION
mnguyen061410-7m2fq-master-0       Ready    master   98m   v1.21.0-rc.0+120883f
mnguyen061410-7m2fq-master-1       Ready    master   96m   v1.21.0-rc.0+120883f
mnguyen061410-7m2fq-master-2       Ready    master   98m   v1.21.0-rc.0+120883f
mnguyen061410-7m2fq-worker-nsf26   Ready    worker   90m   v1.21.0-rc.0+120883f
mnguyen061410-7m2fq-worker-qrhvq   Ready    worker   90m   v1.21.0-rc.0+120883f
mnguyen061410-7m2fq-worker-s4jjd   Ready    worker   90m   v1.21.0-rc.0+120883f

$ oc debug node/mnguyen061410-7m2fq-master-0
Starting pod/mnguyen061410-7m2fq-master-0-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# cat /etc/kubernetes/static-pod-resources/coredns/Corefile.tmpl
. {
    errors
    health :18080
    forward . {{- range $upstream := .DNSUpstreams}} {{$upstream}}{{- end}} {
        policy sequential
    }
    cache 30
    reload
    template IN {{ .Cluster.IngressVIPRecordType }} mnguyen061410.qe.devcluster.openshift.com {
        match .*.apps.mnguyen061410.qe.devcluster.openshift.com
        answer "{{"{{ .Name }}"}} 60 in {{"{{ .Type }}"}} 172.31.248.84"
        fallthrough
    }
    template IN {{ .Cluster.IngressVIPEmptyType }} mnguyen061410.qe.devcluster.openshift.com {
        match .*.apps.mnguyen061410.qe.devcluster.openshift.com
        fallthrough
    }
    template IN {{ .Cluster.APIVIPRecordType }} mnguyen061410.qe.devcluster.openshift.com {
        match api.mnguyen061410.qe.devcluster.openshift.com
        answer "{{"{{ .Name }}"}} 60 in {{"{{ .Type }}"}} 172.31.248.83"
        fallthrough
    }
    template IN {{ .Cluster.APIVIPEmptyType }} mnguyen061410.qe.devcluster.openshift.com {
        match api.mnguyen061410.qe.devcluster.openshift.com
        fallthrough
    }
    template IN {{ .Cluster.APIVIPRecordType }} mnguyen061410.qe.devcluster.openshift.com {
        match api-int.mnguyen061410.qe.devcluster.openshift.com
        answer "{{"{{ .Name }}"}} 60 in {{"{{ .Type }}"}} 172.31.248.83"
        fallthrough
    }
    template IN {{ .Cluster.APIVIPEmptyType }} mnguyen061410.qe.devcluster.openshift.com {
        match api-int.mnguyen061410.qe.devcluster.openshift.com
        fallthrough
    }
    hosts {
        {{- range .Cluster.NodeAddresses }}
        {{ .Address }} {{ .Name }} {{ .Name }}.{{ $.Cluster.Name }}.{{ $.Cluster.Domain }}
        {{- end }}
        fallthrough
    }
}
sh-4.4# cat /etc/kubernetes/manifests/coredns.yaml
kind: Pod
apiVersion: v1
metadata:
  name: coredns
  namespace: openshift-vsphere-infra
  creationTimestamp:
  deletionGracePeriodSeconds: 65
  labels:
    app: vsphere-infra-mdns
spec:
  volumes:
  - name: resource-dir
    hostPath:
      path: "/etc/kubernetes/static-pod-resources/coredns"
  - name: kubeconfig
    hostPath:
      path: "/var/lib/kubelet"
  - name: conf-dir
    hostPath:
      path: "/etc/coredns"
  - name: nm-resolv
    hostPath:
      path: "/var/run/NetworkManager"
  initContainers:
  - name: render-config-coredns
    image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:99f031207070be623af0cbda37e36667cbd11dae917a88479c0c6bb375dac282
    command:
    - runtimecfg
    - render
    - "/var/lib/kubelet/kubeconfig"
    - "--api-vip"
    - "172.31.248.83"
    - "--ingress-vip"
    - "172.31.248.84"
    - "/config"
    - "--out-dir"
    - "/etc/coredns"
    resources: {}
    volumeMounts:
    - name: kubeconfig
      mountPath: "/var/lib/kubelet"
    - name: resource-dir
      mountPath: "/config"
    - name: conf-dir
      mountPath: "/etc/coredns"
    imagePullPolicy: IfNotPresent
  containers:
  - name: coredns
    securityContext:
      privileged: true
    image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:4b74076f1f7fa27993b9009ed49aefcdb4df9125efff8e44e66e38b9512947ab
    args:
    - "--conf"
    - "/etc/coredns/Corefile"
    resources:
      requests:
        cpu: 100m
        memory: 200Mi
    volumeMounts:
    - name: conf-dir
      mountPath: "/etc/coredns"
    livenessProbe:
      httpGet:
        path: /health
        port: 18080
        scheme: HTTP
      initialDelaySeconds: 60
      timeoutSeconds: 5
      successThreshold: 1
      failureThreshold: 5
    terminationMessagePolicy: FallbackToLogsOnError
    imagePullPolicy: IfNotPresent
  - name: coredns-monitor
    securityContext:
      privileged: true
    image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:99f031207070be623af0cbda37e36667cbd11dae917a88479c0c6bb375dac282
    command:
    - corednsmonitor
    - "/var/lib/kubelet/kubeconfig"
    - "/config/Corefile.tmpl"
    - "/etc/coredns/Corefile"
    - "--api-vip"
    - "172.31.248.83"
    - "--ingress-vip"
    - "172.31.248.84"
    resources:
      requests:
        cpu: 100m
        memory: 200Mi
    volumeMounts:
    - name: kubeconfig
      mountPath: "/var/lib/kubelet"
    - name: resource-dir
      mountPath: "/config"
    - name: conf-dir
      mountPath: "/etc/coredns"
    - name: nm-resolv
      mountPath: "/var/run/NetworkManager"
    imagePullPolicy: IfNotPresent
  hostNetwork: true
  tolerations:
  - operator: Exists
  priorityClassName: system-node-critical
status: {}
sh-4.4# cd /etc/coredns
sh-4.4# ls
Corefile
sh-4.4#
Verified on nightly build 4.8.0-0.nightly-2021-06-14-145150 on the vSphere platform and passed. The bootstrap server is removed automatically after bootstrap completes, and installation is successful.

On the bootstrap server, mdns is removed from bootkube.sh:

# cat /usr/local/bin/bootkube.sh | grep -i mdns
#

It is also removed from the config file Corefile:

# cat /etc/kubernetes/static-pod-resources/coredns/Corefile.tmpl
. {
    errors
    health :18080
    forward . {{- range $upstream := .DNSUpstreams}} {{$upstream}}{{- end}} {
        policy sequential
    }
    cache 30
    reload
    template IN {{ .Cluster.IngressVIPRecordType }} jima1946506.qe.devcluster.openshift.com {
        match .*.apps.jima1946506.qe.devcluster.openshift.com
        answer "{{"{{ .Name }}"}} 60 in {{"{{ .Type }}"}} 172.31.248.88"
        fallthrough
    }
    template IN {{ .Cluster.IngressVIPEmptyType }} jima1946506.qe.devcluster.openshift.com {
        match .*.apps.jima1946506.qe.devcluster.openshift.com
        fallthrough
    }
    template IN {{ .Cluster.APIVIPRecordType }} jima1946506.qe.devcluster.openshift.com {
        match api.jima1946506.qe.devcluster.openshift.com
        answer "{{"{{ .Name }}"}} 60 in {{"{{ .Type }}"}} 172.31.248.87"
        fallthrough
    }
    template IN {{ .Cluster.APIVIPEmptyType }} jima1946506.qe.devcluster.openshift.com {
        match api.jima1946506.qe.devcluster.openshift.com
        fallthrough
    }
    template IN {{ .Cluster.APIVIPRecordType }} jima1946506.qe.devcluster.openshift.com {
        match api-int.jima1946506.qe.devcluster.openshift.com
        answer "{{"{{ .Name }}"}} 60 in {{"{{ .Type }}"}} 172.31.248.87"
        fallthrough
    }
    template IN {{ .Cluster.APIVIPEmptyType }} jima1946506.qe.devcluster.openshift.com {
        match api-int.jima1946506.qe.devcluster.openshift.com
        fallthrough
    }
}

# crictl exec cdef475f0b575 cat /etc/coredns/Corefile
. {
    errors
    health :18080
    forward . 10.3.192.12 {
        policy sequential
    }
    cache 30
    reload
    template IN A jima1946506.qe.devcluster.openshift.com {
        match .*.apps.jima1946506.qe.devcluster.openshift.com
        answer "{{ .Name }} 60 in {{ .Type }} 172.31.248.88"
        fallthrough
    }
    template IN AAAA jima1946506.qe.devcluster.openshift.com {
        match .*.apps.jima1946506.qe.devcluster.openshift.com
        fallthrough
    }
    template IN A jima1946506.qe.devcluster.openshift.com {
        match api.jima1946506.qe.devcluster.openshift.com
        answer "{{ .Name }} 60 in {{ .Type }} 172.31.248.87"
        fallthrough
    }
    template IN AAAA jima1946506.qe.devcluster.openshift.com {
        match api.jima1946506.qe.devcluster.openshift.com
        fallthrough
    }
    template IN A jima1946506.qe.devcluster.openshift.com {
        match api-int.jima1946506.qe.devcluster.openshift.com
        answer "{{ .Name }} 60 in {{ .Type }} 172.31.248.87"
        fallthrough
    }
    template IN AAAA jima1946506.qe.devcluster.openshift.com {
        match api-int.jima1946506.qe.devcluster.openshift.com
        fallthrough
    }
}
Verified on 4.8.0-0.nightly-2021-06-14-145150 over OSP16.1 (RHOS-16.1-RHEL-8-20210323.n.0). Installation was performed correctly:

time="2021-06-15T04:10:33-04:00" level=debug msg="Time elapsed per stage:"
time="2021-06-15T04:10:33-04:00" level=debug msg="    Infrastructure: 2m19s"
time="2021-06-15T04:10:33-04:00" level=debug msg="Bootstrap Complete: 10m52s"
time="2021-06-15T04:10:33-04:00" level=debug msg="               API: 1m15s"
time="2021-06-15T04:10:33-04:00" level=debug msg=" Bootstrap Destroy: 36s"
time="2021-06-15T04:10:33-04:00" level=debug msg=" Cluster Operators: 24m58s"
time="2021-06-15T04:10:33-04:00" level=info msg="Time elapsed: 40m19s"

(shiftstack) [stack@undercloud-0 ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-06-14-145150   True        False         38m     Cluster version is 4.8.0-0.nightly-2021-06-14-145150

(shiftstack) [stack@undercloud-0 ~]$ oc get clusteroperators
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.8.0-0.nightly-2021-06-14-145150   True        False         False      39m
baremetal                                  4.8.0-0.nightly-2021-06-14-145150   True        False         False      61m
cloud-credential                           4.8.0-0.nightly-2021-06-14-145150   True        False         False      71m
cluster-autoscaler                         4.8.0-0.nightly-2021-06-14-145150   True        False         False      64m
config-operator                            4.8.0-0.nightly-2021-06-14-145150   True        False         False      69m
console                                    4.8.0-0.nightly-2021-06-14-145150   True        False         False      46m
csi-snapshot-controller                    4.8.0-0.nightly-2021-06-14-145150   True        False         False      65m
dns                                        4.8.0-0.nightly-2021-06-14-145150   True        False         False      62m
etcd                                       4.8.0-0.nightly-2021-06-14-145150   True        False         False      66m
image-registry                             4.8.0-0.nightly-2021-06-14-145150   True        False         False      54m
ingress                                    4.8.0-0.nightly-2021-06-14-145150   True        False         False      53m
insights                                   4.8.0-0.nightly-2021-06-14-145150   True        False         False      60m
kube-apiserver                             4.8.0-0.nightly-2021-06-14-145150   True        False         False      53m
kube-controller-manager                    4.8.0-0.nightly-2021-06-14-145150   True        False         False      65m
kube-scheduler                             4.8.0-0.nightly-2021-06-14-145150   True        False         False      65m
kube-storage-version-migrator              4.8.0-0.nightly-2021-06-14-145150   True        False         False      67m
machine-api                                4.8.0-0.nightly-2021-06-14-145150   True        False         False      58m
machine-approver                           4.8.0-0.nightly-2021-06-14-145150   True        False         False      65m
machine-config                             4.8.0-0.nightly-2021-06-14-145150   True        False         False      53m
marketplace                                4.8.0-0.nightly-2021-06-14-145150   True        False         False      64m
monitoring                                 4.8.0-0.nightly-2021-06-14-145150   True        False         False      51m
network                                    4.8.0-0.nightly-2021-06-14-145150   True        False         False      68m
node-tuning                                4.8.0-0.nightly-2021-06-14-145150   True        False         False      65m
openshift-apiserver                        4.8.0-0.nightly-2021-06-14-145150   True        False         False      63m
openshift-controller-manager               4.8.0-0.nightly-2021-06-14-145150   True        False         False      66m
openshift-samples                          4.8.0-0.nightly-2021-06-14-145150   True        False         False      61m
operator-lifecycle-manager                 4.8.0-0.nightly-2021-06-14-145150   True        False         False      64m
operator-lifecycle-manager-catalog         4.8.0-0.nightly-2021-06-14-145150   True        False         False      64m
operator-lifecycle-manager-packageserver   4.8.0-0.nightly-2021-06-14-145150   True        False         False      63m
service-ca                                 4.8.0-0.nightly-2021-06-14-145150   True        False         False      67m
storage                                    4.8.0-0.nightly-2021-06-14-145150   True        False         False      65m

(shiftstack) [stack@undercloud-0 ~]$ openstack server list
+--------------------------------------+-----------------------------+--------+-------------------------------------+--------------------+--------+
| ID                                   | Name                        | Status | Networks                            | Image              | Flavor |
+--------------------------------------+-----------------------------+--------+-------------------------------------+--------------------+--------+
| 36faaadf-b5a8-4d80-839f-7370c8607a7d | ostest-74m7l-worker-0-4gsft | ACTIVE | ostest-74m7l-openshift=10.196.2.14  | ostest-74m7l-rhcos |        |
| ac222a24-d67b-4c84-ac6c-9d3dabdb25c9 | ostest-74m7l-worker-0-57mfl | ACTIVE | ostest-74m7l-openshift=10.196.0.168 | ostest-74m7l-rhcos |        |
| 5318d079-b085-43cc-884d-7a6492eebe1e | ostest-74m7l-worker-0-pkqkd | ACTIVE | ostest-74m7l-openshift=10.196.3.234 | ostest-74m7l-rhcos |        |
| 64be49fb-3eab-4f44-80a6-5bd4a782261a | ostest-74m7l-master-2       | ACTIVE | ostest-74m7l-openshift=10.196.3.209 | ostest-74m7l-rhcos |        |
| ef9bf712-0089-4720-9d5c-0fa233a5a966 | ostest-74m7l-master-1       | ACTIVE | ostest-74m7l-openshift=10.196.0.76  | ostest-74m7l-rhcos |        |
| 54384bf7-4735-46c6-a389-0ebec7b567fc | ostest-74m7l-master-0       | ACTIVE | ostest-74m7l-openshift=10.196.1.235 | ostest-74m7l-rhcos |        |
+--------------------------------------+-----------------------------+--------+-------------------------------------+--------------------+--------+
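On the installer side, the fix also drops the mDNS (UDP port 5353) security group rules. A hedged way to check this on OpenStack is to list the cluster's security group rules and look for port 5353. The snippet below simulates that check on sample data, since the real group names (here the ostest-74m7l-* machines) depend on the cluster's infra ID; on a live deployment one would pipe in the output of `openstack security group rule list` instead.

```shell
# Sketch: scan security-group-rule output for mDNS (UDP 5353) rules.
# $rules stands in for real CLI output such as
#   openstack security group rule list <group> --long
# The sample rows below are illustrative only.
rules='ingress ipv4 udp 6443:6443
ingress ipv4 udp 4789:4789
ingress ipv4 tcp 22623:22623'
if printf '%s\n' "$rules" | grep -q '5353'; then
    echo "mdns security group rule still present"
else
    echo "no mdns security group rules found"
fi
```

With no 5353 entry in the sample rules, the check reports that no mDNS rules remain, which is the expected post-fix state.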
Closing as verified on 4.8.0-0.nightly-2021-06-14-145150
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438