This bug was initially created as a copy of Bug #1882191

I am copying this bug because:

The root cause of this bug is that they had a registry with a certificate that fails validation due to this change. The bug that this was copied from resolves part of that problem by setting the GODEBUG environment variable system-wide via systemd. However, that still leaves one gap: Ignition, which runs in the initrd and may similarly interface with external endpoints whose certificates fail this validation. Therefore, we should set the environment variable in the initrd as well.

This seems like a relatively small gap to close, so I don't believe that this should be a 4.6 GA blocker, but it'd be nice to get it fixed in early 4.6.z.

Description of problem:
Performing an OCP 4.6 installation in a restricted network on zVM fails.

Version-Release number of selected component (if applicable):
RHCOS 4.6.0-0.nightly-s390x-2020-09-10-112115
OCP 4.6.0-0.nightly-s390x-2020-09-22-223822

How reproducible:
Consistently

Steps to Reproduce:
1. Follow steps to configure the mirror host on bastion: https://docs.openshift.com/container-platform/4.5/installing/install_config/installing-restricted-networks-preparations.html
2. Install cluster using restricted network steps: https://docs.openshift.com/container-platform/4.5/installing/installing_bare_metal/installing-restricted-networks-bare-metal.html#installing-restricted-networks-bare-metal
3. IPL the bootstrap and cluster nodes.

Actual results:
Bootstrap, master and worker nodes all start. However, the master nodes never become Ready:

[root@OSPAMGR2 ~]# oc get nodes
NAME                                   STATUS     ROLES    AGE     VERSION
master-0.ospamgr2-sep22.zvmocp.notld   NotReady   master   4h1m    v1.19.0+8a39924
master-1.ospamgr2-sep22.zvmocp.notld   NotReady   master   3h56m   v1.19.0+8a39924
master-2.ospamgr2-sep22.zvmocp.notld   NotReady   master   3h48m   v1.19.0+8a39924

This prevents the worker nodes from starting.

The bootkube.service reports this:

Sep 23 23:02:41 bootstrap-0.ospamgr2-sep22.zvmocp.notld bootkube.sh[19435]: E0923 23:02:41.319432 1 reflector.go:251] github.com/openshift/cluster-bootstrap/pkg/start/status.go:66: Failed to watch *v1.Pod: Get "https://localhost:6443/api/v1/pods?watch=true": dial tcp [::1]:6443: connect: connection refused
Sep 23 23:02:42 bootstrap-0.ospamgr2-sep22.zvmocp.notld bootkube.sh[19435]: E0923 23:02:42.325119 1 reflector.go:134] github.com/openshift/cluster-bootstrap/pkg/start/status.go:66: Failed to list *v1.Pod: Get "https://localhost:6443/api/v1/pods": dial tcp [::1]:6443: connect: connection refused
Sep 23 23:02:43 bootstrap-0.ospamgr2-sep22.zvmocp.notld bootkube.sh[19435]: E0923 23:02:43.327963 1 reflector.go:134] github.com/openshift/cluster-bootstrap/pkg/start/status.go:66: Failed to list *v1.Pod: Get "https://localhost:6443/api/v1/pods": dial tcp [::1]:6443: connect: connection refused
Sep 23 23:02:44 bootstrap-0.ospamgr2-sep22.zvmocp.notld bootkube.sh[19435]: E0923 23:02:44.332599 1 reflector.go:134] github.com/openshift/cluster-bootstrap/pkg/start/status.go:66: Failed to list *v1.Pod: Get "https://localhost:6443/api/v1/pods": dial tcp [::1]:6443: connect: connection refused

Expected results:
Master and worker nodes start successfully

Additional info:
Sorry, I forgot to copy/paste what "this change" is. I'm referring to https://golang.google.cn/doc/go1.15#commonname
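For readers unfamiliar with the Go 1.15 CommonName change, the following is a minimal sketch of the behavior, not code from Ignition or the registry in question. It builds two throwaway self-signed certificates and checks hostname verification on each. The helper name selfSigned and the host registry.example.com are made up for illustration. On Go 1.15 and 1.16 the CN-only result can be flipped by running with GODEBUG=x509ignoreCN=0; later Go releases removed that escape hatch.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// selfSigned is a hypothetical helper: it creates a short-lived self-signed
// certificate with the given Subject CommonName and optional DNS SANs.
func selfSigned(commonName string, dnsNames []string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: commonName},
		DNSNames:     dnsNames, // nil means the certificate carries no SANs
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	const host = "registry.example.com" // illustrative hostname only

	// Certificate that identifies the host only through the legacy CN field.
	cnOnly, err := selfSigned(host, nil)
	if err != nil {
		panic(err)
	}
	// Certificate that also lists the host as a DNS SAN.
	withSAN, err := selfSigned(host, []string{host})
	if err != nil {
		panic(err)
	}

	// VerifyHostname returns nil on a match and an error otherwise.
	fmt.Println("CN only: ", cnOnly.VerifyHostname(host))
	fmt.Println("with SAN:", withSAN.VerifyHostname(host))
}
```

With a default environment, the CN-only certificate is expected to be rejected and the SAN certificate accepted, which is the failure mode a mirror registry with an older certificate would hit.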
https://github.com/openshift/machine-config-operator/pull/2141#issuecomment-704989651 is where the discussion as to this need arose
@Benjamin do you think it is reasonable to set the GODEBUG variable for just Ignition in the initrd?
Setting UpcomingSprint keyword as there are other higher priority tasks and issues being worked on.
Yes, I do.
xref https://github.com/openshift/oc/pull/628#issuecomment-725698791 Note this requires a bootimage update; we already have a request for one to pull in the fix for https://github.com/coreos/fedora-coreos-config/pull/733 too.
I do not have access to a z system. I verified that OCP 4.7.0-0.nightly-2020-11-24-113830 has the dracut module and RHCOS 47.83.202011240323-0 has the environment variable set in the initramfs.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2020-11-24-113830   True        False         51m     Cluster version is 4.7.0-0.nightly-2020-11-24-113830

$ oc get nodes
NAME                                         STATUS   ROLES    AGE   VERSION
ip-10-0-134-48.us-west-2.compute.internal    Ready    worker   67m   v1.19.2+13d6aa9
ip-10-0-146-93.us-west-2.compute.internal    Ready    master   76m   v1.19.2+13d6aa9
ip-10-0-169-22.us-west-2.compute.internal    Ready    worker   67m   v1.19.2+13d6aa9
ip-10-0-177-164.us-west-2.compute.internal   Ready    master   75m   v1.19.2+13d6aa9
ip-10-0-214-17.us-west-2.compute.internal    Ready    worker   68m   v1.19.2+13d6aa9
ip-10-0-221-212.us-west-2.compute.internal   Ready    master   76m   v1.19.2+13d6aa9

$ oc debug node/ip-10-0-134-48.us-west-2.compute.internal
Starting pod/ip-10-0-134-48us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# ls
bin   dev  home  lib64  mnt  ostree  root  sbin  sys      tmp  var
boot  etc  lib   media  opt  proc    run   srv   sysroot  usr
sh-4.4# cat /usr/lib/dracut/modules.d/10
10coreos-sysctl/    10i18n/             10ignition-godebug/
sh-4.4# cat /usr/lib/dracut/modules.d/10ignition-godebug/*
# https://bugzilla.redhat.com/show_bug.cgi?id=1886134
# Because Ignition which runs in the initrd may interface with external endpoints,
# we should set the environment variable in the initrd
[Manager]
DefaultEnvironment=GODEBUG=x509ignoreCN=0
#!/bin/bash
# -*- mode: shell-script; indent-tabs-mode: nil; sh-basic-offset: 4; -*-
# ex: ts=8 sw=4 sts=4 et filetype=sh

depends() {
    echo systemd
}

install() {
    inst_simple "$moddir/10-default-env-godebug.conf" \
        "/etc/systemd/system.conf.d/10-default-env-godebug.conf"
}
sh-4.4# exit
exit
sh-4.2# exit
exit

Removing debug pod ...

$ oc debug node/ip-10-0-146-93.us-west-2.compute.internal
Starting pod/ip-10-0-146-93us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# cat /usr/lib/dracut/modules.d/10ignition-godebug/*
# https://bugzilla.redhat.com/show_bug.cgi?id=1886134
# Because Ignition which runs in the initrd may interface with external endpoints,
# we should set the environment variable in the initrd
[Manager]
DefaultEnvironment=GODEBUG=x509ignoreCN=0
#!/bin/bash
# -*- mode: shell-script; indent-tabs-mode: nil; sh-basic-offset: 4; -*-
# ex: ts=8 sw=4 sts=4 et filetype=sh

depends() {
    echo systemd
}

install() {
    inst_simple "$moddir/10-default-env-godebug.conf" \
        "/etc/systemd/system.conf.d/10-default-env-godebug.conf"
}
sh-4.4# exit
exit
sh-4.2# exit
exit

Removing debug pod ...

Entering emergency mode. Exit the shell to continue.
Type "journalctl" to view system logs.
You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or /boot
after mounting them and attach it to a bug report.

:/#
:/#
:/# env
DRACUT_SYSTEMD=1
rflags=
INVOCATION_ID=1ab6d4613bad44678bcc88fba29c164f
hook=emergency
PWD=/
root=
fstype=auto
HOME=/
JOURNAL_STREAM=9:13127
UDEVVERSION=239
hookdir=/lib/dracut/hooks
NEWROOT=/sysroot
DEBUG_MEM_LEVEL=0
action=Boot
TERM=vt220
GODEBUG=x509ignoreCN=0
SHLVL=1
RD_DEBUG=no
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
PS1=:${PWD}#
_rdshell_name=dracut
_=/usr/bin/env
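The manual check of the env output above could also be expressed programmatically. This is only a sketch under the assumption that GODEBUG is a comma-separated list of key=value pairs; the function name godebugHas is hypothetical and is not part of Ignition or RHCOS.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// godebugHas reports whether a GODEBUG-style string (comma-separated
// key=value pairs) contains an entry with exactly the given key and value.
// Hypothetical helper for illustration.
func godebugHas(godebug, key, want string) bool {
	for _, entry := range strings.Split(godebug, ",") {
		parts := strings.SplitN(strings.TrimSpace(entry), "=", 2)
		if len(parts) == 2 && parts[0] == key && parts[1] == want {
			return true
		}
	}
	return false
}

func main() {
	// Reads the variable from the current process environment, which is
	// where a systemd DefaultEnvironment= setting would end up for services.
	ok := godebugHas(os.Getenv("GODEBUG"), "x509ignoreCN", "0")
	fmt.Println("x509ignoreCN=0 set:", ok)
}
```

Run from a process started by systemd inside the initramfs, a result of true would correspond to the GODEBUG=x509ignoreCN=0 line seen in the emergency shell.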
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633