Description of problem:
When deploying a multi-node cluster on all bare-metal nodes, MCP adds an additional crio drop-in that circumvents the container-mount-namespace drop-in. As a result, crio lives in the base namespace while kubelet is in the hidden namespace, and all containers fail to get any kubelet-mounted filesystems (such as secrets and tokens).

Version-Release number of selected component (if applicable):
Affects all OpenShift versions 4.5 and later in conjunction with the container-mount-namespace workaround originally released in cnf-features-deploy 4.8, but only if MCP's ControllerConfig platform is "baremetal" or "vsphere".

How reproducible:
100% on affected platforms

Steps to Reproduce:
1. Deploy OpenShift 4.5 or later, multi-node, with the baremetal or vsphere platform
2. Add the container-mount-namespace workaround from cnf-features-deploy 4.8

Actual results:
Crio is running in a different mount namespace from kubelet, so containers started by crio do not see secrets or tokens mounted by kubelet.

Expected results:
Crio and kubelet must be in the same mount namespace so that containers started by crio can see secrets and tokens mounted by kubelet.
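For anyone trying to confirm whether a node is affected, here is a minimal sketch of a check that compares the mount namespaces of two processes via the /proc/<pid>/ns/mnt symlink. The helper names mntns_of and same_mntns are made up for illustration, and the process names passed to pidof in the usage comment are assumptions about how the daemons appear on the node.

```shell
#!/bin/sh
# Print the mount-namespace identifier of a PID.
# The kernel exposes it as a symlink target of the form "mnt:[<inode>]".
mntns_of() {
    readlink "/proc/$1/ns/mnt"
}

# Report whether two PIDs share a mount namespace.
# Returns non-zero if either namespace link cannot be read.
same_mntns() {
    ns_a=$(mntns_of "$1") || return 1
    ns_b=$(mntns_of "$2") || return 1
    if [ "$ns_a" = "$ns_b" ]; then
        echo "same"
    else
        echo "different"
    fi
}

# Usage on a node, as root (process names are assumptions):
#   same_mntns "$(pidof crio)" "$(pidof kubelet)"
```

On an affected node this would be expected to report "different". Reading another process's namespace link generally requires root.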
It turns out that the conflicting drop-in owned by MCO for the baremetal and vsphere platforms is redundant and unneeded. CRI-O already respects the environment variable that is being set, without any need for editing the command line in this drop-in. I've opened a PR to remove the unneeded drop-ins here: https://github.com/openshift/machine-config-operator/pull/2858 In addition, I will open a second PR to cnf-features-deploy that will solve this from the other end, making our drop-in compatible with MCO even if it is still applying its drop-in.
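To see which drop-ins touch the crio command line on a given node, something like the following sketch may help. It prints the ExecStart= lines from each drop-in file in a directory, in file-name order. The function name show_execstart_overrides is hypothetical, and the default directory is an assumption based on the standard systemd drop-in location; the actual file names used by MCO and cnf-features-deploy are not shown here.

```shell
#!/bin/sh
# Print every ExecStart= line found in the *.conf drop-ins of a directory.
# Files are visited in the shell's sorted glob order, which matches the
# file-name ordering systemd uses within a single drop-in directory.
# $1: drop-in directory (the default below is an assumption)
show_execstart_overrides() {
    dir=${1:-/etc/systemd/system/crio.service.d}
    for f in "$dir"/*.conf; do
        # Skip the unexpanded pattern when the directory has no drop-ins.
        [ -e "$f" ] || continue
        # -H prefixes each match with its file name; no match is not an error.
        grep -H '^ExecStart=' "$f" || true
    done
}

# Usage on a node:
#   show_execstart_overrides
```

This only looks at one directory. `systemctl cat crio.service` shows the unit together with all of its drop-ins as systemd sees them.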
QE Verified fixed. MachineConfig looks good. CRI-O and kubelet have the same mount namespace:

[core@helix16 ~]$ cat /proc/62743/mountinfo |grep -i namespace
331 330 0:24 /container-mount-namespace /run/container-mount-namespace rw,nosuid,nodev shared:188 - tmpfs tmpfs rw,seclabel,mode=755
[core@helix16 ~]$ cat /proc/10745/mountinfo |grep -i namespace
331 330 0:24 /container-mount-namespace /run/container-mount-namespace rw,nosuid,nodev shared:188 - tmpfs tmpfs rw,seclabel,mode=755
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.10.22 extras update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:5514