Description of problem:
The containerized installation of atomic-openshift-node does not mount the required volumes for flexvolume to work, that is:
- /etc/kubernetes
- /usr/libexec/kubernetes/kubelet-plugins/volume/exec/<vendor>~<driver>/<driver> [1]

Also, the containerized atomic-openshift-node creates the volumes in /var/lib/origin without sharing them with the host (rprivate mount). To fix this, the mount option needs to be shared. [2]

[1] https://docs.openshift.com/container-platform/3.9/install_config/persistent_storage/persistent_storage_flex_volume.html
[2] https://docs.docker.com/storage/bind-mounts/#configure-bind-propagation

Version-Release number of selected component (if applicable):
3.9 containerized

Steps to Reproduce:
1. Install a containerized cluster.
2. Try to provision flexvolumes.

Actual results:
Error logs "Unable to mount volumes for pod X."

Expected results:
atomic-openshift-node is able to mount flexvolume PVs.
For flex plugins, the mount is handled by the plugin itself. When the kubelet is containerized, the plugin should use a mechanism to mount the volume in the host's namespace rather than mounting it directly inside the container that runs the kubelet. For in-tree plugins that ship with OpenShift, this is typically achieved with nsenter (https://github.com/jpetazzo/nsenter), and a typical mount invocation looks like:

nsenter --mount=/rootfs/proc/1/ns/mnt -- /bin/systemd-run --description=... --scope -- /bin/mount -t <type> <what> <where>

Since the mount happens on the host, and any RHEL or recent OS with systemd has "/" mounted as "shared", the new mount point inherits the propagation property and becomes visible. The key point is that for flex volumes the atomic-openshift-node process is not creating the volumes at all; they are supposed to be created by the flex plugin, and that is why the plugin should be fixed to support doing containerized mounts.
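A minimal sketch of how a plugin could build that invocation (the helper name, argument layout, and the /mnt/target example path are illustrative assumptions, not the real flexvolume driver call-out API; a real plugin would exec the resulting command and report status in flexvolume JSON):

```shell
#!/bin/sh
# Hypothetical helper for a containerized flex plugin: build the nsenter
# command from the invocation shown above. It assumes the host root is
# bind-mounted at /rootfs inside the kubelet container.
flex_mount_cmd() {
    # $1 = filesystem type, $2 = what to mount, $3 = where to mount it
    printf '%s' "nsenter --mount=/rootfs/proc/1/ns/mnt -- /bin/systemd-run --description=flex-mount --scope -- /bin/mount -t $1 $2 $3"
}

# Example: the command a plugin would run for an NFS-style mount
# (server and target path are made-up placeholders).
flex_mount_cmd nfs server:/export /mnt/target
```

Because the mount runs in the host's mount namespace (entered via PID 1), the shared propagation of "/" takes effect and the mount point becomes visible to the pod.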
Also, I just wanted to check whether /var/lib/origin itself is bind-mounted or similar. Can you post the output of:

findmnt -o +PROPAGATION
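For background, the propagation data findmnt reports comes from /proc/self/mountinfo; a minimal sketch of reading it directly (checking "/" here as an example):

```shell
# Print the mount point and its optional propagation field from
# /proc/self/mountinfo. Field 7 holds tags like "shared:1" or "master:2";
# a bare "-" means there are no optional fields, i.e. the mount is private.
# findmnt -o +PROPAGATION formats this same data.
awk '$5 == "/" { print $5, $7; exit }' /proc/self/mountinfo
```

On a stock systemd host this prints a "shared:N" tag for "/", which is the propagation property the nsenter-based mounts rely on.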
The thing is, with a containerized kubelet, the kubelet is not able to find any installed plugin, hence flexvolume doesn't work. The volume plugin must be installed in the host's directory /usr/libexec/kubernetes/kubelet-plugins/volume/exec/, and that directory is not visible from within the container. AFAIK there is no possibility of tuning the kubelet container to include the plugin inside it, and it would be painful for any customer to maintain their own kubelet images. To make the directory visible, the atomic-openshift-node container must be patched to run with:

# make the plugin dir visible within the container
--volume=/usr/libexec/kubernetes/kubelet-plugins/volume/exec:/usr/libexec/kubernetes/kubelet-plugins/volume/exec:ro

# make /etc/kubernetes visible
--volume=/etc/kubernetes:/etc/kubernetes:ro

# make /var/lib/origin shared so the kubelet container can share the created volumes with the host, and the other containers can mount those volumes
-v /var/lib/origin:/var/lib/origin:rshared

(need to use rshared instead of rslave to propagate the mount)
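Put together, the flags above could be collected like this (a sketch only; the node image name and the rest of the installer's docker run options are not shown, since they stay whatever the installer already uses):

```shell
#!/bin/sh
# Sketch: extra bind-mount flags for the atomic-openshift-node container.
NODE_MOUNT_FLAGS="\
--volume=/usr/libexec/kubernetes/kubelet-plugins/volume/exec:/usr/libexec/kubernetes/kubelet-plugins/volume/exec:ro \
--volume=/etc/kubernetes:/etc/kubernetes:ro \
--volume=/var/lib/origin:/var/lib/origin:rshared"

# A real invocation would splice these into the existing command:
#   docker run <existing flags> $NODE_MOUNT_FLAGS <node-image>
echo "$NODE_MOUNT_FLAGS"
```

The first two mounts can be read-only; only /var/lib/origin needs rshared so mounts created there propagate back to the host and into other containers.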
Yes, the comment about plugin installation being broken when atomic-openshift-node is containerized is accurate. I am yet to test that case, but openshift-ansible likely needs to be fixed to handle flex bootstrapping when openshift-node is containerized. This is bug #1, and engineering will make a fix in openshift-ansible. I was mostly responding to bug #2: when the flex plugin running inside the kubelet mounts the volume, the volume does not become visible inside the pod. The customer has worked around this by mounting /var/lib/origin as shared, but it can also be fixed in the MapR flex plugin, so that rather than calling a plain "mount", it performs the mount via nsenter from within the host's namespace. This will make sure that the mounted volume is visible inside the pod without the need to mount /var/lib/origin as shared.
I have opened https://github.com/kubernetes/kubernetes/pull/65549 to fix the mount/unmount behaviour in the kubelet. Also, https://github.com/openshift/openshift-ansible/pull/8773 was merged to fix flexvolume installation on OpenShift nodes. I will backport the Kubernetes core fix once it is merged upstream.
Let's keep this in POST for now, because we are still waiting on the upstream change to be merged.
k8s-merge-robot merged commit 3155ea2 into kubernetes:master.
We need a doc PR to document the location of the flexvolume plugin directory for the containerized kubelet.
Verified this is fixed in v3.9.40.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2335