Red Hat Bugzilla – Bug 1477787
oci-umount RPM prevents /var/lib/docker/containers from being mounted in fluentd pods
Last modified: 2017-08-15 09:08:02 EDT
On a recently upgraded OpenShift 3.5 cluster, the oci-umount package was installed (oci-umount.x86_64 2:1.12.6-48.git0fdc778.el7 @rhel-7-server-extras-rpms), which by default contains "/var/lib/docker/containers" in its /etc/oci-umount.conf file, effectively preventing the fluentd pods from collecting docker logs when the json-file driver option is in use.
We used this playbook to fix the cluster on the fly.
The expectation of oci-umount was that it would unmount everything under /var/lib/docker/containers/* but since the container is run with
/var/lib/docker itself is a mount point. Thus the volume itself is being unmounted, and is in turn ruining at least 1 of the 2 use cases which were the entire point: to allow our logging to work.
I misspoke in #11. It is
/var/lib/docker/containers/ is a bind mount. Even if oci-umount unmounts it (to remove shm mount points underneath it), fluentd containers should still be able to access json based log files present in /var/lib/docker/containers/ directory.
We designed oci-umount so that the fluentd container could still access logs. So there is more to the story. Can somebody please explain what exactly is going on?
"Even if oci-umount unmounts it [snip], fluentd containers should still be able to access json based log files present in [it]"
That seems logically inconsistent. This is easy to reproduce:
# rpm -q oci-umount
# cat /etc/oci-umount.conf
# docker run -ti --rm -v /var/lib/docker/containers:/var/lib/docker/containers fedora ls /var/lib/docker/containers/
# docker run -ti --rm -v /var/lib/docker:/var/lib/docker fedora ls /var/lib/docker/containers/
oci-umount is unmounting the bind mount, so we cannot get to the json files inside the bind mount.
Ok, got it. So in that case fluentd can mount /var/lib/docker and things should work?
-v /var/lib/docker:/var/lib/docker and oci-umount will leave /var/lib/docker mountpoint in place and unmount /var/lib/docker/containers?
Yes, it is possible to work around this regression. However, since fluentd has been running successfully for months (years) and is already in production at numerous customer sites, this regression breaks existing functional systems.
What I don't understand is that they supposedly tested oci-umount and gave us the patch. I believe my patch will not work, since I misunderstood the way we were handling the umount. The current code unmounts the mount points listed in /etc/oci-umount.conf wherever they are mounted in the container. The way we are unmounting them also takes out all submounts under these mount points.
Since /var/lib/docker/containers is not usually a mount point, we actually made it into one, just so we could get rid of the "ROOTPATH/var/lib/docker/containers/*/shm". This is the way oci-umount was designed.
(In reply to Eric Paris from comment #7)
> Yes, it is possible to work around this regression. However, since fluentd
> has been running successfully for months (years) and is already in
> production at numerous customer sites, this regression breaks existing
> functional systems.
Ok, I have proposed a PR to solve this issue.
Now if one adds a suffix "/*" to a path in /etc/oci-umount.conf, then only submounts of that mount will be unmounted. So in this case /var/lib/docker/containers/* is specified by default, and only submounts of /var/lib/docker/containers/ will go away, while /var/lib/docker/containers/ itself will remain in the container.
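To make the intended semantics concrete, here is a rough sketch of the selection rule, not the actual oci-umount code. The function name select_umount_targets is made up for illustration, and it assumes the container-side mount points are the same paths as the host paths listed in the config (as in the -v examples earlier in this bug). It reads mount points on stdin and prints the ones that would be unmounted for a single config entry.

```shell
#!/bin/sh
# Hypothetical sketch of the "/*" rule; NOT the real oci-umount implementation.
# Usage: select_umount_targets ENTRY < list-of-mount-points
select_umount_targets() {
    entry="$1"
    case "$entry" in
        */\*)
            # Entry ends in "/*": only submounts are candidates.
            base="${entry%/\*}"
            submounts_only=1
            ;;
        *)
            # Plain entry: the mount itself and everything under it.
            base="${entry%/}"
            submounts_only=0
            ;;
    esac
    while IFS= read -r mp; do
        case "$mp" in
            "$base")
                [ "$submounts_only" -eq 0 ] && printf '%s\n' "$mp"
                ;;
            "$base"/*)
                printf '%s\n' "$mp"
                ;;
        esac
    done
    return 0
}

# Example with a made-up container directory name "abc":
printf '%s\n' \
    /var/lib/docker/containers \
    /var/lib/docker/containers/abc/shm \
    /etc/hosts |
    select_umount_targets '/var/lib/docker/containers/*'
```

With the "/*" entry only the shm submount is selected, so the bind mount that fluentd needs stays in place. Passing the plain entry instead selects both lines, which matches the behaviour reported in this bug.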
I will also clone this bug so that fluentd can move to volume mounting /var/lib/docker/ instead of /var/lib/docker/containers in future.
We also need to figure out how to synchronize our testing efforts. The oci-umount work was finished quite some time back, and I was under the impression that by now fluentd had been tested and things were working fine. I would prefer to catch these kinds of regressions early. Not sure how to do that though.
The PR mentioned in comment 9 has been merged now. I have requested a new build from lokesh.
But I think this will solve the issue only on freshly installed systems or on systems which are being upgraded from a pre-oci-umount version. Anything which already has oci-umount will have the old /etc/oci-umount.conf, and the upgrade will not replace it with the new file. That means the new package will continue to unmount /var/lib/docker/containers itself (and not just its submounts).
For such configurations, we will have to manually add "/*" at the end of /var/lib/docker/containers in /etc/oci-umount.conf.
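One possible way to script that manual step is sketched below. The helper name fix_oci_umount_conf is hypothetical, and the demo runs against a temporary file rather than the real /etc/oci-umount.conf; it assumes the entry sits alone on its line, as a bare path. Lines that already end in "/*" are left untouched, so running it twice should be harmless.

```shell
#!/bin/sh
# Hypothetical helper: append "/*" to a bare /var/lib/docker/containers entry
# in an oci-umount.conf-style file. A backup is kept as FILE.bak.
fix_oci_umount_conf() {
    conf="$1"
    # Match only the bare path (optional trailing slash or whitespace);
    # an entry that already ends in "/*" does not match and is not changed.
    sed -i.bak \
        -e 's|^/var/lib/docker/containers/\{0,1\}[[:space:]]*$|/var/lib/docker/containers/*|' \
        "$conf"
}

# Demo on a scratch copy; the second line is a made-up unrelated entry.
demo=$(mktemp)
printf '%s\n' '/var/lib/docker/containers' '/some/other/path' > "$demo"
fix_oci_umount_conf "$demo"
cat "$demo"
rm -f "$demo" "$demo.bak"
```

On a real host this would be run as root against /etc/oci-umount.conf, after checking the .bak copy policy fits local change management.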