Bug 1477787 - oci-umount RPM prevents /var/lib/docker/containers from being mounted in fluentd pods [NEEDINFO]
Status: ASSIGNED
Product: OpenShift Container Platform
Classification: Red Hat
Component: Containers
Version: 3.5.0
Hardware: All
OS: All
Priority: unspecified
Severity: high
Assigned To: Vivek Goyal
QA Contact: DeShuai Ma
Keywords: OpsBlocker
Depends On:
Blocks:
Reported: 2017-08-02 21:23 EDT by Peter Portante
Modified: 2017-08-15 09:08 EDT (History)

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
imcleod: needinfo? (pportant)


Description Peter Portante 2017-08-02 21:23:01 EDT
On a recently upgraded OpenShift 3.5 cluster, the oci-umount package was installed (oci-umount.x86_64  2:1.12.6-48.git0fdc778.el7  @rhel-7-server-extras-rpms). By default, its /etc/oci-umount.conf file contains "/var/lib/docker/containers", which effectively prevents the fluentd pods from collecting docker logs when the json-file log driver is in use.

We used this playbook [1] to fix the cluster on the fly.

[1] https://github.com/openshift/openshift-ansible-ops/pull/2941
Comment 1 Eric Paris 2017-08-03 09:14:45 EDT
The expectation of oci-umount was that it would unmount everything under /var/lib/docker/containers/*, but since the container is run with

-v /var/lib/docker:/var/lib/docker 

/var/lib/docker itself is a mount point. Thus the volume itself is being unmounted, and is in turn ruining at least 1 of the 2 use cases which were the entire point: to allow our logging to work.
Comment 2 Eric Paris 2017-08-03 09:24:10 EDT
I misspoke in comment 1. The container is actually run with

-v /var/lib/docker/containers:/var/lib/docker/containers
Comment 4 Vivek Goyal 2017-08-07 11:30:02 EDT
/var/lib/docker/containers/ is a bind mount. Even if oci-umount unmounts it (to remove shm mount points underneath it), fluentd containers should still be able to access json based log files present in /var/lib/docker/containers/ directory.

We designed oci-umount so that the fluentd container could still access the logs, so there is more to the story. Can somebody please explain what exactly is going on?
Comment 5 Eric Paris 2017-08-07 11:53:48 EDT
"Even if oci-umount unmounts it [snip], fluentd containers should still be able to access json based log files present in [it]"

That seems logically inconsistent. This is easy to reproduce:

# rpm -q oci-umount
oci-umount-1.13.1-20.git27e468e.fc26.x86_64
# cat /etc/oci-umount.conf 
/var/lib/docker/overlay2
/var/lib/docker/overlay
/var/lib/docker/devicemapper
/var/lib/docker/containers
/var/lib/docker-latest/overlay2
/var/lib/docker-latest/overlay
/var/lib/docker-latest/devicemapper
/var/lib/docker-latest/containers

# docker run -ti --rm -v /var/lib/docker/containers:/var/lib/docker/containers fedora ls /var/lib/docker/containers/
[nothing]

# docker run -ti --rm -v /var/lib/docker:/var/lib/docker fedora ls /var/lib/docker/containers/
b4c66ccbfbbd4b75e9aa8ad0c30f164a0a4730adad40bc50aa4cd77f771d8918

oci-umount is unmounting the bind mount itself, so we cannot get to the json files inside the bind mount.
Comment 6 Vivek Goyal 2017-08-07 12:04:27 EDT
Ok, got it. So in that case fluentd can mount /var/lib/docker and things should work?

-v /var/lib/docker:/var/lib/docker and oci-umount will leave /var/lib/docker mountpoint in place and unmount /var/lib/docker/containers?
Comment 7 Eric Paris 2017-08-07 12:49:24 EDT
Yes, it is possible to work around this regression. However, since fluentd has been running successfully for months (years) and is already in production at numerous customers' sites, this regression breaks existing functional systems.
Comment 8 Daniel Walsh 2017-08-07 13:11:30 EDT
What I don't understand is that they supposedly tested oci-umount and gave us the patch.  I believe my patch will not work, since I misunderstood the way we were handling the umount.  The current code umounts the mount points listed in /etc/oci-umount.conf wherever they are mounted in the container.  The way we are umounting them also takes out all submounts under these mount points.

Since /var/lib/docker/containers is not usually a mount point, we actually made it into one, just so we could get rid of the "ROOTPATH/var/lib/docker/containers/*/shm".  This is the way oci-umount was designed.
Comment 9 Vivek Goyal 2017-08-08 09:54:15 EDT
(In reply to Eric Paris from comment #7)
> Yes, it is possible to work around this regression. However since fluentd
> has been running successfully for months (years) and is already in
> production and numerous customer's sites, this regression breaks existing
> functional systems.

Ok, I have proposed a PR to solve this issue.

https://github.com/projectatomic/oci-umount/pull/15

Now if one adds the suffix "/*" to a path in /etc/oci-umount.conf, only the submounts of that mount will be unmounted. So in this case /var/lib/docker/containers/* is specified by default, and only the submounts of /var/lib/docker/containers/ will go away while /var/lib/docker/containers/ itself stays mounted in the container.
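
The difference between a plain entry and a "/*" entry can be illustrated with a small Python model. This is only a sketch of the semantics described in this comment, not the oci-umount implementation: the function name mounts_to_unmount and the container ID "abc123" are made up for illustration, and it assumes the configured path appears at the same path inside the container.

```python
from pathlib import PurePosixPath


def mounts_to_unmount(config_entries, container_mounts):
    """Return the container mount points this simplified model would unmount.

    Hypothetical helper: a plain entry selects the mount itself plus its
    submounts, while an entry ending in "/*" selects only the submounts.
    """
    selected = set()
    for entry in config_entries:
        submounts_only = entry.endswith("/*")
        base = PurePosixPath(entry[:-2] if submounts_only else entry)
        for mount in container_mounts:
            path = PurePosixPath(mount)
            is_submount = base in path.parents
            if is_submount or (path == base and not submounts_only):
                selected.add(mount)
    return sorted(selected)


# Assumed mount table of a fluentd-style container ("abc123" is a placeholder ID).
mounts = [
    "/var/lib/docker/containers",
    "/var/lib/docker/containers/abc123/shm",
]

# Old default entry: the bind mount itself is selected, so the logs disappear.
print(mounts_to_unmount(["/var/lib/docker/containers"], mounts))

# New default entry: only the shm submount is selected.
print(mounts_to_unmount(["/var/lib/docker/containers/*"], mounts))
```

In this model the old entry selects both mounts, while the "/*" entry selects only the shm submount and leaves /var/lib/docker/containers in place.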

I will also clone this bug so that fluentd can move to volume mounting /var/lib/docker/ instead of /var/lib/docker/containers in future.

We also need to figure out a way to synchronize our testing efforts. The oci-umount work was finished quite some time back, and I was under the impression that by now fluentd had been tested and things were working fine. I would prefer to catch these kinds of regressions early. Not sure how to do that though.
Comment 10 Vivek Goyal 2017-08-15 08:34:27 EDT
The PR mentioned in comment 9 has been merged now. I have asked Lokesh for a new build.

But I think this will solve the issue only on freshly installed systems or on systems that are upgraded from a version without oci-umount. Anything that already has oci-umount will have the old /etc/oci-umount.conf, and the upgrade will not replace it with the new file. That means the new package will continue to unmount /var/lib/docker/containers itself (and not just its submounts).

For such configurations, we will have to do the manual operation of adding "/*" at the end of /var/lib/docker/containers in /etc/oci-umount.conf.
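
A minimal Python sketch of that manual edit follows, assuming the config file is a plain list of one path per line as shown in comment 5. The helper name add_submount_suffix is hypothetical, and the sketch only transforms text; writing the result back to /etc/oci-umount.conf is left to the administrator.

```python
def add_submount_suffix(config_text, target="/var/lib/docker/containers"):
    """Append "/*" to lines that exactly match target; leave other lines as-is.

    Hypothetical helper. Lines that already end in "/*" do not match the
    target exactly, so running it twice gives the same result.
    """
    updated = []
    for line in config_text.splitlines():
        if line.strip() == target:
            updated.append(target + "/*")
        else:
            updated.append(line)
    return "\n".join(updated) + "\n"


# Shortened example of an old-style config file.
old_conf = "/var/lib/docker/devicemapper\n/var/lib/docker/containers\n"
print(add_submount_suffix(old_conf), end="")
```

The same function could be called with a different target if other entries in the file need the same treatment.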
