Bug 1477787 - oci-umount RPM prevents /var/lib/docker/containers from being mounted in fluentd pods
Summary: oci-umount RPM prevents /var/lib/docker/containers from being mounted in flue...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: docker
Version: 7.4
Hardware: All
OS: All
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: 7.4
Assignee: Vivek Goyal
QA Contact: atomic-bugs@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1478821
 
Reported: 2017-08-03 01:23 UTC by Peter Portante
Modified: 2021-06-10 12:45 UTC (History)

Fixed In Version: docker-1.12.6-55.gitc4618fb.el7_4
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-09-05 10:35:14 UTC
Target Upstream Version:
Embargoed:

Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 3167371 0 None None None 2017-09-06 06:26:50 UTC
Red Hat Product Errata RHBA-2017:2599 0 normal SHIPPED_LIVE docker bug fix and enhancement update 2017-09-05 14:32:57 UTC

Description Peter Portante 2017-08-03 01:23:01 UTC
On a recently upgraded OpenShift 3.5 cluster, the oci-umount package was installed (oci-umount.x86_64  2:1.12.6-48.git0fdc778.el7  @rhel-7-server-extras-rpms). By default its /etc/oci-umount.conf file contains "/var/lib/docker/containers", which effectively prevents the fluentd pods from collecting docker logs when the json-file driver option is in use.

We used this playbook [1] to fix the cluster on the fly.

[1] https://github.com/openshift/openshift-ansible-ops/pull/2941

Comment 1 Eric Paris 2017-08-03 13:14:45 UTC
The expectation of oci-umount was that it would unmount everything under /var/lib/docker/containers/*  but since the container is run with

-v /var/lib/docker:/var/lib/docker 

/var/lib/docker itself is a mount point. Thus the volume itself is being unmounted, and is in turn ruining at least 1 of the 2 use cases which were the entire point: to allow our logging to work.

Comment 2 Eric Paris 2017-08-03 13:24:10 UTC
I misspoke in comment 1. It is

-v /var/lib/docker/containers:/var/lib/docker/containers

Comment 4 Vivek Goyal 2017-08-07 15:30:02 UTC
/var/lib/docker/containers/ is a bind mount. Even if oci-umount unmounts it (to remove shm mount points underneath it), fluentd containers should still be able to access json based log files present in /var/lib/docker/containers/ directory.

We designed oci-umount so that the fluentd container could still access the logs, so there is more to the story. Can somebody please explain what exactly is going on?

Comment 5 Eric Paris 2017-08-07 15:53:48 UTC
"Even if oci-umount unmounts it [snip], fluentd containers should still be able to access json based log files present in [it]"

That seems logically inconsistent. This is easy to reproduce:

# rpm -q oci-umount
oci-umount-1.13.1-20.git27e468e.fc26.x86_64
# cat /etc/oci-umount.conf 
/var/lib/docker/overlay2
/var/lib/docker/overlay
/var/lib/docker/devicemapper
/var/lib/docker/containers
/var/lib/docker-latest/overlay2
/var/lib/docker-latest/overlay
/var/lib/docker-latest/devicemapper
/var/lib/docker-latest/containers

# docker run -ti --rm -v /var/lib/docker/containers:/var/lib/docker/containers fedora ls /var/lib/docker/containers/
[nothing]

# docker run -ti --rm -v /var/lib/docker:/var/lib/docker fedora ls /var/lib/docker/containers/
b4c66ccbfbbd4b75e9aa8ad0c30f164a0a4730adad40bc50aa4cd77f771d8918

oci-umount is unmounting the bind mount, so we cannot get to the json files inside the bind mount.

Comment 6 Vivek Goyal 2017-08-07 16:04:27 UTC
Ok, got it. So in that case fluentd can mount /var/lib/docker and things should work?

-v /var/lib/docker:/var/lib/docker and oci-umount will leave /var/lib/docker mountpoint in place and unmount /var/lib/docker/containers?

Comment 7 Eric Paris 2017-08-07 16:49:24 UTC
Yes, it is possible to work around this regression. However, since fluentd has been running successfully for months (years) and is already in production at numerous customer sites, this regression breaks existing functional systems.

Comment 8 Daniel Walsh 2017-08-07 17:11:30 UTC
What I don't understand is that they supposedly tested oci-umount and gave us the patch.  I believe my patch will not work, since I misunderstood the way we were handling the umount.  The current code umounts the mount points listed in /etc/oci-umount.conf wherever they are mounted in the container.  The way we are umounting them will also take out all submounts under these mount points.

Since /var/lib/docker/containers is not usually a mount point, we actually made it into one, just so we could get rid of the "ROOTPATH/var/lib/docker/containers/*/shm".  This is the way oci-umount was designed.
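
The behaviour described in this comment can be modelled with a minimal sketch. This is an illustrative model only, not the oci-umount implementation: the function names and the container ID "abc123" are hypothetical, and the sketch assumes host and container paths are identical (as with the -v X:X volumes used in this bug).

```python
from pathlib import PurePosixPath


def is_same_or_under(mount: str, base: str) -> bool:
    """True if `mount` is `base` itself or lies somewhere below it."""
    m, b = PurePosixPath(mount), PurePosixPath(base)
    return m == b or b in m.parents


def mounts_removed(container_mounts, configured_paths):
    """Model of the original semantics: every configured path that is a
    mount point in the container is unmounted, together with all of its
    submounts."""
    return [
        m for m in container_mounts
        if any(is_same_or_under(m, p) for p in configured_paths)
    ]


# Hypothetical mount tables for the two docker invocations in comment 5.
conf = ["/var/lib/docker/containers"]

# -v /var/lib/docker/containers:/var/lib/docker/containers
narrow = ["/var/lib/docker/containers",
          "/var/lib/docker/containers/abc123/shm"]

# -v /var/lib/docker:/var/lib/docker
wide = ["/var/lib/docker",
        "/var/lib/docker/containers/abc123/shm"]

print(mounts_removed(narrow, conf))  # the bind mount itself is removed too
print(mounts_removed(wide, conf))    # only the shm submount is removed
```

In this model the narrow volume loses its own bind mount, which matches the empty listing in comment 5, while the wider /var/lib/docker volume survives.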

Comment 9 Vivek Goyal 2017-08-08 13:54:15 UTC
(In reply to Eric Paris from comment #7)
> Yes, it is possible to work around this regression. However, since fluentd
> has been running successfully for months (years) and is already in
> production at numerous customer sites, this regression breaks existing
> functional systems.

Ok, I have proposed a PR to solve this issue.

https://github.com/projectatomic/oci-umount/pull/15

Now, if one adds the suffix "/*" to a path in /etc/oci-umount.conf, only the submounts of that mount will be unmounted. In this case /var/lib/docker/containers/* is now specified by default, so only submounts of /var/lib/docker/containers/ will go away, while /var/lib/docker/containers/ itself will remain in the container.
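
A minimal sketch of the "/*" semantics described above, under the same simplifying assumptions as before: this is an illustrative model rather than the code from the PR, the function name and the container ID "abc123" are hypothetical, and host and container paths are assumed to be identical.

```python
from pathlib import PurePosixPath


def mounts_removed_with_suffix(container_mounts, configured_entries):
    """Model of the proposed semantics.

    An entry ending in "/*" removes only mounts strictly below the path.
    An entry without the suffix removes the path itself and everything
    below it.
    """
    removed = []
    for mount in container_mounts:
        m = PurePosixPath(mount)
        for entry in configured_entries:
            if entry.endswith("/*"):
                base = PurePosixPath(entry[:-2])
                hit = base in m.parents            # submounts only
            else:
                base = PurePosixPath(entry)
                hit = m == base or base in m.parents
            if hit:
                removed.append(mount)
                break
    return removed


mounts = ["/var/lib/docker/containers",
          "/var/lib/docker/containers/abc123/shm"]

old_conf = ["/var/lib/docker/containers"]
new_conf = ["/var/lib/docker/containers/*"]

print(mounts_removed_with_suffix(mounts, old_conf))  # bind mount and shm both go
print(mounts_removed_with_suffix(mounts, new_conf))  # only the shm submount goes
```

With the suffixed entry the bind mount that fluentd relies on stays in place, and only the shm submounts below it are removed.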

I will also clone this bug so that fluentd can move to volume mounting /var/lib/docker/ instead of /var/lib/docker/containers in future.

We also need to figure out a way to synchronize our testing efforts. The oci-umount work was finished quite some time back, and I was under the impression that by now fluentd had been tested and things were working fine. I would prefer to catch these kinds of regressions early. Not sure how to do that though.

Comment 10 Vivek Goyal 2017-08-15 12:34:27 UTC
PR mentioned in comment 9 has been merged now. Have requested lokesh for a new build. 

But I think this will solve the issue only on freshly installed systems, or on systems that are being upgraded from a version that predates oci-umount. Anything which already has oci-umount will have the old /etc/oci-umount.conf, and the upgrade will not replace it with the new file. That means the new package will continue to unmount /var/lib/docker/containers itself (and not just its submounts).

For such configurations, we will have to do the manual operation of adding "/*" at the end of /var/lib/docker/containers in /etc/oci-umount.conf.
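
The manual edit described above could be scripted roughly as follows. This is a sketch, not a supported tool: the function name is hypothetical, it works on the file's text rather than touching /etc directly, and by default it rewrites only the /var/lib/docker/containers entry named in this comment.

```python
def add_submount_suffix(conf_text, targets=("/var/lib/docker/containers",)):
    """Return conf_text with "/*" appended to each line that names one of
    `targets` exactly (a trailing slash is tolerated). Lines that already
    carry the suffix, and all other lines, are left unchanged."""
    out = []
    for line in conf_text.splitlines():
        stripped = line.strip().rstrip("/")
        if stripped in targets:
            out.append(stripped + "/*")
        else:
            out.append(line)
    result = "\n".join(out)
    if conf_text.endswith("\n"):
        result += "\n"
    return result


old = (
    "/var/lib/docker/overlay2\n"
    "/var/lib/docker/containers\n"
    "/var/lib/docker-latest/containers\n"
)
print(add_submount_suffix(old), end="")
```

To apply it to a real system, one would read /etc/oci-umount.conf, pass its contents through the function, and write the result back as root. Running it a second time changes nothing, because the suffixed line no longer matches a target.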

Comment 20 Rich Megginson 2017-08-23 20:04:20 UTC
I have tested this with logging and it works.  How soon can we get this into 3.7, 3.6, and 3.5?

Comment 21 Eric Paris 2017-08-23 20:15:59 UTC
If this is listed as a config file in rpm, and we fix the default, users who did not edit the file will get the new default. Users who edited the file by hand will not get the new default. This is the behavior I want.

So is it a config, or a config(noreplace) ?

Comment 22 Rich Megginson 2017-08-23 20:39:24 UTC
(In reply to Eric Paris from comment #21)
> If this is listed as a config file in rpm, and we fix the default, users who
> did not edit the file will get the new default. Users who edited the file by
> hand will not get the new default. This is the behavior I want.
> 
> So is it a config, or a config(noreplace) ?

%config(noreplace) %{_sysconfdir}/oci-umount.conf
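
As general RPM background rather than anything stated in this thread, the difference between %config and %config(noreplace) on upgrade can be sketched as a small decision function. The function name and the returned strings are hypothetical, and the sketch reflects RPM's documented handling of config files as I understand it, so check it against the RPM documentation before relying on it.

```python
def config_upgrade_action(old_default, on_disk, new_default, noreplace):
    """Model of what an RPM upgrade does with a packaged config file.

    old_default: file content shipped by the installed package
    on_disk:     file content currently on the system
    new_default: file content shipped by the new package
    noreplace:   True for %config(noreplace), False for plain %config
    """
    if on_disk == new_default:
        return "keep"                      # already identical to the new file
    if on_disk == old_default:
        return "replace"                   # untouched by the admin: take the new default
    if old_default == new_default:
        return "keep"                      # default did not change: keep local edits
    # The admin edited the file and the packaged default also changed.
    if noreplace:
        return "keep, write .rpmnew"       # local file wins
    return "replace, save .rpmsave"        # new file wins, local copy is backed up


OLD = "/var/lib/docker/containers\n"
NEW = "/var/lib/docker/containers/*\n"
EDITED = "/var/lib/docker/containers\n/some/local/path\n"  # hypothetical local edit

print(config_upgrade_action(OLD, OLD, NEW, noreplace=True))      # untouched file
print(config_upgrade_action(OLD, EDITED, NEW, noreplace=True))   # edited file
print(config_upgrade_action(OLD, EDITED, NEW, noreplace=False))  # edited, plain %config
```

Under this model an untouched /etc/oci-umount.conf picks up the new default on upgrade, while a hand-edited one is preserved under %config(noreplace).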

Comment 23 Daniel Walsh 2017-08-24 11:46:31 UTC
Eric is right.  I think every user will get the update; I doubt any users have actually modified this file.

Comment 32 Rich Megginson 2017-08-24 18:35:14 UTC
(In reply to Vivek Goyal from comment #29)
> (In reply to Rich Megginson from comment #27)
> > > Was /etc/oci-umount.conf untouched or modified before the upgrade? I just tested
> > > this on F26 and upgrading oci-umount package worked. It now has new
> > > /etc/oci-umount.conf which has "/var/lib/docker/containers/*"
> > 
> > I don't know.  I didn't personally touch or edit /etc/oci-umount.conf before
> > the upgrade.  Perhaps openshift-ansible does?
> > 
> > At any rate, I guess the consensus is that this should already be handled by
> > rpm upgrade of the oci-umount package.
> 
> What version of docker are you testing with?

-54

> Looks like in -54, we went back
> to old oci-umount. So if you are testing with -54, things will not work.

Silly me, thinking that testing with -54 would be as good as testing with -52 . . .
At any rate, once I changed /etc/oci-umount.conf to use "/var/lib/docker/containers/*", fluentd worked fine.

Comment 38 errata-xmlrpc 2017-09-05 10:35:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2599

