Bug 1427820 - systemd container umounts local mounts(of host) when /dev exported.
Summary: systemd container umounts local mounts(of host) when /dev exported.
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: systemd
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: systemd-maint
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Depends On:
Blocks: 1427823
Reported: 2017-03-01 10:21 UTC by Mohamed Ashiq
Modified: 2020-06-16 14:10 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-06-16 14:10:30 UTC
Target Upstream Version:
Embargoed:



Description Mohamed Ashiq 2017-03-01 10:21:51 UTC
Description of problem:
Local mounts of the host are unmounted when /dev is exported to a systemd container. This behavior is not seen in all cases; it requires /var to be on a separate filesystem (not the filesystem used for /).

Version-Release number of selected component (if applicable):
systemd in the RHEL 7.3 base image

How reproducible:
Always, when /var is on a separate filesystem.

Steps to Reproduce:
Have a separate filesystem mounted on /var, then run:

# docker run -v /dev:/dev --privileged <any init/systemd container> 

Actual results:
The host starts unmounting all of its local mounts.

Expected results:
The container should start without any issue.

Additional info:
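The precondition above (host /var on a different filesystem than /) can be checked before attempting to reproduce. The following is only a minimal sketch, not part of the original report: the helper name `is_separate_fs` is hypothetical, and it assumes GNU `stat`, comparing the device number of each path with that of /. Note that some setups (for example btrfs subvolumes) also report distinct device numbers.

```shell
# Hypothetical helper (not from the report): succeeds when the given path
# lives on a different filesystem than /.
is_separate_fs() {
    # stat -c %d prints the device number of the filesystem holding the path
    # (GNU coreutils syntax, assumed here).
    [ "$(stat -c %d -- "$1")" != "$(stat -c %d /)" ]
}

# Report the layout for the directories mentioned in this bug.
for dir in /var /var/log /home /boot; do
    [ -e "$dir" ] || continue
    if is_separate_fs "$dir"; then
        echo "$dir: separate filesystem"
    else
        echo "$dir: same filesystem as /"
    fi
done
```

If /var is reported as being on the same filesystem as /, the report suggests the problem will not show up on that host.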

Comment 2 Mohamed Ashiq 2017-03-01 10:27:19 UTC
My setup has /var/, /var/log/, /home/, /boot/, swap and / (root) as separate filesystems.

When I tried with a plain systemd container, I hit the agetty issue we had faced before: it consumed all the memory and I lost the machine, so I could not retrieve anything from it. I then created an image with systemd and getty masked.
Things I noticed:

* When the pod starts, local mounts start unmounting.

* The issue is reproducible even with free space left on all the filesystems.

* No service is failing.

* I tried a few local mounts on another setup that had no separate /var or /var/log: I mounted a small LV on /home. Starting the container did not cause any issue there.

Things found out:

While mounting takes place in the container, systemd sends SIGTERM to systemd-journald. That is the exact point at which the node starts unmounting. There is also a service that runs only in these situations and looks suspicious: systemd-tmpfiles-setup-dev.service (Started Create Static Device Nodes in /dev).
In some cases, I saw SIGTERM (15) sent from systemd to all the services on the host.

In the usual test setup, which has no local mounts for /var and /var/log, no SIGTERM is sent to systemd-journald. I hope this is helpful.
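One way to confirm exactly which host mounts disappear is to snapshot the host's mount points before and after starting the container and compare the two lists. This is a sketch under assumptions, not something taken from this comment: the function names `list_mount_points` and `lost_mounts` and the /tmp file names are hypothetical, and it relies only on the documented layout of /proc/self/mountinfo, where field 5 is the mount point.

```shell
# Print the mount points (field 5) from a mountinfo file, sorted and unique.
# Defaults to the calling process's own view of the mount table.
list_mount_points() {
    awk '{ print $5 }' "${1:-/proc/self/mountinfo}" | sort -u
}

# Print mount points present in the first snapshot but missing from the second.
# Both inputs must be sorted, which list_mount_points guarantees.
lost_mounts() {
    comm -23 "$1" "$2"
}

# Intended use on the host (left as comments so the sketch has no side effects):
#   list_mount_points > /tmp/mounts.before
#   ... start the container ...
#   list_mount_points > /tmp/mounts.after
#   lost_mounts /tmp/mounts.before /tmp/mounts.after
```

Any path printed by `lost_mounts` was mounted on the host before the container started and is gone afterwards.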

Comment 5 Ju Lim 2017-03-01 15:12:17 UTC
This is part of the Container Native Storage (CNS) initiative for OpenShift.  For any kind of storage (e.g. Gluster or Ceph) which has udev dependencies, resolving this issue is important.

Comment 6 Michal Sekletar 2017-03-02 08:51:10 UTC
(In reply to Ju Lim from comment #5)
> This is part of the Container Native Storage (CNS) initiative for OpenShift.
> For any kind of storage (e.g. Gluster or Ceph) which has udev dependencies,
> resolving this issue is important.

Please point us to docker images you are using in your Kubernetes cluster. If we are to do something about this problem we need to reproduce it locally first.

Note that I found this repo that looks related,

https://github.com/gluster/gluster-kubernetes

Can I follow the "Quickstart" section from there to set this up in a Vagrant VM?

Comment 8 Mohamed Ashiq 2017-03-03 07:42:28 UTC
(In reply to Michal Sekletar from comment #6)
> (In reply to Ju Lim from comment #5)
> > This is part of the Container Native Storage (CNS) initiative for OpenShift.
> > For any kind of storage (e.g. Gluster or Ceph) which has udev dependencies,
> > resolving this issue is important.
> 
> Please point us to docker images you are using in your Kubernetes cluster.
> If we are to do something about this problem we need to reproduce it locally
> first.
> 
> Note that I found this repo that looks related,
> 
> https://github.com/gluster/gluster-kubernetes
> 
> Can I follow "Quickstart" section from there to setup this in vagrant VM?

Hi,

The issue is reproducible on plain Docker. All that is needed is:

# docker run -d --privileged --net=host -v /dev:/dev -v /sys/fs/cgroup:/sys/fs/cgroup:ro -v /var/lib/glusterd:/var/lib/glusterd:z brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-server-rhel7:3.1.3-16 

Image name: brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-server-rhel7:3.1.3-16

This is the last image without the workaround/hack. Basically, the issue is reproducible with any image that runs systemd when the host has /var on a separate filesystem.
I tested with the image above.

If you want our complete CNS setup, then yes, you are on the right path. Just use the downstream image.

The same downstream bits are available here:
http://download-node-02.eng.bos.redhat.com/rel-eng/RHGS/3.1-u3-cns-RHEL-7/latest/Server-RH-Gluster-3-Server/x86_64/os/Packages/

They are also available live.

You can follow the documentation here:
https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.1/html/container-native_storage_for_openshift_container_platform/

Thanks for taking this up. Let me know if you need anything.

Comment 9 Mohamed Ashiq 2017-03-03 08:08:10 UTC
(In reply to Mohamed Ashiq from comment #8)

Make sure you have port 24007 open on the host.

I would prefer using a Dockerfile with the following content:

#######################
FROM rhel

ENV container docker

RUN systemctl mask getty.target

CMD ["/usr/sbin/init"]

#######################

# docker run -d -v /dev:/dev --privileged --net=host -v /sys/fs/cgroup:/sys/fs/cgroup:ro <IMAGENAME> 

The host's /var should be on a separately mounted filesystem from /.

This should do it.

The systemd version in use was "systemd-219-19.el7_2.13.x86_64" when we hit the issue.

Comment 10 Daniel Walsh 2017-03-10 18:06:02 UTC
Can we run this container as a system container rather than as a container running under docker?

If we do, could we do the whole thing as a chroot rather than using any kind of container technology? Make sure that you are sharing the udev rules with the host.

The code would still need to differentiate between / and /host, but it would share everything else with the host, and would not have to deal with the container runtime or things like the mount namespace.

The only problem is that this cannot be orchestrated via k8s; it would need to be installed on each container node using something like Ansible.

Comment 11 Giuseppe Scrivano 2017-03-10 18:48:02 UTC
The container should not be able to mount/umount from /dev. Can you check which mount propagation is used for /dev?

From the host:

# cat /proc/self/mountinfo | grep " /dev "
# cat /proc/$ID_OF_THE_init_PROCESS_IN_THE_CONTAINER/mountinfo | grep " /dev "
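The propagation mode appears in the optional fields of each mountinfo line (for example `shared:N` or `master:N`); a line with no such tag describes a private mount. As a rough sketch only, the hypothetical helper `propagation_of` below extracts those fields for one mount point. It is not part of the commands above and assumes the standard mountinfo layout, where the optional fields start at field 7 and end at a lone "-".

```shell
# Print the propagation tags of one mount point from a mountinfo file.
# mountinfo fields: 1 mount ID, 2 parent ID, 3 major:minor, 4 root,
# 5 mount point, 6 mount options, then optional fields terminated by "-".
# A mount with no optional fields is reported as "private".
propagation_of() {
    awk -v mp="$1" '
        $5 == mp {
            out = ""
            for (i = 7; i <= NF && $i != "-"; i++)
                out = out (out ? " " : "") $i
            print (out ? out : "private")
        }' "${2:-/proc/self/mountinfo}"
}

# Intended use (comments only; <container> is a placeholder):
#   propagation_of /dev
#   pid=$(docker inspect --format '{{.State.Pid}}' <container>)
#   propagation_of /dev "/proc/$pid/mountinfo"
```

If the host shows `shared:N` for /dev and the container shows a matching `shared:N` or `master:N`, mount events can propagate between the two namespaces.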

Also, and this hopefully will suffice to fix your problem, --privileged is too broad; you probably don't want it. I recommend that you start without capabilities and add only the ones (--cap-add) that your container requires.

For instance, systemd-tmpfiles-setup-dev.service, the service that is causing this issue, depends on `CAP_SYS_MODULE`.

Does this container need to load kernel modules? Probably not, so you can try adding --cap-drop=CAP_SYS_MODULE to your docker run.
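To verify whether the container's init process actually holds `CAP_SYS_MODULE`, its effective capability mask can be read from /proc and tested for bit 16, the number assigned to `CAP_SYS_MODULE` in the kernel's capability header. The helper names `has_cap` and `cap_eff_of` are hypothetical, and this is a sketch rather than a procedure given in this comment.

```shell
# CAP_SYS_MODULE is capability number 16 in <linux/capability.h>.
CAP_SYS_MODULE_BIT=16

# Succeeds when bit $2 is set in the hexadecimal capability mask $1
# (the mask is given without a 0x prefix, as printed in /proc/<pid>/status).
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Print the effective capability mask (CapEff) of a PID; defaults to this shell.
cap_eff_of() {
    awk '/^CapEff:/ { print $2 }' "/proc/${1:-self}/status"
}

# Intended use on the host (comments only; $pid is the container's init PID):
#   if has_cap "$(cap_eff_of "$pid")" "$CAP_SYS_MODULE_BIT"; then
#       echo "container init holds CAP_SYS_MODULE"
#   fi
```

For example, a full mask such as 0000003fffffffff has bit 16 set, while 00000000a80425fb, a mask commonly seen for containers started without extra capabilities, does not.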

