1466848 – Restart of atomic-openshift-node container terminates pod glusterfs mount

Summary: Restart of atomic-openshift-node container terminates pod glusterfs mount

Keywords:
Status:	CLOSED DUPLICATE of bug 1472370
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Storage
Sub Component:
Version:	3.5.0
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	3.7.0
Assignee:	Jan Safranek
QA Contact:	Jianwei Hou
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2017-06-30 14:23 UTC by Jan Safranek
Modified:	2017-09-11 08:48 UTC
CC List:	19 users
Fixed In Version:
Doc Type:
Doc Text:
Clone Of:	1423640
Environment:
Last Closed:	2017-09-11 08:48:15 UTC
Target Upstream Version:
Embargoed:

Comment 1 Jan Safranek 2017-06-30 14:27:10 UTC

I lowered the severity relative to the original bug; AFAIK no customer has complained so far.

Comment 2 Jan Safranek 2017-06-30 14:29:42 UTC

Pruned bug dependencies

Comment 3 Jan Safranek 2017-06-30 14:43:34 UTC

I am talking to the local systemd developers about escaping a docker container properly, so that the fuse daemon really runs on the host and a restart of the docker container won't kill it.

Option 1:
Newer systemd (v233?) ships systemd-mount, which creates a transient unit that performs the mount. The fuse daemon would probably run in that unit's context. In the container we would probably run 'nsenter --mount=/rootfs/proc/1/ns/mnt -- /bin/systemd-mount -t glusterfs -o <opts> <what> <where>' (testing needed).

Unfortunately, systemd in RHEL7 is too old to include systemd-mount, and a rebase is not planned. A backport could be possible, though.

Option 2:
systemd in RHEL7 has the systemd-run command, which creates a transient service and executes a command in it. kubelet would run 'nsenter --mount=/rootfs/proc/1/ns/mnt -- /bin/systemd-run /bin/mount -t glusterfs -o <opts> <what> <where>'. Again, testing is needed, as I am not sure whether systemd would kill the service once /bin/mount finishes and only the glusterfs fuse daemon is left running.

I'm investigating these options.

Obviously, both options make the openshift-node container dependent on the host running systemd. So far that has not been a hard requirement.

Any other smart ideas on how to escape a container are welcome.

Comment 4 Jan Safranek 2017-06-30 15:17:08 UTC

Tested option 2; this appears to work:

nsenter --mount=/rootfs/proc/1/ns/mnt -- systemd-run --scope /bin/mount -t glusterfs 172.17.0.2:test_vol /var/lib/origin/openshift.local.volumes/xyz

(and nsenter --mount=/rootfs/proc/1/ns/mnt -- umount /var/lib/origin/openshift.local.volumes/xyz)

- the glusterfs fuse daemon runs in its own systemd scope (= cgroup) with a random name (run-11615.scope)
- it is not killed when /bin/mount finishes
- it is killed by unmount
- the scope is automatically deleted when its last process dies, i.e. after unmount
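One way to check the first observation is to look at the daemon's entry in /proc/<pid>/cgroup. The helpers below are only a sketch: the function names are hypothetical, and they assume the usual "hierarchy-ID:controllers:path" line format with the transient unit as the last path component.

```shell
# Hypothetical helpers, not part of kubelet or systemd.

# unit_from_cgroup_line LINE
# Takes one line of /proc/<pid>/cgroup ("ID:controllers:/path") and
# prints the last path component, e.g. "run-11615.scope".
unit_from_cgroup_line() {
    path=${1#*:}       # drop the hierarchy ID
    path=${path#*:}    # drop the controller list
    printf '%s\n' "${path##*/}"
}

# scope_of_pid PID
# Picks the systemd hierarchy line (cgroup v1 "name=systemd", or the
# unified "0::" line) for PID and prints the unit it belongs to.
scope_of_pid() {
    line=$(grep -E '^[0-9]+:(name=systemd)?:' "/proc/$1/cgroup" | head -n 1)
    unit_from_cgroup_line "$line"
}
```

Assuming the PID of the glusterfs fuse daemon is known, 'scope_of_pid <pid>' run on the host should print the transient scope name while the volume is mounted.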


That brings us to a hard dependency on systemd on the host... In OpenShift that's probably OK; I am not sure about upstream.

Comment 5 Jan Safranek 2017-07-03 13:19:31 UTC

Created https://github.com/kubernetes/kubernetes/pull/48430. The systemd-run call above is used when systemd-run is available on the host; otherwise a simple 'nsenter --mount=/rootfs/proc/1/ns/mnt -- mount' is used.
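The fallback described here can be sketched in shell as follows. This is not the code from the pull request: the function names are hypothetical, the helper only prints the command line instead of running it, and the way systemd-run is detected is an assumption.

```shell
# Hypothetical sketch of the fallback; not the actual kubelet code.
HOST_MNT_NS=/rootfs/proc/1/ns/mnt   # host mount namespace, as in the comments above

# has_systemd_run
# Assumption: availability is detected by looking systemd-run up on the
# host's PATH through nsenter. The real check may differ.
has_systemd_run() {
    nsenter --mount="$HOST_MNT_NS" -- sh -c 'command -v systemd-run' >/dev/null 2>&1
}

# build_mount_cmd USE_SYSTEMD FSTYPE WHAT WHERE
# Prints the command to run. With USE_SYSTEMD=yes the mount is wrapped in
# a transient scope so the fuse daemon outlives the container.
build_mount_cmd() {
    use_systemd=$1; fstype=$2; what=$3; where=$4
    if [ "$use_systemd" = "yes" ]; then
        echo "nsenter --mount=$HOST_MNT_NS -- systemd-run --scope /bin/mount -t $fstype $what $where"
    else
        echo "nsenter --mount=$HOST_MNT_NS -- mount -t $fstype $what $where"
    fi
}
```

A caller could choose the branch with something like 'if has_systemd_run; then use=yes; else use=no; fi' and then pass "$use" to build_mount_cmd.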

Comment 6 Jan Safranek 2017-09-11 08:48:15 UTC


*** This bug has been marked as a duplicate of bug 1472370 ***
