Bug 1567851 - [cee/sd][ceph-container] after update on RHEL 7.5, ceph containers are not starting if selinux is Enforcing
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Container
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 3.1
Assignee: Sébastien Han
QA Contact: Harish NV Rao
Docs Contact: Aron Gunn
URL:
Whiteboard:
Depends On:
Blocks: 1572368
 
Reported: 2018-04-16 09:58 UTC by Tomas Petr
Modified: 2018-09-26 19:17 UTC
CC List: 13 users

Fixed In Version: ceph-ansible-3.1.0-0.1.rc9.el7cp Ubuntu: ceph-ansible_3.1.0~rc9-2redhat1
Doc Type: Bug Fix
Doc Text:
.Upgrading Red Hat Enterprise Linux from 7.4 to 7.5 no longer fails to start the Ceph containers when SELinux enforcing mode is enabled
Previously, when upgrading Red Hat Enterprise Linux 7.4 to 7.5 with SELinux in enforcing mode, containerized Red Hat Ceph Storage deployments would fail to start after a reboot. With this release, the `chcon` calls have been removed, allowing the Ceph containers to run when SELinux enforcing mode is enabled.
Clone Of:
Environment:
Last Closed: 2018-09-26 19:16:42 UTC
Embargoed:




Links
Github ceph/ceph-ansible pull 2526 (closed): "selinux: remove chcon calls" (last updated 2021-02-10 11:17:17 UTC)
Red Hat Knowledge Base Solution 3421431 (last updated 2018-04-23 15:51:03 UTC)
Red Hat Product Errata RHBA-2018:2820 (last updated 2018-09-26 19:17:25 UTC)

Description Tomas Petr 2018-04-16 09:58:06 UTC
Description of problem:
We have deployed a containerized RHCS 3 environment on RHEL 7.4 with SELinux enforcing.
After "yum update -y; reboot" to RHEL 7.5, the Ceph containers are not starting and fail with permission denied:
-------
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com dockerd-current[1125]: mktemp: failed to create directory via template '/var/lib/ceph/tmp/tmp.XXXXXXXXXX': Permission denied
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com docker[28805]: mktemp: failed to create directory via template '/var/lib/ceph/tmp/tmp.XXXXXXXXXX': Permission denied
-------
SELinux context has changed:
[root@mons-2 ~]# ls -laZ /var/lib/ceph/tmp/
drwxr-x---. ceph ceph system_u:object_r:ceph_var_lib_t:s0 .
drwxr-x---. ceph ceph system_u:object_r:ceph_var_lib_t:s0 ..
drwx------. ceph ceph system_u:object_r:ceph_var_lib_t:s0 tmp.EJduARzNRI < not running
drwx------. ceph ceph system_u:object_r:ceph_var_lib_t:s0 tmp.kQeOR6UmnN < not running
drwx------. ceph ceph system_u:object_r:container_file_t:s0 tmp.t8zLsF7KjB <-- previously container running fine
drwx------. ceph ceph system_u:object_r:container_file_t:s0 tmp.U9NNOM2aXC <-- previously container running fine
[root@mons-2 ~]# ls -la /var/lib/ceph/tmp/
total 0
drwxr-x---.  6 ceph ceph  94 Apr 16 05:16 .
drwxr-x---. 13 ceph ceph 181 Mar  8 16:41 ..
drwx------.  2 ceph ceph   6 Apr 16 05:16 tmp.EJduARzNRI < not running
drwx------.  2 ceph ceph   6 Apr 16 05:16 tmp.kQeOR6UmnN < not running
drwx------.  2 ceph ceph   6 Apr 16 02:07 tmp.t8zLsF7KjB <-- previously container running fine
drwx------.  2 ceph ceph   6 Apr 16 02:03 tmp.U9NNOM2aXC <-- previously container running fine
-------

If we set SELinux to permissive mode, the containers start OK.
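
For reference, checking and temporarily switching the SELinux mode on a host looks like this (standard RHEL commands; permissive mode is only a diagnostic workaround, not the fix):
# getenforce          <-- shows the current mode (Enforcing/Permissive/Disabled)
# setenforce 0        <-- temporarily switch to permissive until the next reboot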


Version-Release number of selected component (if applicable):
selinux-policy-targeted-3.13.1-192.el7_5.3.noarch
libselinux-2.5-12.el7.x86_64
libselinux-utils-2.5-12.el7.x86_64
selinux-policy-3.13.1-192.el7_5.3.noarch
libselinux-python-2.5-12.el7.x86_64
ceph-selinux-12.2.1-46.el7cp.x86_64
container-selinux-2.55-1.el7.noarch


How reproducible:
always

Steps to Reproduce:
1. Deploy containerized RHCS 3 on RHEL 7.4 with SELinux enforcing
2. yum update -y
3. reboot
4. Ceph containers do not start
5. Set SELinux to permissive mode; containers start

Actual results:


Expected results:


Additional info:

Comment 3 Tomas Petr 2018-04-16 09:59:04 UTC
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com dockerd-current[1125]: time="2018-04-16T05:49:09.624917374-04:00" level=error msg="Handler for POST /v1.26/containers/ceph-mgr-mons-2/stop returned error: No such container: ceph-mgr-mons-2"
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com docker[28783]: Error response from daemon: No such container: ceph-mgr-mons-2
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com dockerd-current[1125]: time="2018-04-16T05:49:09.625057488-04:00" level=error msg="Handler for DELETE /v1.26/containers/ceph-mon-mons-2 returned error: No such container: ceph-mon-mons-2"
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com dockerd-current[1125]: time="2018-04-16T05:49:09.625415462-04:00" level=error msg="Handler for POST /v1.26/containers/ceph-mgr-mons-2/stop returned error: No such container: ceph-mgr-mons-2"
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com dockerd-current[1125]: time="2018-04-16T05:49:09.625496562-04:00" level=error msg="Handler for DELETE /v1.26/containers/ceph-mon-mons-2 returned error: No such container: ceph-mon-mons-2"
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com docker[28782]: Error response from daemon: No such container: ceph-mon-mons-2
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com systemd[1]: Started Ceph Monitor.
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com dockerd-current[1125]: time="2018-04-16T05:49:09.655240668-04:00" level=error msg="Handler for DELETE /v1.26/containers/ceph-mgr-mons-2 returned error: No such container: ceph-mgr-mons-2"
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com dockerd-current[1125]: time="2018-04-16T05:49:09.655495130-04:00" level=error msg="Handler for DELETE /v1.26/containers/ceph-mgr-mons-2 returned error: No such container: ceph-mgr-mons-2"
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com docker[28792]: Error response from daemon: No such container: ceph-mgr-mons-2
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com systemd[1]: Started Ceph Manager.
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com systemd[1]: Started libcontainer container 22e05c238d0796645f4b1aead31489037ab7b7bad6bb533a3e65af5f2fd1730e.
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com systemd[1]: Starting libcontainer container 22e05c238d0796645f4b1aead31489037ab7b7bad6bb533a3e65af5f2fd1730e.
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com systemd[1]: Started libcontainer container dfe00a1dd294334971d65714e1a5d24cc4d473c54ff13f46591168e92d652c6d.
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com systemd[1]: Starting libcontainer container dfe00a1dd294334971d65714e1a5d24cc4d473c54ff13f46591168e92d652c6d.
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com kernel: SELinux: mount invalid.  Same superblock, different security settings for (dev mqueue, type mqueue)
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com kernel: SELinux: mount invalid.  Same superblock, different security settings for (dev mqueue, type mqueue)
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com oci-systemd-hook[28868]: systemdhook <debug>: 22e05c238d07: Skipping as container command is /entrypoint.sh, not init or systemd
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com oci-umount[28873]: umounthook <debug>: prestart container_id:22e05c238d07 rootfs:/var/lib/docker/overlay2/cc4667de7e9ce5c64f306c46e374feaff8ac1ba071b3024b3592f34a63f00fca/merged
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com oci-systemd-hook[28875]: systemdhook <debug>: dfe00a1dd294: Skipping as container command is /entrypoint.sh, not init or systemd
Apr 16 05:49:09 mons-2.container.quicklab.pnq2.cee.redhat.com oci-umount[28876]: umounthook <debug>: prestart container_id:dfe00a1dd294 rootfs:/var/lib/docker/overlay2/41ad62c1f0bd2812224b8288a7efe717fcf4b12b7eba79ff56dc6b260a0d44e8/merged
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com dockerd-current[1125]: mktemp: failed to create directory via template '/var/lib/ceph/tmp/tmp.XXXXXXXXXX': Permission denied
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com docker[28805]: mktemp: failed to create directory via template '/var/lib/ceph/tmp/tmp.XXXXXXXXXX': Permission denied
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com dockerd-current[1125]: mktemp: failed to create directory via template '/var/lib/ceph/tmp/tmp.XXXXXXXXXX': Permission denied
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com docker[28794]: mktemp: failed to create directory via template '/var/lib/ceph/tmp/tmp.XXXXXXXXXX': Permission denied
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com oci-systemd-hook[28946]: systemdhook <debug>: dfe00a1dd294: Skipping as container command is /entrypoint.sh, not init or systemd
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com oci-umount[28947]: umounthook <debug>: dfe00a1dd294: only runs in prestart stage, ignoring
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com dockerd-current[1125]: time="2018-04-16T05:49:10.107175247-04:00" level=error msg="containerd: deleting container" error="exit status 1: \"container dfe00a1dd294334971d65714e1a5d24cc4d473c54ff13f46591168e92d652c6d does not exist\\none or more of the container deletions failed\\n\""
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com oci-systemd-hook[28958]: systemdhook <debug>: 22e05c238d07: Skipping as container command is /entrypoint.sh, not init or systemd
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com oci-umount[28959]: umounthook <debug>: 22e05c238d07: only runs in prestart stage, ignoring
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com dockerd-current[1125]: time="2018-04-16T05:49:10.119029744-04:00" level=error msg="containerd: deleting container" error="exit status 1: \"container 22e05c238d0796645f4b1aead31489037ab7b7bad6bb533a3e65af5f2fd1730e does not exist\\none or more of the container deletions failed\\n\""
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com dockerd-current[1125]: time="2018-04-16T05:49:10.149779899-04:00" level=warning msg="dfe00a1dd294334971d65714e1a5d24cc4d473c54ff13f46591168e92d652c6d cleanup: failed to unmount secrets: invalid argument"
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com dockerd-current[1125]: time="2018-04-16T05:49:10.182241158-04:00" level=warning msg="22e05c238d0796645f4b1aead31489037ab7b7bad6bb533a3e65af5f2fd1730e cleanup: failed to unmount secrets: invalid argument"
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com systemd[1]: ceph-mgr: main process exited, code=exited, status=1/FAILURE
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com dockerd-current[1125]: time="2018-04-16T05:49:10.219366441-04:00" level=error msg="Handler for POST /v1.26/containers/ceph-mgr-mons-2/stop returned error: No such container: ceph-mgr-mons-2"
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com dockerd-current[1125]: time="2018-04-16T05:49:10.219649961-04:00" level=error msg="Handler for POST /v1.26/containers/ceph-mgr-mons-2/stop returned error: No such container: ceph-mgr-mons-2"
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com docker[28966]: Error response from daemon: No such container: ceph-mgr-mons-2
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com systemd[1]: Unit ceph-mgr entered failed state.
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com systemd[1]: ceph-mgr failed.
Apr 16 05:49:10 mons-2.container.quicklab.pnq2.cee.redhat.com systemd[1]: ceph-mon: main process exited, code=exited, status=1/FAILURE

Comment 4 Tomas Petr 2018-04-16 10:00:02 UTC
SELinux is preventing /usr/bin/mktemp from write access on the directory tmp.

*****  Plugin catchall (100. confidence) suggests   **************************

If you believe that mktemp should be allowed write access on the tmp directory by default.
Then you should report this as a bug.
You can generate a local policy module to allow this access.
Do
allow this access for now by executing:
# ausearch -c 'mktemp' --raw | audit2allow -M my-mktemp
# semodule -i my-mktemp.pp


Additional Information:
Source Context                system_u:system_r:container_t:s0:c384,c657
Target Context                system_u:object_r:ceph_var_lib_t:s0
Target Objects                tmp [ dir ]
Source                        mktemp
Source Path                   /usr/bin/mktemp
Port                          <Unknown>
Host                          <Unknown>
Source RPM Packages           coreutils-8.22-21.el7.x86_64
Target RPM Packages           
Policy RPM                    selinux-policy-3.13.1-192.el7_5.3.noarch
Selinux Enabled               True
Policy Type                   targeted
Enforcing Mode                Enforcing
Host Name                     mons-2.container.quicklab.pnq2.cee.redhat.com
Platform                      Linux
                              mons-2.container.quicklab.pnq2.cee.redhat.com
                              3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51
                              EDT 2018 x86_64 x86_64
Alert Count                   1
First Seen                    2018-04-16 05:35:14 EDT
Last Seen                     2018-04-16 05:35:14 EDT
Local ID                      59faa3b1-010e-45ea-884f-50328bc0f65e

Raw Audit Messages
type=AVC msg=audit(1523871314.212:1560): avc:  denied  { write } for  pid=13477 comm="mktemp" name="tmp" dev="vda1" ino=41952155 scontext=system_u:system_r:container_t:s0:c384,c657 tcontext=system_u:object_r:ceph_var_lib_t:s0 tclass=dir


type=SYSCALL msg=audit(1523871314.212:1560): arch=x86_64 syscall=mkdir success=no exit=EACCES a0=1d050b0 a1=1c0 a2=22 a3=7ffdee0816a0 items=0 ppid=13476 pid=13477 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm=mktemp exe=/usr/bin/mktemp subj=system_u:system_r:container_t:s0:c384,c657 key=(null)

Hash: mktemp,container_t,ceph_var_lib_t,dir,write

Comment 5 Tomas Petr 2018-04-16 10:05:19 UTC
Trying to work around it with SELinux enforcing:

# audit2allow -a
#============= container_t ==============
allow container_t ceph_var_lib_t:dir { add_name create read setattr write };


ausearch -c 'mktemp' --raw | audit2allow -M my-mktemp
semodule -i my-mktemp.pp


Then it fails with:

Apr 16 05:56:36 mons-2.container.quicklab.pnq2.cee.redhat.com docker[4771]: Error response from daemon: No such container: ceph-mgr-mons-2
Apr 16 05:56:36 mons-2.container.quicklab.pnq2.cee.redhat.com systemd[1]: Started Ceph Manager.
Apr 16 05:56:36 mons-2.container.quicklab.pnq2.cee.redhat.com systemd[1]: Started libcontainer container 2a001c107195ff5d380b7b84b6a4d0480d6d497031a3767e1e623920792bd954.
Apr 16 05:56:36 mons-2.container.quicklab.pnq2.cee.redhat.com systemd[1]: Starting libcontainer container 2a001c107195ff5d380b7b84b6a4d0480d6d497031a3767e1e623920792bd954.
Apr 16 05:56:36 mons-2.container.quicklab.pnq2.cee.redhat.com kernel: SELinux: mount invalid.  Same superblock, different security settings for (dev mqueue, type mqueue)
Apr 16 05:56:36 mons-2.container.quicklab.pnq2.cee.redhat.com oci-systemd-hook[4809]: systemdhook <debug>: 2a001c107195: Skipping as container command is /entrypoint.sh, not init or systemd
Apr 16 05:56:36 mons-2.container.quicklab.pnq2.cee.redhat.com oci-umount[4810]: umounthook <debug>: prestart container_id:2a001c107195 rootfs:/var/lib/docker/overlay2/d8030f632de9f64667aafadd432271812a1b12af10bce12e71ba144a4c679f35/merged
Apr 16 05:56:36 mons-2.container.quicklab.pnq2.cee.redhat.com dockerd-current[1125]: find: '/var/lib/ceph/': Permission denied
Apr 16 05:56:36 mons-2.container.quicklab.pnq2.cee.redhat.com docker[4751]: find: '/var/lib/ceph/': Permission denied
Apr 16 05:56:36 mons-2.container.quicklab.pnq2.cee.redhat.com oci-systemd-hook[4895]: systemdhook <debug>: 2a001c107195: Skipping as container command is /entrypoint.sh, not init or systemd
Apr 16 05:56:36 mons-2.container.quicklab.pnq2.cee.redhat.com oci-umount[4896]: umounthook <debug>: 2a001c107195: only runs in prestart stage, ignoring
Apr 16 05:56:36 mons-2.container.quicklab.pnq2.cee.redhat.com systemd[1]: Started libcontainer container 7d20529749d35c1340a7e5031f0a183cc393a910599be725a43acbdbb0107428.
Apr 16 05:56:36 mons-2.container.quicklab.pnq2.cee.redhat.com systemd[1]: Starting libcontainer container 7d20529749d35c1340a7e5031f0a183cc393a910599be725a43acbdbb0107428.


[root@mons-2 ~]# audit2allow -a
#============= container_t ==============
allow container_t ceph_var_lib_t:dir { read setattr };

#!!!! This avc is allowed in the current policy
allow container_t ceph_var_lib_t:dir { add_name create write };

--------------------------------------------------------------------------------

SELinux is preventing /usr/bin/find from read access on the directory ceph.

*****  Plugin catchall (100. confidence) suggests   **************************

If you believe that find should be allowed read access on the ceph directory by default.
Then you should report this as a bug.
You can generate a local policy module to allow this access.
Do
allow this access for now by executing:
# ausearch -c 'find' --raw | audit2allow -M my-find
# semodule -i my-find.pp


Additional Information:
Source Context                system_u:system_r:container_t:s0:c242,c847
Target Context                system_u:object_r:ceph_var_lib_t:s0
Target Objects                ceph [ dir ]
Source                        find
Source Path                   /usr/bin/find
Port                          <Unknown>
Host                          <Unknown>
Source RPM Packages           findutils-4.5.11-5.el7.x86_64
Target RPM Packages           
Policy RPM                    selinux-policy-3.13.1-192.el7_5.3.noarch
Selinux Enabled               True
Policy Type                   targeted
Enforcing Mode                Enforcing
Host Name                     mons-2.container.quicklab.pnq2.cee.redhat.com
Platform                      Linux
                              mons-2.container.quicklab.pnq2.cee.redhat.com
                              3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51
                              EDT 2018 x86_64 x86_64
Alert Count                   1
First Seen                    2018-04-16 06:02:36 EDT
Last Seen                     2018-04-16 06:02:36 EDT
Local ID                      1324c97d-b7a4-43ab-b3f4-68007d87e5a3

Raw Audit Messages
type=AVC msg=audit(1523872956.294:5799): avc:  denied  { read } for  pid=13824 comm="find" name="ceph" dev="vda1" ino=29360673 scontext=system_u:system_r:container_t:s0:c242,c847 tcontext=system_u:object_r:ceph_var_lib_t:s0 tclass=dir


type=SYSCALL msg=audit(1523872956.294:5799): arch=x86_64 syscall=openat success=no exit=EACCES a0=ffffffffffffff9c a1=1e01a40 a2=10900 a3=0 items=0 ppid=13706 pid=13824 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm=find exe=/usr/bin/find subj=system_u:system_r:container_t:s0:c242,c847 key=(null)

Hash: find,container_t,ceph_var_lib_t,dir,read

[root@mons-2 ~]# ausearch -c 'find' --raw | audit2allow -M my-find; semodule -i my-find.pp
******************** IMPORTANT ***********************
To make this policy package active, execute:

semodule -i my-find.pp


Then finally the mon and mgr containers start:

[root@mons-2 ~]#  ps -ef | egrep "ceph|rados|docker"
root      1125     1  1 05:25 ?        00:00:23 /usr/bin/dockerd-current --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current --default-runtime=docker-runc --authorization-plugin=rhel-push-plugin --exec-opt native.cgroupdriver=systemd --userland-proxy-path=/usr/libexec/docker/docker-proxy-current --init-path=/usr/libexec/docker/docker-init-current --seccomp-profile=/etc/docker/seccomp.json --selinux-enabled --log-driver=journald --signature-verification=false --add-registry registry.access.redhat.com
root      1222  1125  0 05:25 ?        00:00:04 /usr/bin/docker-containerd-current -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --shim docker-containerd-shim --runtime docker-runc --runtime-args --systemd-cgroup=true
root      1411     1  0 05:25 ?        00:00:01 /usr/libexec/docker/rhel-push-plugin
root     15586     1  1 06:03 ?        00:00:00 /usr/bin/docker-current run --rm --name ceph-mon-mons-2 --net=host --memory=1g --cpu-quota=100000 -v /var/lib/ceph:/var/lib/ceph -v /etc/ceph:/etc/ceph -v /etc/localtime:/etc/localtime:ro --net=host -e IP_VERSION=4 -e MON_IP=10.74.157.245 -e CLUSTER=ceph -e FSID=78a165f7-8ec1-4595-9950-a8b7672dc9e9 -e CEPH_PUBLIC_NETWORK=10.74.156.0/22 -e CEPH_DAEMON=MON registry.access.redhat.com/rhceph/rhceph-3-rhel7:latest
root     15597     1  1 06:03 ?        00:00:00 /usr/bin/docker-current run --rm --net=host --memory=1g --cpu-quota=100000 -v /var/lib/ceph:/var/lib/ceph -v /etc/ceph:/etc/ceph -v /etc/localtime:/etc/localtime:ro -e CLUSTER=ceph -e CEPH_DAEMON=MGR --name=ceph-mgr-mons-2 registry.access.redhat.com/rhceph/rhceph-3-rhel7:latest
root     15618  1222  0 06:03 ?        00:00:00 /usr/bin/docker-containerd-shim-current 371aee859610d7782f25057d8845f30458fb44cf6f752b44b1fbcd05c0446129 /var/run/docker/libcontainerd/371aee859610d7782f25057d8845f30458fb44cf6f752b44b1fbcd05c0446129 /usr/libexec/docker/docker-runc-current
root     15631  1222  0 06:03 ?        00:00:00 /usr/bin/docker-containerd-shim-current 3c1c7bc35804975b72e99ffafb73a614304497243162bc538e2a22717719726b /var/run/docker/libcontainerd/3c1c7bc35804975b72e99ffafb73a614304497243162bc538e2a22717719726b /usr/libexec/docker/docker-runc-current
root     16003 15637  0 06:03 ?        00:00:00 timeout 7 ceph --cluster ceph mon add mons-2 10.74.157.245:6789
root     16008 16003 16 06:03 ?        00:00:00 /usr/bin/python2.7 /usr/bin/ceph --cluster ceph mon add mons-2 10.74.157.245:6789
ceph     16012 15653  6 06:03 ?        00:00:00 /usr/bin/ceph-mgr --cluster ceph --setuser ceph --setgroup ceph -d -i mons-2
root     16044  2064  0 06:03 pts/0    00:00:00 grep -E --color=auto ceph|rados|docker

Comment 6 Tomas Petr 2018-04-16 11:13:09 UTC
Note:
The problem showed up on the RGW and MON+MGR nodes.
It did not show up on the OSD nodes at all; the OSDs started normally and there was no sealert.
Not tested with MDS and iSCSI.

Comment 7 Sébastien Han 2018-04-17 13:23:17 UTC
I think I know what this is. Trying a patch for this.

Comment 8 Sébastien Han 2018-04-17 13:35:38 UTC
Hopefully with this patch Docker will maintain the SELinux labels across reboots.
The upstream CI does not run on 7.5 since it is not released yet, so we can't fully validate this, but we can at least make sure we don't break things.

Comment 9 Sébastien Han 2018-04-17 15:37:50 UTC
Tomas, could you please relabel /var/lib/ceph and /etc/ceph directories like this after the reboot:

chcon -Rt svirt_sandbox_file_t /var/lib/ceph /etc/ceph


That should be the workaround for this issue.
Thanks.

Comment 10 Tomas Petr 2018-04-17 15:51:21 UTC
(In reply to leseb from comment #9)
> Tomas, could you please relabel /var/lib/ceph and /etc/ceph directories like
> this after the reboot:
> 
> chcon -Rt svirt_sandbox_file_t /var/lib/ceph /etc/ceph
> 
> 
> That should be the workaround for this issue.
> Thanks.

Hi Sebastien,
thanks, that workaround worked as well:

# chcon -Rt svirt_sandbox_file_t /var/lib/ceph /etc/ceph

[root@mons-1 ~]# ps -ef | grep ceph
root     24378     1  0 11:47 ?        00:00:00 /usr/bin/docker-current run --rm --name ceph-mon-mons-1 --net=host --memory=1g --cpu-quota=100000 -v /var/lib/ceph:/var/lib/ceph -v /etc/ceph:/etc/ceph -v /etc/localtime:/etc/localtime:ro --net=host -e IP_VERSION=4 -e MON_IP=10.74.157.247 -e CLUSTER=ceph -e FSID=78a165f7-8ec1-4595-9950-a8b7672dc9e9 -e CEPH_PUBLIC_NETWORK=10.74.156.0/22 -e CEPH_DAEMON=MON registry.access.redhat.com/rhceph/rhceph-3-rhel7:latest
root     24407     1  0 11:47 ?        00:00:00 /usr/bin/docker-current run --rm --net=host --memory=1g --cpu-quota=100000 -v /var/lib/ceph:/var/lib/ceph -v /etc/ceph:/etc/ceph -v /etc/localtime:/etc/localtime:ro -e CLUSTER=ceph -e CEPH_DAEMON=MGR --name=ceph-mgr-mons-1 registry.access.redhat.com/rhceph/rhceph-3-rhel7:latest
ceph     24627 24508  1 11:47 ?        00:00:00 /usr/bin/ceph-mgr --cluster ceph --setuser ceph --setgroup ceph -d -i mons-1
ceph     24675 24428  1 11:47 ?        00:00:00 /usr/bin/ceph-mon --cluster ceph --setuser ceph --setgroup ceph -d -i mons-1 --mon-data /var/lib/ceph/mon/ceph-mons-1 --public-addr 10.74.157.247:6789

Regards, Tomas

Comment 11 Sébastien Han 2018-04-17 16:04:27 UTC
This is annoying though, as I don't reproduce this on 7.4.
I hope my patch will re-apply the label each time the container starts; otherwise, I'll fix that differently.

Tomas, since you have the env, could you please reboot again (I'm expecting you to lose the SELinux labels), then edit one of the systemd units and change the -v options like so:

https://github.com/ceph/ceph-ansible/pull/2526/files#diff-9d6f3197018a38e7f772112f9bd62f19R17

Basically, add :z at the end of each -v for /var/lib/ceph and /etc/ceph.
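
For illustration, the relevant bind mounts in the unit file's docker run command would then look like this (same paths as the existing -v mounts in this deployment):
  -v /var/lib/ceph:/var/lib/ceph:z \
  -v /etc/ceph:/etc/ceph:z \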

Then reboot again and see if the labels get re-applied. If you could help with this, that would be really great.

I'm thinking of keeping the :z even if the labels don't get re-applied anyway. If that's the case, I'll fix the unit file itself by adding a pre-start command (chcon).
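
A minimal sketch of what such a pre-start relabel could look like in the unit file (hypothetical alternative only; the linked PR 2526 went the :z route and removed the chcon calls instead):
# hypothetical ExecStartPre sketch, not what was ultimately merged
ExecStartPre=-/usr/bin/chcon -Rt svirt_sandbox_file_t /var/lib/ceph /etc/ceph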

Thanks in advance!

Comment 12 Tomas Petr 2018-04-17 16:19:44 UTC
(In reply to leseb from comment #11)
> This is annoying though, as I don't reproduce this on 7.4.
> I hope my patch will re-apply the label each time the container starts;
> otherwise, I'll fix that differently.
> 
> Tomas, since you have the env, could you please reboot again (I'm expecting
> you to lose the SELinux labels), then edit one of the systemd units and
> change the -v options like so:
> 
> https://github.com/ceph/ceph-ansible/pull/2526/files#diff-9d6f3197018a38e7f772112f9bd62f19R17
> 
> Basically, add :z at the end of each -v for /var/lib/ceph and /etc/ceph.
> 
> Then reboot again and see if the labels get re-applied. If you could help
> with this, that would be really great.
> 
> I'm thinking of keeping the :z even if the labels don't get re-applied
> anyway. If that's the case, I'll fix the unit file itself by adding a
> pre-start command (chcon).
> 
> Thanks in advance!

Seems it worked as well:

[root@mons-0 ~]# cat /etc/systemd/system/ceph-mgr@.service 
[Unit]
Description=Ceph Manager
After=docker.service

[Service]
EnvironmentFile=-/etc/environment
ExecStartPre=-/usr/bin/docker stop ceph-mgr-mons-0
ExecStartPre=-/usr/bin/docker rm ceph-mgr-mons-0
ExecStart=/usr/bin/docker run --rm --net=host \
  --memory=1g \
  --cpu-quota=100000 \
  -v /var/lib/ceph:/var/lib/ceph:z \ #<--------
  -v /etc/ceph:/etc/ceph:z \  #<--------
  -v /etc/localtime:/etc/localtime:ro \
  -e CLUSTER=ceph \
  -e CEPH_DAEMON=MGR \
   \
  --name=ceph-mgr-mons-0 \
  registry.access.redhat.com/rhceph/rhceph-3-rhel7:latest
ExecStopPost=-/usr/bin/docker stop ceph-mgr-mons-0
Restart=always
RestartSec=10s
TimeoutStartSec=120
TimeoutStopSec=15

[Install]
WantedBy=multi-user.target
----------------------------
[root@mons-0 ~]# reboot

[root@mons-0 ~]# ps -ef | egrep "ceph"
root      1422     1  0 12:16 ?        00:00:00 /usr/bin/docker-current run --rm --name ceph-mon-mons-0 --net=host --memory=1g --cpu-quota=100000 -v /var/lib/ceph:/var/lib/ceph -v /etc/ceph:/etc/ceph -v /etc/localtime:/etc/localtime:ro --net=host -e IP_VERSION=4 -e MON_IP=10.74.157.243 -e CLUSTER=ceph -e FSID=78a165f7-8ec1-4595-9950-a8b7672dc9e9 -e CEPH_PUBLIC_NETWORK=10.74.156.0/22 -e CEPH_DAEMON=MON registry.access.redhat.com/rhceph/rhceph-3-rhel7:latest
root      1434     1  0 12:16 ?        00:00:00 /usr/bin/docker-current run --rm --net=host --memory=1g --cpu-quota=100000 -v /var/lib/ceph:/var/lib/ceph:z -v /etc/ceph:/etc/ceph:z -v /etc/localtime:/etc/localtime:ro -e CLUSTER=ceph -e CEPH_DAEMON=MGR --name=ceph-mgr-mons-0 registry.access.redhat.com/rhceph/rhceph-3-rhel7:latest
ceph      1688  1497  2 12:16 ?        00:00:01 /usr/bin/ceph-mgr --cluster ceph --setuser ceph --setgroup ceph -d -i mons-0
ceph      1731  1496  1 12:16 ?        00:00:00 /usr/bin/ceph-mon --cluster ceph --setuser ceph --setgroup ceph -d -i mons-0 --mon-data /var/lib/ceph/mon/ceph-mons-0 --public-addr 10.74.157.243:6789

Comment 13 Sébastien Han 2018-04-17 16:28:48 UTC
Thanks a lot Tomas for validating this for me.

Comment 14 Federico Lucifredi 2018-06-18 14:08:13 UTC
Fix already in 3.1.

Comment 15 Ken Dreyer (Red Hat) 2018-07-10 18:17:06 UTC
"selinux: remove chcon calls" has been in stable-3.1 for a long time (since v3.1.0beta8).

I'm setting Fixed In Version to the latest available ceph-ansible NVR.

Comment 18 Vasishta 2018-08-09 07:53:04 UTC
Hi,

We are planning to verify the fix with the following steps:

1) Using the latest available ceph-ansible and container image (the ones that come with RHCS 3.1), with SELinux in enforcing mode, create a cluster with OSDs of different scenarios on RHEL 7.4.

2) Upgrade the host OS to 7.5, reboot all nodes in a rolling fashion, and verify that all daemons are up and running once each host is back after the reboot (a minimal per-node check is sketched below).
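
A per-node spot check after each reboot could look like this (sketch only; exact container and unit names depend on the deployment):
# getenforce                      <-- confirm the node is still in Enforcing mode
# docker ps --filter name=ceph    <-- all expected ceph containers should be listed
# systemctl --failed | grep ceph  <-- no ceph units should be in failed state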

Kindly let us know before EOD tomorrow (Friday) if there are any concerns about the steps we have planned.

Regards,
Vasishta 
QE, Ceph

Comment 19 Sébastien Han 2018-08-10 12:30:11 UTC
lgtm.

Comment 20 Vasishta 2018-08-13 08:25:27 UTC
providing qa_ack

Comment 21 Vasishta 2018-08-13 13:59:41 UTC
Followed the steps planned as mentioned in Comment 18; working fine.

Verified in -
ceph-ansible-3.1.0-0.1.rc17.el7cp.noarch
ceph-3.1-rhel-7-containers-candidate-38485-20180810211451 (12.2.5-38.el7cp)

Will move the BZ to VERIFIED once it moves to ON_QA.

Thanks.

Comment 24 errata-xmlrpc 2018-09-26 19:16:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2820

