Bug 1923719 - [CephAdm] 5.0 - Unable to deploy OSDS with RHCEPH-5.0-RHEL-8-20210129.ci.8
Summary: [CephAdm] 5.0 - Unable to deploy OSDS with RHCEPH-5.0-RHEL-8-20210129.ci.8
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Cephadm
Version: 5.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 5.0
Assignee: Juan Miguel Olmo
QA Contact: Vasishta
Docs Contact: Karen Norteman
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-02-01 17:31 UTC by Pragadeeswaran Sathyanarayanan
Modified: 2021-08-30 08:28 UTC
CC List: 9 users

Fixed In Version: ceph-16.1.0-486.el8cp
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-08-30 08:28:17 UTC
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Ceph Project Bug Tracker 49239 0 None None None 2021-02-10 15:24:06 UTC
Red Hat Bugzilla 1921807 0 unspecified CLOSED RHCS 5 container image uses weak dependencies 2021-08-30 09:59:52 UTC
Red Hat Issue Tracker RHCEPH-1232 0 None None None 2021-08-30 00:18:09 UTC
Red Hat Product Errata RHBA-2021:3294 0 None None None 2021-08-30 08:28:39 UTC

Description Pragadeeswaran Sathyanarayanan 2021-02-01 17:31:55 UTC
Description of problem:
Unable to deploy OSDs using the command below:

$ cephadm shell -- ceph orch apply osd --all-available-devices

The following error is thrown:
=========================
RuntimeError: Failed command: /bin/podman run --rm --ipc=host --authfile=/etc/ceph/podman-auth.json --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk -e CONTAINER_IMAGE=registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.0-rhel-8-containers-candidate-26800-20210129231628 -e NODE_NAME=ceph-prag00-1612189570966-node5-osdrgw -e CEPH_VOLUME_OSDSPEC_AFFINITY=all-available-devices -v /var/log/ceph/a405fd0c-649b-11eb-b73c-fa163e24d39d:/var/log/ceph:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /tmp/ceph-tmphqbvmlhb:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmp26myoi65:/var/lib/ceph/bootstrap-osd/ceph.keyring:z registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.0-rhel-8-containers-candidate-26800-20210129231628 lvm batch --no-auto /dev/vdb /dev/vdc /dev/vdd /dev/vde --yes --no-systemd
debug 2021-02-01T17:16:45.086+0000 7f14eac96700 -1 log_channel(cephadm) log [ERR] : executing create_from_spec_one(([('ceph-prag00-1612189570966-node1-installermgrmon', <ceph.deployment.drive_selection.selector.DriveSelection object at 0x7f14ec469f28>), ('ceph-prag00-1612189570966-node3-clientmonosd', <ceph.deployment.drive_selection.selector.DriveSelection object at 0x7f14eb52f160>), ('ceph-prag00-1612189570966-node4-mdsosd', <ceph.deployment.drive_selection.selector.DriveSelection object at 0x7f14eb52f240>), ('ceph-prag00-1612189570966-node5-osdrgw', <ceph.deployment.drive_selection.selector.DriveSelection object at 0x7f14eb52f6d8>), ('ceph-prag00-1612189570966-node2-mgrmonosd', <ceph.deployment.drive_selection.selector.DriveSelection object at 0x7f14eb52fac8>)],)) failed.
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/utils.py", line 70, in do_work
    return f(*arg)
  File "/usr/share/ceph/mgr/cephadm/services/osd.py", line 48, in create_from_spec_one
    host, cmd, replace_osd_ids=osd_id_claims.get(host, []), env_vars=env_vars
  File "/usr/share/ceph/mgr/cephadm/services/osd.py", line 68, in create_single_host
    code, '\n'.join(err)))
RuntimeError: cephadm exited with an error code: 1, stderr:/bin/podman: stderr --> passed data devices: 4 physical, 0 LVM
/bin/podman: stderr --> relative data size: 1.0
/bin/podman: stderr Running command: /usr/bin/ceph-authtool --gen-print-key
/bin/podman: stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 0b4ee397-5be6-4bc2-a174-fcb3c1ef59f3
/bin/podman: stderr Running command: /usr/sbin/vgcreate --force --yes ceph-fcd8ebc9-a413-40fe-bcff-23dee2b41a14 /dev/vdb
/bin/podman: stderr  stderr: selabel_open failed: No such file or directory
/bin/podman: stderr   selabel_open failed: No such file or directory
/bin/podman: stderr  stderr: selabel_open failed: No such file or directory
/bin/podman: stderr  stdout: Physical volume "/dev/vdb" successfully created.
/bin/podman: stderr  stdout: Volume group "ceph-fcd8ebc9-a413-40fe-bcff-23dee2b41a14" successfully created
/bin/podman: stderr Running command: /usr/sbin/lvcreate --yes -l 5119 -n osd-block-0b4ee397-5be6-4bc2-a174-fcb3c1ef59f3 ceph-fcd8ebc9-a413-40fe-bcff-23dee2b41a14
/bin/podman: stderr  stderr: selabel_open failed: No such file or directory
/bin/podman: stderr  stderr: selabel_open failed: No such file or directory
/bin/podman: stderr  stderr: selabel_open failed: No such file or directory
/bin/podman: stderr  stderr: selabel_open failed: No such file or directory
/bin/podman: stderr   selabel_open failed: No such file or directory
/bin/podman: stderr  stdout: Logical volume "osd-block-0b4ee397-5be6-4bc2-a174-fcb3c1ef59f3" created.
/bin/podman: stderr Running command: /usr/bin/ceph-authtool --gen-print-key
/bin/podman: stderr Running command: /usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-0
/bin/podman: stderr Running command: /usr/sbin/restorecon /var/lib/ceph/osd/ceph-0
/bin/podman: stderr  stderr: No such file or directory
/bin/podman: stderr --> Was unable to complete a new OSD, will rollback changes
/bin/podman: stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.0 --yes-i-really-mean-it
/bin/podman: stderr  stderr: purged osd.0

Version-Release number of selected component (if applicable):
Container: ceph-5.0-rhel-8-containers-candidate-26800-20210129231628
Compose Id: RHCEPH-5.0-RHEL-8-20210129.ci.8

Ceph Version: ceph version 16.1.0-100.el8cp (fd37c928e824870f3b214b12828a3d8f9d1ebbc1) pacific (rc)

How reproducible:


Steps to Reproduce:
1. Deploy custom image
2. Install cephadm
3. Bootstrap
4. `ceph orch apply osd --all-available-devices`

Actual results:
The apply command indicates the OSDs are scheduled, but OSD creation fails with the error above.

Expected results:
Ceph cluster status should be healthy


Additional info:
http://pastebin.test.redhat.com/936131

Installer node: 10.0.102.52 cephuser/cephuser

Second Run
http://magna002.ceph.redhat.com/ceph-qe-logs/psathyan/rhcs/feb1/
Manager log: http://magna002.ceph.redhat.com/ceph-qe-logs/psathyan/rhcs/feb1/mgr.log

Comment 1 Preethi 2021-02-02 05:02:49 UTC
@Juan, we are seeing ceph orch osd add failures in Teuthology runs as well.

http://magna002.ceph.redhat.com/pnataraj-2021-02-01_14:05:06-smoke:cephadm-master-distro-basic-clara/395962/tasks/rhcephadm-1.log

Output snippet:
2021-02-01T14:22:05.076 INFO:teuthology.orchestra.run.clara001.stderr:--> Zapping successful for: <Raw Device: /dev/sdd>
2021-02-01T14:22:05.225 DEBUG:teuthology.orchestra.run.clara001:> sudo /home/ubuntu/cephtest/cephadm --image registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.0-rhel-8-containers-candidate-26800-20210129231628 shell -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring --fsid f7fad68c-64c1-11eb-95d0-002590fc2776 -- ceph orch daemon add osd clara001:/dev/sdd
2021-02-01T14:22:06.231 INFO:journalctl.z.clara003.stdout:Feb 01 19:16:39 clara003 systemd[1]: Starting Ceph mgr.z for f7fad68c-64c1-11eb-95d0-002590fc2776...
2021-02-01T14:22:07.481 INFO:journalctl.z.clara003.stdout:Feb 01 19:16:40 clara003 bash[13769]: d99e4e2c26ea48e84299b761d86be9f701296c32bc1d11bd276ea2a06c6305e1
2021-02-01T14:22:07.481 INFO:journalctl.z.clara003.stdout:Feb 01 19:16:40 clara003 systemd[1]: Started Ceph mgr.z for f7fad68c-64c1-11eb-95d0-002590fc2776.
2021-02-01T14:22:09.233 INFO:journalctl.y.clara014.stdout:Feb 01 19:16:42 clara014 systemd[1]: Starting Ceph mgr.y for f7fad68c-64c1-11eb-95d0-002590fc2776...
2021-02-01T14:22:10.233 INFO:journalctl.y.clara014.stdout:Feb 01 19:16:43 clara014 bash[13324]: 4b80975f94699ae0abcba5bb1d2b62e3dda0f716212dfe171eda50ef8cb51553
2021-02-01T14:22:10.234 INFO:journalctl.y.clara014.stdout:Feb 01 19:16:43 clara014 systemd[1]: Started Ceph mgr.y for f7fad68c-64c1-11eb-95d0-002590fc2776.
2021-02-01T14:22:10.959 INFO:teuthology.orchestra.run.clara001.stderr:Error EINVAL: Traceback (most recent call last):
2021-02-01T14:22:10.960 INFO:teuthology.orchestra.run.clara001.stderr:  File "/usr/share/ceph/mgr/mgr_module.py", line 1269, in _handle_command
2021-02-01T14:22:10.960 INFO:teuthology.orchestra.run.clara001.stderr:    return self.handle_command(inbuf, cmd)
2021-02-01T14:22:10.961 INFO:teuthology.orchestra.run.clara001.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 150, in handle_command
2021-02-01T14:22:10.961 INFO:teuthology.orchestra.run.clara001.stderr:    return dispatch[cmd['prefix']].call(self, cmd, inbuf)
2021-02-01T14:22:10.962 INFO:teuthology.orchestra.run.clara001.stderr:  File "/usr/share/ceph/mgr/mgr_module.py", line 380, in call
2021-02-01T14:22:10.963 INFO:teuthology.orchestra.run.clara001.stderr:    return self.func(mgr, **kwargs)
2021-02-01T14:22:10.963 INFO:teuthology.orchestra.run.clara001.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 108, in <lambda>
2021-02-01T14:22:10.964 INFO:teuthology.orchestra.run.clara001.stderr:    wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)
2021-02-01T14:22:10.964 INFO:teuthology.orchestra.run.clara001.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 97, in wrapper
2021-02-01T14:22:10.965 INFO:teuthology.orchestra.run.clara001.stderr:    return func(*args, **kwargs)
2021-02-01T14:22:10.965 INFO:teuthology.orchestra.run.clara001.stderr:  File "/usr/share/ceph/mgr/orchestrator/module.py", line 823, in _daemon_add_osd
2021-02-01T14:22:10.966 INFO:teuthology.orchestra.run.clara001.stderr:    raise_if_exception(completion)
2021-02-01T14:22:10.967 INFO:teuthology.orchestra.run.clara001.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 652, in raise_if_exception
2021-02-01T14:22:10.967 INFO:teuthology.orchestra.run.clara001.stderr:    raise e
2021-02-01T14:22:10.968 INFO:teuthology.orchestra.run.clara001.stderr:RuntimeError: cephadm exited with an error code: 1, stderr:/bin/podman: stderr --> passed data devices: 1 physical, 0 LVM
2021-02-01T14:22:10.968 INFO:teuthology.orchestra.run.clara001.stderr:/bin/podman: stderr --> relative data size: 1.0
2021-02-01T14:22:10.969 INFO:teuthology.orchestra.run.clara001.stderr:/bin/podman: stderr Running command: /usr/bin/ceph-authtool --gen-print-key
2021-02-01T14:22:10.969 INFO:teuthology.orchestra.run.clara001.stderr:/bin/podman: stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 0ea76097-69f6-4e59-8a09-c94192fa6388
2021-02-01T14:22:10.970 INFO:teuthology.orchestra.run.clara001.stderr:/bin/podman: stderr Running command: /usr/sbin/vgcreate --force --yes ceph-e9d1b22e-afec-406a-8689-82e7b6237e17 /dev/sdd
2021-02-01T14:22:10.971 INFO:teuthology.orchestra.run.clara001.stderr:/bin/podman: stderr  stderr: selabel_open failed: No such file or directory
2021-02-01T14:22:10.971 INFO:teuthology.orchestra.run.clara001.stderr:/bin/podman: stderr   selabel_open failed: No such file or directory
2021-02-01T14:22:10.972 INFO:teuthology.orchestra.run.clara001.stderr:/bin/podman: stderr  stderr: selabel_open failed: No such file or directory
2021-02-01T14:22:10.972 INFO:teuthology.orchestra.run.clara001.stderr:/bin/podman: stderr  stdout: Physical volume "/dev/sdd" successfully created.
2021-02-01T14:22:10.973 INFO:teuthology.orchestra.run.clara001.stderr:/bin/podman: stderr  stdout: Volume group "ceph-e9d1b22e-afec-406a-8689-82e7b6237e17" successfully created
2021-02-01T14:22:10.973 INFO:teuthology.orchestra.run.clara001.stderr:/bin/podman: stderr Running command: /usr/sbin/lvcreate --yes -l 57234 -n osd-block-0ea76097-69f6-4e59-8a09-c94192fa6388 ceph-e9d1b22e-afec-406a-8689-82e7b6237e17
2021-02-01T14:22:10.974 INFO:teuthology.orchestra.run.clara001.stderr:/bin/podman: stderr  stderr: selabel_open failed: No such file or directory
2021-02-01T14:22:10.974 INFO:teuthology.orchestra.run.clara001.stderr:/bin/podman: stderr  stderr: selabel_open failed: No such file or directory
2021-02-01T14:22:10.975 INFO:teuthology.orchestra.run.clara001.stderr:/bin/podman: stderr  stderr: Volume group "ceph-e9d1b22e-afec-406a-8689-82e7b6237e17" has insufficient free space (57233 extents): 57234 required.
2021-02-01T14:22:10.976 INFO:teuthology.orchestra.run.clara001.stderr:/bin/podman: stderr --> Was unable to complete a new OSD, will rollback changes
2021-02-01T14:22:10.976 INFO:teuthology.orchestra.run.clara001.stderr:/bin/podman: stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.0 --yes-i-really-mean-it
2021-02-01T14:22:10.977 INFO:teuthology.orchestra.run.clara001.stderr:/bin/podman: stderr  stderr: purged osd.0
2021-02-01T14:22:10.977 INFO:teuthology.orchestra.run.clara001.stderr:/bin/podman: stderr -->  RuntimeError: command returned non-zero exit status: 5
2021-02-01T14:22:10.978 INFO:teuthology.orchestra.run.clara001.stderr:Traceback (most recent call last):
2021-02-01T14:22:10.978 INFO:teuthology.orchestra.run.clara001.stderr:  File "<stdin>", line 7582, in <module>
2021-02-01T14:22:10.979 INFO:teuthology.orchestra.run.clara001.stderr:  File "<stdin>", line 7571, in main
2021-02-01T14:22:10.980 INFO:teuthology.orchestra.run.clara001.stderr:  File "<stdin>", line 1566, in _infer_fsid
2021-02-01T14:22:10.980 INFO:teuthology.orchestra.run.clara001.stderr:  File "<stdin>", line 1650, in _infer_image
2021-02-01T14:22:10.981 INFO:teuthology.orchestra.run.clara001.stderr:  File "<stdin>", line 4180, in command_ceph_volume
2021-02-01T14:22:10.981 INFO:teuthology.orchestra.run.clara001.stderr:  File "<stdin>", line 1329, in call_throws
2021-02-01T14:22:10.982 INFO:teuthology.orchestra.run.clara001.stderr:RuntimeError: Failed command: /bin/podman run --rm --ipc=host --authfile=/etc/ceph/podman-auth.json --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk -e CONTAINER_IMAGE=registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.0-rhel-8-containers-candidate-26800-20210129231628 -e NODE_NAME=clara001 -e CEPH_VOLUME_OSDSPEC_AFFINITY=None -v /var/run/ceph/f7fad68c-64c1-11eb-95d0-002590fc2776:/var/run/ceph:z -v /var/log/ceph/f7fad68c-64c1-11eb-95d0-002590fc2776:/var/log/ceph:z -v /var/lib/ceph/f7fad68c-64c1-11eb-95d0-002590fc2776/crash:/var/lib/ceph/crash:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /tmp/ceph-tmpdcyn5du_:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmpq_98hn68:/var/lib/ceph/bootstrap-osd/ceph.keyring:z registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.0-rhel-8-containers-candidate-26800-20210129231628 lvm batch --no-auto /dev/sdd --yes --no-systemd
2021-02-01T14:22:10.982 INFO:teuthology.orchestra.run.clara001.stderr:
2021-02-01T14:22:11.060 DEBUG:teuthology.orchestra.run:got remote process result: 22

Comment 6 Juan Miguel Olmo 2021-02-03 09:38:43 UTC
I think the problem may be that you have partitions defined on your disks:

Take a look at:
https://kerneltalks.com/troubleshooting/pvcreate-error-device-not-found-or-ignored-by-filtering

Comment 8 Vasishta 2021-02-04 07:38:46 UTC
Hi Juan,

I'm facing the same issue when I tried manually with ceph version 16.1.0-100.el8cp.
I'm quite sure there were no partitions on the disks when the OSD deployment was tried, but SELinux was in permissive mode.

Regards,
Vasishta Shastry
QE, Ceph

Comment 9 Juan Miguel Olmo 2021-02-05 17:47:19 UTC
I have been working on this the whole week. The exact point where the problem happens seems clear, but I have no idea about the root cause yet. Summarizing the facts collected:

Environment:
Red Hat Enterprise Linux release 8.3 (Ootpa)
ceph version 16.1.0-100.el8cp (fd37c928e824870f3b214b12828a3d8f9d1ebbc1) pacific (rc) deployed using cephadm with image:
registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.0-rhel-8-containers-candidate-93344-20210201212920

1. The error happens in a ceph-volume container launched with this command:

/bin/podman run --rm --ipc=host --authfile=/etc/ceph/podman-auth.json --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk -e CONTAINER_IMAGE=registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.0-rhel-8-containers-candidate-26800-20210129231628 -e NODE_NAME=ceph-prag00-1612189570966-node5-osdrgw -e CEPH_VOLUME_OSDSPEC_AFFINITY=all-available-devices -v /var/log/ceph/a405fd0c-649b-11eb-b73c-fa163e24d39d:/var/log/ceph:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /tmp/ceph-tmphqbvmlhb:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmp26myoi65:/var/lib/ceph/bootstrap-osd/ceph.keyring:z registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.0-rhel-8-containers-candidate-26800-20210129231628 lvm batch --no-auto /dev/vdb /dev/vdc /dev/vdd /dev/vde --yes --no-systemd

2. The error happens when ceph-volume is executing the lvm prepare stuff:

/usr/bin/ceph-authtool --gen-print-key
/usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-0
/usr/sbin/restorecon /var/lib/ceph/osd/ceph-0   <---------- offending command: note that the folder has been used in the previous command to mount a tmpfs without problems
No such file or directory

3. No error traces or selinux violation errors found in /var/log/messages, or using selinux ausearch/aureport

4. The problem only happens with the latest RHCS 5 alpha. The only difference from the previous alpha release that could affect this is the way we launch the command.
In the previous alpha we used two additional mapped folders:
-v /var/run/ceph/076b13be-6794-11eb-aa9c-fa163ef8fbaa:/var/run/ceph:z
-v /var/lib/ceph/076b13be-6794-11eb-aa9c-fa163ef8fbaa/crash:/var/lib/ceph/crash:z
After adding these two folders to the ceph-volume container launched with the latest alpha, the problem is still present.

5. Upstream Pacific does not show this problem. In Teuthology we have tests running with SELinux in permissive mode.

6. The problem disappears only if SELinux is disabled completely (in "permissive" mode it still fails).

7. In a "ceph shell" container (SELinux enforcing mode by default), after creating a folder, running the restorecon command on it fails in exactly the same way:

[ceph: root@ceph-prag01-1612516164430-node1-mon-mgr-installer-osd /]# mkdir -p /var/lib/ceph/osd/osdtest
[ceph: root@ceph-prag01-1612516164430-node1-mon-mgr-installer-osd /]# /usr/sbin/restorecon /var/lib/ceph/osd/osdtest
No such file or directory
[ceph: root@ceph-prag01-1612516164430-node1-mon-mgr-installer-osd /]# ls -lZ /var/lib/ceph/osd
total 0
drwxr-xr-x. 2 root root system_u:object_r:container_file_t:s0:c72,c418 6 Feb  5 17:26 osdtest

Comment 10 Juan Miguel Olmo 2021-02-08 11:43:52 UTC
As stated in the previous comment, the problem is that inside containers running the latest alpha release with SELinux in enforcing/permissive mode, the SELinux <restorecon> command does not work.

I have verified the differences in the "podman run" command used to launch the container between the previous alpha (where it works) and the latest one (where it does not), and the only difference is a couple of mapped folders. After adding these two missing folders from the previous alpha, the behavior is the same: it still does not work.

Do we have any change in the Dockerfile used to create the latest alpha image?


Example:
Running a "ceph shell" container using "registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.0-rhel-8-containers-candidate-93344-20210201212920"

In enforce mode

# sestatus
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   enforcing
Mode from config file:          enforcing
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Memory protection checking:     actual (secure)
Max kernel policy version:      32
# mkdir /tmp/testenforcing
# ls -lZ /tmp
total 4
-rwx------. 1 root root system_u:object_r:container_file_t:s0:c482,c590 701 Dec 10 01:53 ks-script-_5wnwxun
drwxr-xr-x. 2 root root system_u:object_r:container_file_t:s0:c482,c590   6 Feb  8 11:29 testenforcing
# /usr/sbin/restorecon /tmp/testenforcing
No such file or directory
# /usr/sbin/restorecon /tmp/testenforcing/
No such file or directory
# restorecon /tmp/testenforcing 
No such file or directory


In permissive mode:

# sestatus
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   permissive
Mode from config file:          enforcing
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Memory protection checking:     actual (secure)
Max kernel policy version:      32
# mkdir testpermmissive
# ls -lZ /tmp 
total 4
-rwx------. 1 root root system_u:object_r:container_file_t:s0:c482,c590 701 Dec 10 01:53 ks-script-_5wnwxun
drwxr-xr-x. 2 root root system_u:object_r:container_file_t:s0:c482,c590   6 Feb  8 11:26 testpermmissive
# restorecon /tmp/testpermmissive/
No such file or directory
# restorecon /tmp/testpermmissive 
No such file or directory
# restorecon
usage:  restorecon [-iIDFmnprRv0] [-e excludedir] pathname...
usage:  restorecon [-iIDFmnprRv0] [-e excludedir] -f filename

Comment 11 Boris Ranto 2021-02-08 13:37:13 UTC
The selinux-policy-targeted package is not installed inside the container, so restorecon (and dnf itself as well) is failing. I believe this happens because we introduced a new option to dnf/yum in RHCS 5.0:

$ yum install -y --setopt=install_weak_deps=False

Comment 12 Ken Dreyer (Red Hat) 2021-02-08 21:49:43 UTC
registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.0-rhel-8-containers-candidate-17877-20210129182254 has selinux-policy-targeted. I can bring up a cluster with cephadm with this image.

registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.0-rhel-8-containers-candidate-93344-20210201212920 has selinux-policy-minimum. When I run cephadm with this image, it fails with "No such file or directory" errors.

The particular selinux-policy-* change is a result of the Dockerfile change in bug 1921807 to set install_weak_deps=False like Boris mentioned.

When I run restorecon through strace, I see that it's hitting ENOENT when opening some paths under "/etc/selinux/targeted/". The reason restorecon looks there is because the container image has SELINUXTYPE=targeted in /etc/selinux/config. Applications like "dnf" or "restorecon" fail with that setting. When I run `sed -i 's/SELINUXTYPE=targeted/SELINUXTYPE=minimum/' /etc/selinux/config`, the applications work inside the container.
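The mismatch Ken describes can be checked with a small helper along these lines (an illustrative sketch, not tooling from the bug; the default path is the standard SELinux location):

```python
import os
import re

def configured_policy_type(config_text):
    """Return the SELINUXTYPE value from an /etc/selinux/config body."""
    m = re.search(r"^SELINUXTYPE=(\w+)", config_text, re.MULTILINE)
    return m.group(1) if m else None

def policy_store_present(config_path, selinux_dir="/etc/selinux"):
    """True when the configured policy type has an installed policy store.

    When this returns False, tools such as restorecon and dnf hit ENOENT
    ("No such file or directory") on paths under the configured store,
    which is the failure mode seen in this bug.
    """
    with open(config_path) as f:
        policy = configured_policy_type(f.read())
    return policy is not None and os.path.isdir(os.path.join(selinux_dir, policy))
```

With the failing image this check would report a config that names "targeted" while only the "minimum" store directory exists.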

Possible ways forward:

A) Always install the selinux-policy-targeted package inside the container image.

B) Set SELINUXTYPE=minimum in /etc/selinux/config.

Option A brings us back to a more well-tested configuration. Option B keeps the size of the image smaller, which was the original goal in bz 1921807. Dimitri, what is your preference?

Comment 13 Ken Dreyer (Red Hat) 2021-02-08 22:32:43 UTC
One thing that puzzles me is that RH Ceph Storage 4 has had selinux-policy-minimum (and SELINUXTYPE=targeted) forever. It's not clear to me why we don't see this error there.

Comment 14 Ken Dreyer (Red Hat) 2021-02-09 00:04:10 UTC
Maybe the reason we never hit this on RHCS 4 is that ceph-ansible does not bind-mount /sys into the container. I cannot trigger this bug without mounting the host's /sys inside the container. For example:

podman run -it --rm --privileged -v /tmp:/tmp -v /sys:/sys --entrypoint /bin/bash registry.redhat.io/rhceph/rhceph-4-rhel8
touch /tmp/foo.txt
restorecon -v /tmp/foo.txt 
No such file or directory

Comment 15 Ken Dreyer (Red Hat) 2021-02-09 00:27:52 UTC
Here is the weak dependency chain in the container image for RHCS 5:

ceph-mgr-cephadm
 └─ cephadm
     └─ podman
         └─ container-selinux
             └─ selinux-policy-targeted

When we set install_weak_deps=False, we no longer install podman in the container, which stops pulling in container-selinux and selinux-policy-targeted, so we go back to having selinux-policy-minimum like in RHCS 4.
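The effect of install_weak_deps=False on this chain can be illustrated with a toy resolver (the Requires/Recommends data below simply encodes the chain above for illustration; it is not read from real RPM metadata):

```python
# Toy dependency resolver: hard deps (Requires) are always followed,
# weak deps (Recommends) only when install_weak_deps is True.
REQUIRES = {"ceph-mgr-cephadm": ["cephadm"]}
RECOMMENDS = {
    "cephadm": ["podman"],
    "podman": ["container-selinux"],
    "container-selinux": ["selinux-policy-targeted"],
}

def resolve(pkg, install_weak_deps=True, seen=None):
    """Return the transitive set of packages installed for `pkg`."""
    seen = seen if seen is not None else set()
    if pkg in seen:
        return seen
    seen.add(pkg)
    deps = list(REQUIRES.get(pkg, []))
    if install_weak_deps:
        deps += RECOMMENDS.get(pkg, [])
    for dep in deps:
        resolve(dep, install_weak_deps, seen)
    return seen
```

With weak deps enabled the targeted policy is pulled in transitively; with install_weak_deps=False the chain is cut at the first Recommends edge.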

Comment 16 Ken Dreyer (Red Hat) 2021-02-09 00:36:06 UTC
(In reply to myself from comment #12)

Other possible options:

C) Stop mounting the host's /sys into the container (like RHCS 4)

D) Stop calling restorecon in the container (see https://github.com/ceph/ceph/pull/31421 for discussion about this feature)

Comment 17 Juan Miguel Olmo 2021-02-09 10:53:22 UTC
(In reply to Ken Dreyer (Red Hat) from comment #16)
> (In reply to myself from comment #12)
> 
> Other possible options:
> 
> C) Stop mounting the host's /sys into the container (like RHCS 4)
> 
> D) Stop calling restorecon in the container (see
> https://github.com/ceph/ceph/pull/31421 for discussion about this
> feature)

restorecon is not called if selinux is disabled or not installed in the container:
https://github.com/ceph/ceph/blob/d5290aad466bf5707fcc3753e850e39362cb2668/src/ceph-volume/ceph_volume/util/system.py#L308
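The guard Juan points to can be paraphrased roughly like this (a simplified sketch with injectable paths and runner for illustration, not the actual ceph-volume code):

```python
import os
import subprocess

def set_context(path, restorecon="/usr/sbin/restorecon", run=None):
    """Relabel `path` with restorecon, but only when SELinux is usable.

    Sketch of the guard: if the SELinux userspace is absent or SELinux is
    disabled, the call is skipped entirely. This bug hits the remaining
    case: SELinux is enabled and the binary exists, but the configured
    policy store is missing, so restorecon itself fails.
    """
    run = run or subprocess.call
    if not os.path.exists(restorecon):
        return False          # no SELinux tools in the image: skip
    if run(["selinuxenabled"]) != 0:
        return False          # SELinux disabled on the host: skip
    run([restorecon, path])   # may still fail if the policy store is gone
    return True
```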

Comment 18 Juan Miguel Olmo 2021-02-09 11:11:07 UTC
My view is that we need to decide whether we want SELinux operational in the downstream Ceph image or not. I am not a security expert, but since we do not know what we will want to do in the future, having SELinux fully operational does not seem a bad idea.

With this in mind, I briefly comment on the options provided by Ken:

A) Always install the selinux-policy-targeted package inside the container image.
In my view this is the most secure option. It produces a bigger but safer image (100 MB+ does not seem like a great problem in the 21st century), SELinux is fully operational, and we completely avoid the possibility of other "weird", hard-to-debug errors in the future.

B) Set SELINUXTYPE=minimum in /etc/selinux/config.
This does not fix the problem: we still have an SELinux installation with potential problems.

C) Stop mounting the host's /sys into the container (like RHCS 4)
This does not fix the problem: we still have an SELinux installation with potential problems.

D) Stop calling restorecon in the container
We already do not call it if SELinux is disabled; the problem is that SELinux is enabled but not fully operational.


So, I vote for A) and reverting the "--setopt=install_weak_deps=False" introduced in the Dockerfile.

NOTE: This problem is blocking testing features in downstream, so I would kindly ask for a quick decision about the solution.

Comment 19 Juan Miguel Olmo 2021-02-09 11:14:20 UTC
Adding Ana to shed more light on the best possible solution from the security point of view.

Comment 21 Boris Ranto 2021-02-10 11:43:53 UTC
I believe the preferred (correct) solution would be to stop mounting /sys in cephadm (or mount only a subset of it). Mounting it is why the system inside the container gets confused and thinks it is supposed to adhere to the host's (targeted) SELinux policy instead of the one available inside the container (minimum).

If we simply disable the restorecon calls, there are still issues with other commands, like dnf/yum itself, which is unable to install anything because the targeted-policy SELinux files are missing.

Installing the targeted policy package is more of a workaround, as the system inside the container is confused into thinking it is a host system (with the targeted policy) because we mount the /sys dir.

As far as config file modification goes, we would have to modify /etc/selinux/config on the host system to change the type to minimum, which is really a no-go (I hope we don't mount the entire /etc inside the container; if we do, we should stop doing that and only mount the appropriate config files from /etc).

Comment 22 Ken Dreyer (Red Hat) 2021-02-10 15:10:08 UTC
Sage added the /sys mount in https://github.com/ceph/ceph/commit/3ccab99d15e6498b949eca8f133fb3b947c7b629

Some other projects work around this by mounting -v /usr/share/empty:/sys/fs/selinux. It does seem to fix the problem for me.

podman run -it --rm -v /usr/share/empty:/sys/fs/selinux --privileged --entrypoint /bin/bash registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.0-rhel-8-containers-candidate-93344-20210201212920
[root@925e9ef72e23 /]# touch /foo.txt
[root@925e9ef72e23 /]# restorecon -v /foo.txt

Looks like this would be a one-line change to cephadm's mount list for the OSD.
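A hypothetical sketch of what such a one-line change could look like: cephadm builds a map of host-path to container-path bind mounts for the ceph-volume container, rendered as podman -v arguments. The helper names and the mask_selinux flag below are illustrative, not the actual patch:

```python
def ceph_volume_mounts(mask_selinux=True):
    """Illustrative bind-mount map for the OSD ceph-volume container."""
    mounts = {
        "/dev": "/dev",
        "/run/udev": "/run/udev",
        "/sys": "/sys",
        "/run/lvm": "/run/lvm",
        "/run/lock/lvm": "/run/lock/lvm",
    }
    if mask_selinux:
        # The workaround from comment 22: bind an empty host dir over
        # /sys/fs/selinux so the container no longer sees the host's
        # SELinux state and restorecon becomes a no-op.
        mounts["/usr/share/empty"] = "/sys/fs/selinux"
    return mounts

def to_podman_args(mounts):
    """Render the mount map as podman -v arguments."""
    return [f"-v {host}:{ctr}" for host, ctr in mounts.items()]
```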

Comment 23 Boris Ranto 2021-02-10 16:45:34 UTC
If we need to mount /sys inside the container then yes, this looks like the correct approach to me.

btw: restorecon seems to be a no-op with /sys/fs/selinux bind-mounted to an empty dir. dnf also works in such a system, i.e. -v /usr/share/empty:/sys/fs/selinux looks like the way to go here.

Comment 24 Juan Miguel Olmo 2021-02-11 09:14:02 UTC
The adopted fix adds the proposed mount to the OSD containers, but besides that it seems we need a modification in the container image:

see https://github.com/ceph/ceph/pull/39398#issuecomment-777152910

Where is this change set?

Comment 25 Ken Dreyer (Red Hat) 2021-02-11 17:27:41 UTC
I ran scratch builds yesterday with this change, and I was able to deploy a cluster with the cephadm suite of https://github.com/red-hat-storage/cephci .

Thanks for the follow-on fix in https://github.com/ceph/ceph/pull/39424

When we've backported the fixes to pacific, we'll take them downstream for the QE team.

Comment 35 errata-xmlrpc 2021-08-30 08:28:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 5.0 bug fix and enhancement), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3294

