Bug 1820798

Summary: systemd-makefs does not have the correct selinux context type
Product: Red Hat Enterprise Linux 8
Component: selinux-policy
Version: 8.1
Hardware: Unspecified
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: urgent
Reporter: Derrick Ornelas <dornelas>
Assignee: Zdenek Pytela <zpytela>
QA Contact: Milos Malik <mmalik>
CC: dcain, dornelas, fkrska, kholtz, kwalker, lvrabec, mmalik, mnguyen, msekleta, plautrba, ssekidde, systemd-maint-list, umesh_sunnapu
Flags: dornelas: mirror+
Keywords: Triaged, ZStream
Target Milestone: rc
Target Release: 8.3
Doc Type: Bug Fix
Type: Bug
Cloned to: 1859162 (view as bug list)
Bug Blocks: 1186913, 1846368, 1859162
Last Closed: 2020-11-04 01:56:35 UTC
Attachments: Multi disk steps

Description Derrick Ornelas 2020-04-03 22:25:12 UTC
Description of problem:

systemd-makefs (and systemd-growfs) do not appear to ship with the correct SELinux context type. The current type, init_exec_t, does not allow read/write access to NVMe devices, for example. This causes the x-systemd.makefs option in fstab to fail when used with NVMe devices.


Version-Release number of selected component (if applicable):

selinux-policy-3.14.3-20.el8
systemd-239-18.el8_1.2.x86_64
systemd-udev-239-18.el8_1.2.x86_64


How reproducible: 100%


Steps to Reproduce:
1. Create test mount point

  # mkdir /mnt/test


2. Edit /etc/fstab and add an entry for the new NVMe disk, such as:

  /dev/nvme1n1	/mnt/test	xfs	x-systemd.makefs		0 0


3.  Reload systemd

  # systemctl daemon-reload


4.  Start the new mount unit

  # systemctl start mnt-test.mount


Actual results:

# systemctl start mnt-test.mount 
A dependency job for mnt-test.mount failed. See 'journalctl -xe' for details.


Apr 03 17:17:15 test systemd-makefs[8246]: Failed to probe "/dev/nvme1n1": Permission denied
Apr 03 17:17:15 test systemd[1]: systemd-mkfs: Failed with result 'exit-code'.


Expected results:

The /dev/nvme1n1 disk is mounted to /mnt/test with a new XFS filesystem


Additional info:

This was originally encountered in AWS. My test system doesn't have an NVMe drive, but this is easily reproduced with the nvme_device_t context type applied:


# semanage fcontext -l | egrep '/dev/\[s.*d|/dev/nvme'
/dev/[shmxv]d[^/]*                                 block device       system_u:object_r:fixed_disk_device_t:s0 
/dev/nvme.*                                        block device       system_u:object_r:nvme_device_t:s0 
/dev/nvme.*                                        character device   system_u:object_r:nvme_device_t:s0 


# chcon -t  nvme_device_t /dev/sdb

# mkdir /mnt/test

# systemctl daemon-reload 

# systemctl start mnt-test.mount 
A dependency job for mnt-test.mount failed. See 'journalctl -xe' for details.


# journalctl --since -3m
-- Logs begin at Tue 2020-03-03 15:09:48 EST, end at Fri 2020-04-03 17:28:21 EDT. --
Apr 03 17:28:20 test.example.com systemd[1]: Created slice system-systemd\x2dmkfs.slice.
Apr 03 17:28:20 test.example.com systemd[1]: Starting Make File System on /dev/sdb...
Apr 03 17:28:20 test.example.com systemd-makefs[31518]: Failed to probe "/dev/sdb": Permission denied
Apr 03 17:28:20 test.example.com systemd[1]: systemd-mkfs: Main process exited, code=exited, status=1/FAILURE
Apr 03 17:28:20 test.example.com systemd[1]: systemd-mkfs: Failed with result 'exit-code'.
Apr 03 17:28:20 test.example.com systemd[1]: Failed to start Make File System on /dev/sdb.
Apr 03 17:28:20 test.example.com systemd[1]: Dependency failed for /mnt/test.
Apr 03 17:28:20 test.example.com systemd[1]: mnt-test.mount: Job mnt-test.mount/start failed with result 'dependency'.
Apr 03 17:28:20 test.example.com dbus-daemon[796]: [system] Activating service name='org.fedoraproject.Setroubleshootd' requested by ':1.52' (uid=0 pid=768 comm="/usr/sbin/sedispatch " label="system_u:system_r:auditd_t:s0") (using servicehelper)
Apr 03 17:28:21 test.example.com dbus-daemon[796]: [system] Successfully activated service 'org.fedoraproject.Setroubleshootd'
Apr 03 17:28:21 test.example.com setroubleshoot[31522]: SELinux is preventing /usr/lib/systemd/systemd-makefs from read access on the blk_file sdb. For complete SELinux messages run: sealert -l 4dedd5ea-265a-4016-9327-7c9e4cfcf539
Apr 03 17:28:21 test.example.com platform-python[31522]: SELinux is preventing /usr/lib/systemd/systemd-makefs from read access on the blk_file sdb.
                                                                           
*****  Plugin catchall (100. confidence) suggests   **************************

If you believe that systemd-makefs should be allowed read access on the sdb blk_file by default.
Then you should report this as a bug.
You can generate a local policy module to allow this access.
Do allow this access for now by executing:
# ausearch -c 'systemd-makefs' --raw | audit2allow -M my-systemdmakefs
# semodule -X 300 -i my-systemdmakefs.pp



# ls -Z /usr/lib/systemd/systemd-makefs 
system_u:object_r:init_exec_t:s0 /usr/lib/systemd/systemd-makefs


# sesearch --allow -s init_t -t nvme_device_t
allow init_t device_node:blk_file { getattr relabelfrom relabelto };
allow init_t device_node:chr_file { create getattr relabelfrom relabelto };
allow init_t device_node:dir { getattr relabelfrom relabelto };
allow init_t device_node:fifo_file { getattr relabelfrom relabelto };
allow init_t device_node:file { getattr relabelfrom relabelto };
allow init_t device_node:lnk_file { getattr relabelfrom relabelto };
allow init_t device_node:sock_file { getattr relabelfrom relabelto };



I went back and forth on whether to open this against selinux-policy or systemd, but I think it's expected that tools which administer filesystems have an fsadm context type:

# ls -lZ /usr/lib/systemd/{systemd-fsck,systemd-makefs,systemd-growfs} /usr/sbin/mkfs*
-rwxr-xr-x. 1 root root system_u:object_r:fsadm_exec_t:s0  31176 Nov 29 06:45 /usr/lib/systemd/systemd-fsck
-rwxr-xr-x. 1 root root system_u:object_r:init_exec_t:s0   26416 Nov 29 06:45 /usr/lib/systemd/systemd-growfs
-rwxr-xr-x. 1 root root system_u:object_r:init_exec_t:s0   18208 Nov 29 06:45 /usr/lib/systemd/systemd-makefs
-rwxr-xr-x. 1 root root system_u:object_r:fsadm_exec_t:s0  24208 Sep 21  2019 /usr/sbin/mkfs
-rwxr-xr-x. 1 root root system_u:object_r:bin_t:s0         58624 Sep 21  2019 /usr/sbin/mkfs.cramfs
-rwxr-xr-x. 4 root root system_u:object_r:fsadm_exec_t:s0 167200 May 29  2019 /usr/sbin/mkfs.ext2
-rwxr-xr-x. 4 root root system_u:object_r:fsadm_exec_t:s0 167200 May 29  2019 /usr/sbin/mkfs.ext3
-rwxr-xr-x. 4 root root system_u:object_r:fsadm_exec_t:s0 167200 May 29  2019 /usr/sbin/mkfs.ext4
-rwxr-xr-x. 1 root root system_u:object_r:fsadm_exec_t:s0  40008 Feb 22  2019 /usr/sbin/mkfs.fat
-rwxr-xr-x. 1 root root system_u:object_r:fsadm_exec_t:s0 121696 Sep 21  2019 /usr/sbin/mkfs.minix
lrwxrwxrwx. 1 root root system_u:object_r:bin_t:s0             8 Feb 22  2019 /usr/sbin/mkfs.msdos -> mkfs.fat
lrwxrwxrwx. 1 root root system_u:object_r:bin_t:s0             8 Feb 22  2019 /usr/sbin/mkfs.vfat -> mkfs.fat
-rwxr-xr-x. 1 root root system_u:object_r:fsadm_exec_t:s0 583032 May 22  2019 /usr/sbin/mkfs.xfs
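

If the labeling really is the issue, a possible local workaround (just a sketch, untested here; it assumes the policy's existing init_t -> fsadm_t domain transition on fsadm_exec_t entrypoints, the same mechanism systemd-fsck relies on, also covers systemd-makefs) would be to relabel the helpers the same way:

# semanage fcontext -a -t fsadm_exec_t '/usr/lib/systemd/systemd-(makefs|growfs)'
# restorecon -v /usr/lib/systemd/systemd-makefs /usr/lib/systemd/systemd-growfs
# systemctl start mnt-test.mount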

Comment 1 Milos Malik 2020-04-06 12:58:05 UTC
Can you collect SELinux denials triggered during "Steps to Reproduce" and attach them here?

# ausearch -m avc -m user_avc -m selinux_err -m user_selinux_err -i -ts today

Thank you.

Comment 2 Derrick Ornelas 2020-04-13 22:44:36 UTC
Here's the denial I just generated on my test system. It does not have an NVMe drive, but this can be replicated using the nvme_device_t context type:

# ls -lZ /dev/sdb 
brw-rw----. 1 root disk system_u:object_r:nvme_device_t:s0 8, 16 Apr  3 17:22 /dev/sdb


# ausearch -m avc -m user_avc -m selinux_err -m user_selinux_err -i -ts today
----
type=PROCTITLE msg=audit(04/13/2020 02:33:01.114:4497) : proctitle=/usr/libexec/platform-python /usr/libexec/rhsmcertd-worker 
type=SYSCALL msg=audit(04/13/2020 02:33:01.114:4497) : arch=x86_64 syscall=openat success=no exit=EACCES(Permission denied) a0=0xffffff9c a1=0x5575999e8fc0 a2=O_RDONLY a3=0x0 items=0 ppid=888 pid=19188 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=rhsmcertd-worke exe=/usr/libexec/platform-python3.6 subj=system_u:system_r:rhsmcertd_t:s0 key=(null) 
type=AVC msg=audit(04/13/2020 02:33:01.114:4497) : avc:  denied  { read } for  pid=19188 comm=rhsmcertd-worke name=container-tools.module dev="dm-0" ino=50331779 scontext=system_u:system_r:rhsmcertd_t:s0 tcontext=system_u:object_r:root_t:s0 tclass=file permissive=0 
----
[...more unrelated rhsmcertd-worker denials...]
----
type=PROCTITLE msg=audit(04/13/2020 14:33:01.721:4700) : proctitle=/usr/libexec/platform-python /usr/libexec/rhsmcertd-worker 
type=SYSCALL msg=audit(04/13/2020 14:33:01.721:4700) : arch=x86_64 syscall=openat success=no exit=EACCES(Permission denied) a0=0xffffff9c a1=0x55ff28cd93f0 a2=O_RDONLY a3=0x0 items=0 ppid=888 pid=21016 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=rhsmcertd-worke exe=/usr/libexec/platform-python3.6 subj=system_u:system_r:rhsmcertd_t:s0 key=(null) 
type=AVC msg=audit(04/13/2020 14:33:01.721:4700) : avc:  denied  { read } for  pid=21016 comm=rhsmcertd-worke name=virt.module dev="dm-0" ino=50331781 scontext=system_u:system_r:rhsmcertd_t:s0 tcontext=system_u:object_r:root_t:s0 tclass=file permissive=0 
----
type=PROCTITLE msg=audit(04/13/2020 18:22:25.219:4722) : proctitle=/usr/lib/systemd/systemd-makefs xfs /dev/sdb 
type=SYSCALL msg=audit(04/13/2020 18:22:25.219:4722) : arch=x86_64 syscall=openat success=no exit=EACCES(Permission denied) a0=0xffffff9c a1=0x7ffc62d97f3e a2=O_RDONLY|O_CLOEXEC a3=0x0 items=0 ppid=1 pid=22301 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=systemd-makefs exe=/usr/lib/systemd/systemd-makefs subj=system_u:system_r:init_t:s0 key=(null) 
type=AVC msg=audit(04/13/2020 18:22:25.219:4722) : avc:  denied  { read } for  pid=22301 comm=systemd-makefs name=sdb dev="devtmpfs" ino=40219776 scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:nvme_device_t:s0 tclass=blk_file permissive=0 



My (limited) understanding is that this is currently expected behavior, given that systemd-makefs is labeled init_exec_t:

# sesearch --allow -s init_t -t nvme_device_t
allow init_t device_node:blk_file { getattr relabelfrom relabelto };
allow init_t device_node:chr_file { create getattr relabelfrom relabelto };
allow init_t device_node:dir { getattr relabelfrom relabelto };
allow init_t device_node:fifo_file { getattr relabelfrom relabelto };
allow init_t device_node:file { getattr relabelfrom relabelto };
allow init_t device_node:lnk_file { getattr relabelfrom relabelto };
allow init_t device_node:sock_file { getattr relabelfrom relabelto };


# sesearch --allow -s fsadm_t -t nvme_device_t
allow devices_unconfined_type device_node:blk_file { append audit_access create execmod execute getattr ioctl link lock map mounton open quotaon read relabelfrom relabelto rename setattr swapon unlink write };
allow devices_unconfined_type device_node:chr_file { append audit_access create execute execute_no_trans getattr ioctl link lock map mounton open quotaon read relabelfrom relabelto rename setattr swapon unlink write };
allow devices_unconfined_type device_node:dir getattr;
allow devices_unconfined_type device_node:fifo_file getattr;
allow devices_unconfined_type device_node:file { append audit_access create execute execute_no_trans getattr ioctl link lock map mounton open quotaon read relabelfrom relabelto rename setattr swapon unlink write };
allow devices_unconfined_type device_node:lnk_file { append audit_access create execmod execute getattr ioctl link lock map mounton open quotaon read relabelfrom relabelto rename setattr swapon unlink write };
allow devices_unconfined_type device_node:sock_file getattr;
allow fsadm_t device_node:chr_file getattr;



But init_exec_t/init_t does have read access to other disk devices:

# ls -lZ /dev/sda 
brw-rw----. 1 root disk system_u:object_r:fixed_disk_device_t:s0 8, 0 Mar  3 15:09 /dev/sda


# sesearch --allow -s init_t -t fixed_disk_device_t
allow init_t device_node:blk_file { getattr relabelfrom relabelto };
allow init_t device_node:chr_file { create getattr relabelfrom relabelto };
allow init_t device_node:dir { getattr relabelfrom relabelto };
allow init_t device_node:fifo_file { getattr relabelfrom relabelto };
allow init_t device_node:file { getattr relabelfrom relabelto };
allow init_t device_node:lnk_file { getattr relabelfrom relabelto };
allow init_t device_node:sock_file { getattr relabelfrom relabelto };
allow init_t fixed_disk_device_t:blk_file { append getattr ioctl lock open read write };
allow init_t fixed_disk_device_t:chr_file { append getattr ioctl lock open read write };
allow init_t fixed_disk_device_t:lnk_file { getattr read };


so maybe the policy needs to be fixed?
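
Until then, an interim local policy module could be built by hand along the lines setroubleshoot suggests. This is only a sketch, keyed to the exact AVC above, with the blk_file permissions chosen to mirror what init_t already has on fixed_disk_device_t:

# cat > local_makefs.te <<'EOF'
module local_makefs 1.0;

require {
        type init_t;
        type nvme_device_t;
        class blk_file { getattr ioctl lock open read write };
}

# mirror init_t's fixed_disk_device_t blk_file permissions for nvme_device_t
allow init_t nvme_device_t:blk_file { getattr ioctl lock open read write };
EOF
# checkmodule -M -m -o local_makefs.mod local_makefs.te
# semodule_package -o local_makefs.pp -m local_makefs.mod
# semodule -i local_makefs.pp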

Comment 3 Milos Malik 2020-04-14 08:29:07 UTC
I was looking for something similar to the last AVC you attached. Thank you, Derrick.

Comment 5 Zdenek Pytela 2020-06-25 09:41:30 UTC
I've submitted a Fedora PR to address the issue:
https://github.com/fedora-selinux/selinux-policy/pull/385

Comment 10 umesh_sunnapu 2020-07-06 16:16:20 UTC
Any ETA on this?

Comment 11 Lukas Vrabec 2020-07-07 10:11:01 UTC
Fix will be part of next minor release of RHEL.

Comment 12 umesh_sunnapu 2020-07-15 01:36:59 UTC
Does this mean the fix will be part of next minor release in RHEL 7.x, 8.x and RHCOS 4.x as well ?

Comment 14 Zdenek Pytela 2020-07-21 05:55:15 UTC
This fix should be a part of RHEL 8.3.

For RHEL 7, no bugzilla is currently open; given the fact the system is now in Maintenance Support 2 Phase, strong justification is needed to meet the criteria for inclusion.

Comment 18 Dave Cain 2020-07-30 19:47:42 UTC
In my environment, I've found I'm unable to use an NVMe device (/dev/nvme0n1) for the installation of OpenShift 4.5. On initial boot, RHCOS does get written to the NVMe device, but on the subsequent boot I observe ignition-ostree-mount-var.service exiting with status=1/FAILURE, with systemd complaining "Failed to start OSTree Prepare OS/".

Repeating the same install on the same equipment but against a SATA-based SSD for the root disk results in the install completing without issue.

Comment 21 umesh_sunnapu 2020-09-14 20:24:33 UTC
I just tested this on an OCP 4.5 cluster and still see the errors below. I believe this is in line with what we were seeing a few months ago:

```
Sep 14 20:11:00 r192aw1.oss.labs systemd-makefs[2281]: Failed to probe "/dev/nvme1n1": Permission denied
Sep 14 20:11:00 r192aw1.oss.labs systemd[1]: systemd-mkfs: Main process exited, code=exited, status=1/FAILURE
Sep 14 20:11:00 r192aw1.oss.labs systemd[1]: systemd-mkfs: Failed with result 'exit-code'.
Sep 14 20:11:00 r192aw1.oss.labs systemd[1]: Failed to start Make File System on /dev/nvme1n1.
Sep 14 20:11:00 r192aw1.oss.labs kernel: audit: type=1400 audit(1600114260.292:4): avc:  denied  { read } for  pid=2281 comm="systemd-makefs" name="nvme1n1" dev="devtmpfs" ino=1379 scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:nvme_device_t:s0 tclass=blk_file permissive=0
```

Checking the nvme1n1 disk on the node, I see the following SELinux labels:

[root@r192aw1 ~]# ls -lZ /dev/nvme1n1
brw-rw----. 1 root disk system_u:object_r:nvme_device_t:s0 259, 1 Sep 14 20:10 /dev/nvme1n1

Attaching the MachineConfig and MachineConfigPool YAML files, just FYI.

Comment 22 umesh_sunnapu 2020-09-14 20:25:57 UTC
Created attachment 1714850 [details]
Multi disk steps

Comment 23 umesh_sunnapu 2020-09-14 20:27:30 UTC
I would like to know whether the fix is already released, or whether we have to wait some more before we test this in OpenShift 4.5.x.

Comment 24 Milos Malik 2020-09-14 20:51:18 UTC
(In reply to umesh_sunnapu from comment #21)
> I just tested this on a OCP 4.5 cluster. I see below errors still. This I
> believe is inline with what we were seeing few months ago

What version of selinux-policy* packages is installed on the machines, where you see the error?

Comment 25 umesh_sunnapu 2020-09-14 20:57:52 UTC
(In reply to Milos Malik from comment #24)
> (In reply to umesh_sunnapu from comment #21)
> > I just tested this on a OCP 4.5 cluster. I see below errors still. This I
> > believe is inline with what we were seeing few months ago
> 
> What version of selinux-policy* packages is installed on the machines, where
> you see the error?

[root@r192aw1 ~]# rpm -qa | grep selinux-policy*
selinux-policy-3.14.3-41.el8_2.5.noarch
selinux-policy-targeted-3.14.3-41.el8_2.5.noarch

[root@r192aw1 ~]# cat /etc/system-release
Red Hat Enterprise Linux CoreOS release 4.5

The errors were still seen in journalctl:

[root@r192aw1 ~]# journalctl --since -60m -r | grep -i 'permission denied' -A 4 -B 2
Sep 14 20:56:20 r192aw1.oss.labs systemd[1]: systemd-mkfs: Failed with result 'exit-code'.
Sep 14 20:56:20 r192aw1.oss.labs systemd[1]: systemd-mkfs: Main process exited, code=exited, status=1/FAILURE
Sep 14 20:56:20 r192aw1.oss.labs systemd-makefs[9445]: Failed to probe "/dev/nvme1n1": Permission denied
Sep 14 20:56:20 r192aw1.oss.labs systemd[1]: kubelet.service: Consumed 141ms CPU time
Sep 14 20:56:20 r192aw1.oss.labs systemd[1]: Stopped MCO environment configuration.
Sep 14 20:56:20 r192aw1.oss.labs systemd[1]: Starting Make File System on /dev/nvme1n1...
Sep 14 20:56:20 r192aw1.oss.labs systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 163.

Let me know if you need any additional information.

Comment 26 Milos Malik 2020-09-15 05:48:28 UTC
According to the changelog, the bug is fixed in selinux-policy-3.14.3-41.el8_2.6.
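
After updating to that build or newer, a quick verification would be something like the following. This is a sketch; it assumes the fix works by relabeling the helpers to fsadm_exec_t, as in the Fedora PR referenced in comment 5:

# rpm -q selinux-policy
# ls -Z /usr/lib/systemd/systemd-makefs /usr/lib/systemd/systemd-growfs
# systemctl start mnt-test.mount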

Comment 29 errata-xmlrpc 2020-11-04 01:56:35 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (selinux-policy bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4528