Description of problem:
If you try to start a systemd-nspawn container via "systemctl start systemd-nspawn@{containername}", SELinux prevents the start due to missing permissions.
Note: If you start the container via "systemd-nspawn -b -D /var/lib/machines/{containername}", everything works fine.

Version-Release number of selected component (if applicable):
Fedora 33 / systemd 246 & Fedora 32 / systemd 245

How reproducible:
Always, every time

Steps to Reproduce:
1. Create a fresh F33 installation and additionally install systemd-container
2. Create a directory /var/lib/machines/test and install a container filesystem in it using dnf, e.g.
   dnf --releasever=33 --best --setopt=install_weak_deps=False --installroot=/var/lib/machines/test/ install dhcp-client dnf fedora-release glibc glibc-langpack-en glibc-langpack-de iproute iputils less passwd systemd vim-minimal
3. Start the container as a system service via
   systemctl start systemd-nspawn@test

Actual results:
Start is aborted. SELinux logs a type=AVC message, which can be fixed with the help of e.g. the Cockpit SELinux troubleshooting module. In total 5 different errors are displayed one after the other, as follows (back-translated from German, so the exact wording might differ):

1. SELinux prevents systemd-machine from accessing directory 6419 with search access.
type=AVC msg=audit(1606053020.256:704): avc: denied { search } for pid=5657 comm="systemd-machine" name="6419" dev="proc" ino=178102 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:unconfined_service_t:s0 tclass=dir permissive=0

2. SELinux prevents systemd-machine from accessing file cgroup with read access.
type=AVC msg=audit(1606053303.75:717): avc: denied { read } for pid=5657 comm="systemd-machine" name="cgroup" dev="proc" ino=191900 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:unconfined_service_t:s0 tclass=file permissive=0

3. SELinux prevents systemd-machine from using open access to file /proc/<pid>/cgroup.
type=AVC msg=audit(1606053438.540:730): avc: denied { open } for pid=5657 comm="systemd-machine" path="/proc/6913/cgroup" dev="proc" ino=203945 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:unconfined_service_t:s0 tclass=file permissive=0

After correcting these 3 errors, the container starts, but produces 2 more messages:

4. SELinux prevents systemd-machine from accessing the file /proc/<pid>/cgroup with getattr access.
type=AVC msg=audit(1606053606.348:742): avc: denied { getattr } for pid=5657 comm="systemd-machine" path="/proc/7147/cgroup" dev="proc" ino=214182 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:unconfined_service_t:s0 tclass=file permissive=0

5. SELinux prevents systemd-machine from accessing /proc/<pid>/cgroup with ioctl access.
type=AVC msg=audit(1606053606.348:743): avc: denied { ioctl } for pid=5657 comm="systemd-machine" path="/proc/7147/cgroup" dev="proc" ino=214182 ioctlcmd=0x5401 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:unconfined_service_t:s0 tclass=file permissive=0

After correcting these errors as well, further starts proceed without problems.

Expected results:
Problem-free start out of the box.

Additional info:
none
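For anyone triaging similar reports: the five denials in this report all share the same source and target type and differ only in class and permission. The following is a minimal sketch for condensing raw audit records into that form. The name avc_summary is a hypothetical helper for illustration, not part of the SELinux userspace tools, and the sed pattern assumes the record layout shown in this report.

```shell
# avc_summary: hypothetical helper, not part of the SELinux userspace tools.
# Reads raw audit records on stdin and prints one line per distinct denial:
#   <source type> <target type> <class> <permission(s)>
avc_summary() {
    sed -n 's/.*denied *{ \([^}]*\) }.*scontext=[^:]*:[^:]*:\([^:]*\):.*tcontext=[^:]*:[^:]*:\([^:]*\):.*tclass=\([^ ]*\).*/\2 \3 \4 \1/p' |
        sort -u
}

# Example input: two of the records quoted in this report.
avc_summary <<'EOF'
type=AVC msg=audit(1606053020.256:704): avc: denied { search } for pid=5657 comm="systemd-machine" name="6419" dev="proc" ino=178102 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:unconfined_service_t:s0 tclass=dir permissive=0
type=AVC msg=audit(1606053303.75:717): avc: denied { read } for pid=5657 comm="systemd-machine" name="cgroup" dev="proc" ino=191900 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:unconfined_service_t:s0 tclass=file permissive=0
EOF
```

On a live system the same function could presumably be fed from the audit log, for example with "ausearch -m AVC -ts recent --raw | avc_summary". Lines that do not match the assumed layout are silently skipped.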
(In reply to Peter Boy from comment #0)
> Description of problem:
> If you try to start a systemd-nspawn container via "systemctl start
> systemd-nspawn@{containername}", SELinux will prevent the start due to
> missing permissions. Note:If you start the container via "systemd-nspawn -b
> -D /var/lib/machines/{containername}", everything will work fine.

When started directly from the command line, the process inherits its context from the shell; in this case the type is unconfined_t, but for confined users it would be the user's respective type. Machined is allowed access to unconfined_t.
When started as a service, the process runs in unconfined_service_t, which machined is not allowed to access.

# sesearch -A -s systemd_machined_t -t unconfined_t -c dir -p search
allow systemd_machined_t userdomain:dir { getattr ioctl lock open read search };
# sesearch -A -s systemd_machined_t -t unconfined_service_t -c dir -p search
<>

To have complex machine support, we would need to confine the service and would also need support from systemd to implement direct relabeling on the filesystem or some virtual alternative (virtiofs or namespaces). This is currently not planned.

For the SELinux part, it could possibly be an option to install container-selinux and use the policy from container.
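The distinction in this comment comes down to the type field (the third colon-separated field) of the process context. A small sketch for pulling that field out of a context string follows. The name context_type is a hypothetical helper, and the first example context is an assumption about what a typical unconfined login shell looks like, not something taken from this report.

```shell
# context_type: hypothetical helper that prints the type field (third
# colon-separated field) of an SELinux context string.
context_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Assumed context of an unconfined login shell (illustrative only):
context_type "unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023"
# Context of the service-started process, as seen in the AVC records:
context_type "system_u:system_r:unconfined_service_t:s0"
```

To see which context a running container process actually has, something like "ps -eZ | grep systemd-nspawn" should show it; the first column can then be passed to the helper.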
*** Bug 1847545 has been marked as a duplicate of this bug. ***
This message is a reminder that Fedora 33 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora 33 on 2021-11-30. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '33'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 33 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
The problem persists in Fedora 35; please update the bug.
Just checked on a new installation of Fedora 35. The bug still exists.
Just checked on a new installation of Fedora 36 and 37/rawhide. The bug still exists.
This message is a reminder that Fedora Linux 35 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora Linux 35 on 2022-12-13. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a 'version' of '35'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, change the 'version' to a later Fedora Linux version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora Linux 35 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora Linux, you are encouraged to change the 'version' to a later version prior to this bug being closed.
I'm updating this to Fedora 37 since Peter reported that it still is broken there.
I tried machinectl on Fedora 38, and these issues are still present.

To gather all denials I added the systemd_machined_t type to the permissive types with:

semanage permissive -a systemd_machined_t

And when I launched the container with:

machinectl start CONTAINER

I got the usual 5 AVCs, as they were initially reported.

I also noticed that a simple machinectl list would generate some more:

type=AVC msg=audit(1694096510.140:422): avc: denied { search } for pid=1720 comm="systemd-machine" name="16654" dev="proc" ino=74465 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:unconfined_service_t:s0 tclass=dir permissive=1
type=AVC msg=audit(1694096510.140:423): avc: denied { read } for pid=1720 comm="systemd-machine" name="mnt" dev="proc" ino=80236 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:unconfined_service_t:s0 tclass=lnk_file permissive=1
type=AVC msg=audit(1694096510.140:424): avc: denied { sys_ptrace } for pid=1720 comm="systemd-machine" capability=19 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:systemd_machined_t:s0 tclass=cap_userns permissive=1
type=AVC msg=audit(1694096510.141:425): avc: denied { sys_admin } for pid=17439 comm="(sd-osrelns)" capability=21 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:systemd_machined_t:s0 tclass=cap_userns permissive=1

So maybe there are more denials waiting to be found.

(If you tried the same commands, you can revert the semanage permissive command by running:

semanage permissive -d systemd_machined_t

And your system will be back to the previous policy. You can list the permissive types on your machine with:

semanage permissive -l
)
I can confirm this on F38 with SELinux in enforcing mode and /var/lib/machines on Btrfs on the host:

# dnf --installroot=/var/lib/machines/f38 --releasever 38 install '@Minimal Install' systemd
# systemd-nspawn -D /var/lib/machines/f38 -b --private-users=pick --private-users-ownership=chown
(boots to the login prompt; kill the container with three ^]'s)
# cat > /var/lib/machines/f38.nspawn <<EOF
[Exec]
Boot=on
PrivateUsers=pick

[Files]
PrivateUsersOwnership=chown
EOF
# machinectl start f38
Job for systemd-nspawn failed because the control process exited with error code.
See "systemctl status systemd-nspawn" and "journalctl -xeu systemd-nspawn" for details.
# tail -n 100 /var/log/audit/audit.log| audit2allow

#============= systemd_machined_t ==============

#!!!! This avc can be allowed using the boolean 'daemons_dump_core'
allow systemd_machined_t root_t:dir write;
allow systemd_machined_t self:cap_userns kill;
allow systemd_machined_t unconfined_service_t:dir search;
allow systemd_machined_t unconfined_service_t:file { getattr ioctl open read };
# setenforce 0
# machinectl start f38
(starts as expected)
Here is another set of policy rules which are needed (in my opinion) for a successful run of systemd-nspawn in enforcing mode:
 * https://src.fedoraproject.org/tests/selinux/blob/main/f/selinux-policy/systemd-machined-and-similar/testpolicy.cil

I created the list during testing on Fedora so that the following automated test can succeed:
 * https://src.fedoraproject.org/tests/selinux/blob/main/f/selinux-policy/systemd-machined-and-similar
The problem happens in Fedora 39 as described in comment 12. I didn't test the fix.

Oct 10 11:00:08 fovo.local audit[1362]: AVC avc: denied { write } for pid=1362 comm="systemd-machine" name="f38" dev="dm-0" ino=257 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:object_r:root_t:s0 tclass=dir permissive=0
(In reply to Milos Malik from comment #13)
> Here is another set of policy rules which are needed (in my opinion) for
> successful run of systemd-nspawn in enforcing mode:
> * *
> https://src.fedoraproject.org/tests/selinux/blob/main/f/selinux-policy/
> systemd-machined-and-similar/testpolicy.cil
>
> I created a list of them during testing on Fedora so that the following
> automated test can succeed:
> *
> https://src.fedoraproject.org/tests/selinux/blob/main/f/selinux-policy/
> systemd-machined-and-similar

I tried the fix by putting it in a file and executing 'semodule -i my-policy.cil'.

It allows starting an nspawn container as a service, but blocks 'machinectl shell <container>' as well as 'machinectl login <container>'.

Wasn't it you who provided a fix some time ago? (Sorry if I have remembered this incorrectly.) I can't find that fix anymore, somehow. It was:

cat my-policy.cil
( allow systemd_machined_t unconfined_service_t ( dir ( search )))
( allow systemd_machined_t unconfined_service_t ( file ( getattr open read ioctl )))
( allow systemd_machined_t unconfined_service_t ( lnk_file ( getattr read )))
( allow systemd_machined_t systemd_machined_t ( cap_userns ( sys_ptrace sys_admin setgid setuid kill )))
( allow systemd_machined_t tmpfs_t ( lnk_file ( getattr read )))
( allow systemd_machined_t devpts_t ( chr_file ( open read write ioctl )))
( allow systemd_machined_t tmpfs_t ( sock_file ( write )))
( allow system_dbusd_t devpts_t ( chr_file ( read write )))
( allow systemd_machined_t unconfined_service_t ( unix_stream_socket ( connectto )))
( allow systemd_machined_t systemd_machined_t ( capability ( chown fowner fsetid )))
( allow systemd_machined_t mnt_t ( dir ( read write add_name )))
( allow systemd_machined_t mnt_t ( file ( create write open getattr setattr ioctl )))
( allow systemd_machined_t tmp_t ( file ( create write open getattr setattr ioctl )))
( allow systemd_machined_t user_tmp_t ( dir ( read write )))
( allow systemd_machined_t user_tmp_t ( file ( getattr open read )))
( allow systemd_machined_t tmpfs_t ( file ( create write open )))

followed by

semodule -i my-policy.cil
setsebool -P daemons_dump_core 1

That fix used to work fine. It allowed starting a container, getting a shell, and logging in. 'machinectl [copy-to|copy-from]' and 'machinectl bind' did not work with it (yet), and I didn't check any of the image commands.

The first 9 lines of both fixes are identical, as far as I can see. The 10th line of the new fix is new and wasn't part of the first fix. I tried the first fix on top of the new one, but unfortunately it enabled neither the shell command nor the login.

By the way: can I undo all these changes and restore the state the system had right after installation? (It's my test and development VM, but it would help a lot.)
The following command removes the my-policy module from memory:

# semodule -r my-policy

I will add the other machinectl commands (copy-to, copy-from, bind) to the automated test and will let you know. It's very likely that additional policy rules will be needed for a successful run of these commands.
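Besides the module itself, this thread also changed a boolean (setsebool -P daemons_dump_core 1) and, in one comment, made systemd_machined_t permissive. A hedged sketch of reverting all three follows. The function name revert_local_policy is hypothetical, and the sketch assumes the module was installed as "my-policy" and that the boolean was off before it was changed; check both on your own system before running anything.

```shell
# revert_local_policy: hypothetical wrapper around the undo steps for the
# local changes discussed in this thread. Run as root; nothing is executed
# until the function is called.
revert_local_policy() {
    # Remove the locally installed module (assumed name: my-policy).
    semodule -r my-policy
    # Restore the boolean (assumption: it was off before the change).
    setsebool -P daemons_dump_core 0
    # Only applies if the domain was made permissive earlier; otherwise
    # this command reports an error, which is ignored here.
    semanage permissive -d systemd_machined_t || true
}
```

To check what is still customized afterwards, "semodule -l | grep my-policy", "semanage boolean -l -C" and "semanage permissive -l" should list any remaining local module, boolean, or permissive-domain changes.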
Thanks everybody for providing the reproducers. Unfortunately, the systemd-nspawn service is not SELinux confined yet, so some of the issues need to be worked around. https://github.com/fedora-selinux/selinux-policy/pull/1901 Please try the following scratchbuild to test: https://github.com/fedora-selinux/selinux-policy/pull/1901/checks?check_run_id=17688228746
Here is a PR which adds a test coverage for the `machinectl copy-from` and `machinectl copy-to` commands: * https://src.fedoraproject.org/tests/selinux/pull-request/437
*** Bug 1900888 has been marked as a duplicate of this bug. ***
1. With Fedora 38 I applied the fix supplied by Milos in 1900888 comment 19. It still works fine and allows starting a container as a service and performing a login and a shell command.

2. With a Fedora VM branched 20231014 I performed

dnf install -y dnf-plugins-core
dnf copr enable packit/fedora-selinux-selinux-policy-1901
dnf install -y selinux-policy-targeted-40.2-1.20231013181433320556.pr1901.3.g54e1f6d7b.fc39.noarch

(none of the other options in the example worked for me)

The result is:

In enforcing mode, systemctl start systemd-nspawn@test worked and started the container as a service.

The machinectl commands
 - start <machine>
 - list
 - shell
 - login
 - enable
 - disable
 - terminate
worked flawlessly.

 - kill
gave no message and took no action, but repeating the command immediately a 2nd time, strangely, killed the container.

 - stop
 - copy-to / copy-from
 - poweroff
 - reboot
 - bind
threw an "access denied" in enforcing mode, but worked in permissive mode.

I couldn't test the image commands yet, because I lack a proper image. I have to create one first.

The fix in its current state may not be perfect, but it would really help a lot if you could include it in F38/F39 nevertheless. It would make nspawn basically usable without complicated effort on the part of users/administrators. We, the Server working group, advocate it for development and test purposes and as a lightweight containerization that is easy to apply and manage when the additional administrative overhead of podman would be overkill. So it would be huge progress. See: https://docs.fedoraproject.org/en-US/fedora-server/containerization/ (I'm currently working on an update)
Thank you Peter for testing, I really appreciate it. As you said, I did not expect the fix to be complete: some of the commands do not work as expected (copy-to/from), and some still seem to need attention. The fix is already in rawhide; backports to F38 and F39 are on the way.
I just merged the following PR which brings a testpolicy module which supports the machinectl commands like: copy-to, copy-from, reboot and kill: * https://src.fedoraproject.org/tests/selinux/pull-request/437 Please let me know if the testpolicy module improves the situation on your machines. Unfortunately, my attempts to support the "machinectl bind" command were not successful.
Sorry for the delay, unfortunately I was so swamped I couldn't make it. I just tried to test it, but unfortunately I couldn't figure out how to do it (I currently don't package anything, so my familiarity with the process is rather limited). Could you please give me some hints on how to proceed?
In order to test the testpolicy module, you can download the raw form (Raw button) of it:
 * https://src.fedoraproject.org/tests/selinux/blob/main/f/selinux-policy/systemd-machined-and-similar/testpolicy.cil

The following command loads the testpolicy module into memory:

# semodule -i testpolicy.cil

Now you can do various systemd-nspawn or machinectl things. Unfortunately, the `machinectl bind` command will not work.

The following command removes the testpolicy module from memory:

# semodule -r testpolicy

The other files stored in the https://src.fedoraproject.org/tests/selinux/blob/main/f/selinux-policy/systemd-machined-and-similar/ directory can be ignored.
Sorry for the delay again. I was so busy with Server release testing and fixing and the go/no-go meetings that I couldn't test earlier.

I installed Fedora 39 and the updates which provide selinux-policy and selinux-policy-targeted 3.9.1-1. With these, the first part already works (i.e. start as a service and all the container-related machinectl commands), but neither "machinectl copy-to/copy-from" nor "machinectl bind" does.

I downloaded the module and executed the commands above on top of that, and copy-to/from unfortunately still comes up with "Failed to copy: Access denied". The script finished without any message. In permissive mode, the same command works.

I don't know whether the already installed selinux-policy version interferes with the script. In case it does, I can install F38 or F39 without updates, so I can use both scripts.

For the bind issue, you can include the required bind mount(s) in fstab and it works fine, at least as long as you don't use different UID/GID for container and host.
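To illustrate the fstab workaround mentioned above, here is a minimal sketch that prints an fstab entry bind-mounting a host directory into the container tree. The helper name bind_fstab_line and both paths are hypothetical placeholders, not taken from this report; the entry uses the usual "none bind 0 0" form for bind mounts.

```shell
# bind_fstab_line: hypothetical helper that prints an fstab entry which
# bind-mounts a host directory ($1) onto a path inside the container tree ($2).
bind_fstab_line() {
    printf '%s %s none bind 0 0\n' "$1" "$2"
}

# Hypothetical paths; adjust to the real host directory and container name.
bind_fstab_line /srv/data /var/lib/machines/test/srv/data
```

The printed line would then be appended to /etc/fstab on the host, with the target directory created beforehand and the mount activated (for example with "mount -a") before the container is started. As noted above, this is only reported to work when container and host use the same UID/GID.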
Peter, thanks for the update, we will continue with the effort to make it fully working.