Bug 2182505
| Field | Value |
|---|---|
| Summary | Create a selinux policy for nbdkit |
| Product | Red Hat Enterprise Linux 9 |
| Component | selinux-policy |
| Version | 9.2 |
| Status | CLOSED MIGRATED |
| Severity | medium |
| Priority | medium |
| Hardware | Unspecified |
| OS | Linux |
| Reporter | Jonathon Jongsma <jjongsma> |
| Assignee | Nikola Knazekova <nknazeko> |
| QA Contact | Milos Malik <mmalik> |
| CC | berrange, eblake, jsuchane, lersek, lvrabec, mmalik, mxie, nknazeko, rjones, tyan, tzheng, virt-maint, vwu, xiaodwan, zpytela |
| Target Milestone | rc |
| Keywords | MigratedToJIRA, Triaged |
| Doc Type | Enhancement |
| Type | Bug |
| Last Closed | 2023-09-19 16:59:05 UTC |
| Bug Blocks | 2016527, 2176939 |

Description
Jonathon Jongsma
2023-03-28 21:55:43 UTC
I have no idea about this. But I will say that we want to be able to run nbdkit directly from (a) the command line and (b) virt-v2v. Are there other programs (apart from qemu) that libvirt spawns that have similar problems?

I'm not sure that there's any exact analog, but passt (mentioned above) seems similar in some ways. There are also utilities like dnsmasq and dbus that are executed by libvirt that might have some similarities.

As I said, my knowledge of selinux is pretty shallow (though very slowly getting deeper), so please excuse any misunderstandings and imprecise terminology in this summary.

Currently, the nbdkit binary just gets the default label for files within the /usr/sbin/ directory: system_u:object_r:bin_t:s0. In order for libvirt to be able to execute nbdkit, libvirt would thus need to be able to execute any file with this bin_t label. But the virt selinux policy does not allow executing arbitrary binaries for good reason. So if we want libvirt to be able to execute nbdkit, the minimum thing that we need is to assign a non-default file context to the nbdkit binary so that we can write a policy that will allow libvirt to execute it without allowing other binaries. The typical way to do that is to introduce a file context like nbdkit_exec_t for the binary (see policies for the executables mentioned above).

We can then write policies which will enable it to transition to a context like nbdkit_t when launched from a certain selinux context (e.g. from libvirt running as svirt_t) but not when launched from a different context (e.g. from the console running as unconfined_t). And once we can get the nbdkit process to transition to an nbdkit-specific selinux context, we can write selinux policies for this context which govern what it can or can't do. That is obviously the most difficult part.

It all sounds pretty reasonable.
I had a look at other binaries in /usr/sbin and many of them have system_u:object_r:<binaryname>_exec_t labels:

$ ll -Z /usr/sbin/ | grep -v system_u:object_r:bin_t
total 102164
-rwxr-xr-x. 1 root root system_u:object_r:abrt_exec_t:s0 41640 Jul 20 2022 abrtd
-rwxr-xr-x. 1 root root system_u:object_r:abrt_exec_t:s0 129232 Jul 20 2022 abrt-dbus
-rwxr-xr-x. 1 root root system_u:object_r:abrt_exec_t:s0 1349 Jul 20 2022 abrt-harvest-pstoreoops
-rwxr-xr-x. 1 root root system_u:object_r:abrt_exec_t:s0 8798 Jul 20 2022 abrt-harvest-vmcore
-rwxr-xr-x. 1 root root system_u:object_r:acct_exec_t:s0 16392 Jul 22 2022 accton
-rwxr-xr-x. 1 root root system_u:object_r:getty_exec_t:s0 58832 Aug 4 2022 agetty
-rwxr-xr-x. 1 root root system_u:object_r:alsa_exec_t:s0 133808 Jul 20 2022 alsactl
-rwxr-xr-x. 1 root root system_u:object_r:install_exec_t:s0 22378 Sep 13 2022 anaconda
-rwxr-xr-x. 1 root root system_u:object_r:anacron_exec_t:s0 41664 Jul 21 2022 anacron
-rwxr-xr-x. 1 root root system_u:object_r:crond_exec_t:s0 32760 Jul 20 2022 atd
-rwxr-xr-x. 1 root root system_u:object_r:auditctl_exec_t:s0 49496 Aug 29 2022 auditctl
-rwxr-xr-x. 1 root root system_u:object_r:auditd_exec_t:s0 145832 Aug 29 2022 auditd
-rwxr-xr-x. 1 root root system_u:object_r:avahi_exec_t:s0 153672 Aug 5 2022 avahi-daemon
-rwxr-xr-x. 1 root root system_u:object_r:dmidecode_exec_t:s0 29488 Jul 21 2022 biosdecode
[etc]

So I suppose we should do something like that as a first step. As to actually *how* that is done (through selinux-policy?) I don't know. I looked at the spec for dmidecode which is one which seems to use these labels.
The spec itself does not label anything, and it happens through file_contexts in the targeted policy:

https://src.fedoraproject.org/rpms/dmidecode/blob/rawhide/f/dmidecode.spec

$ grep dmidecode /etc/selinux/targeted/contexts/files/file_contexts
/usr/sbin/dmidecode -- system_u:object_r:dmidecode_exec_t:s0
/usr/sbin/ownership -- system_u:object_r:dmidecode_exec_t:s0
/usr/sbin/vpddecode -- system_u:object_r:dmidecode_exec_t:s0
/usr/sbin/biosdecode -- system_u:object_r:dmidecode_exec_t:s0

I couldn't find any bug that was filed to make this change. Anyway since dmidecode acts like a regular program (eg you can still run it from the command line even though it has this label) I suppose there is no problem making a similar change to nbdkit.

The upstream selinux-policy repository has several commits on dmidecode. They are not well explained however :( The commit that introduces a policy for dmidecode seems to be 20e306e2de6e ("add dmidecode", 2005-09-27). dmidecode parses SMBIOS information from physical RAM ("/dev/mem"), hence a particular policy for it I guess. The rest of the dmidecode-related policy commits seem to be about what other programs are allowed to run dmidecode for this purpose. Three recent(-ish) examples:

commit c95cf07cea17f548daee8f47b9580120b63dedc2
Author: Zdenek Pytela <zpytela>
Date: Thu Aug 11 16:47:57 2022 +0200

    Allow sysadm_t read raw memory devices

    In particular, this permission is requested when subscription-manager wants to read SMBIOS/DMI details of the system because the implementation in the python-dmidecode library reads /dev/mem directly.

    Addresses the following AVC denial: [...]

    Resolves: rhbz#2101341

commit 9a2846b5ec06620f5246bf24fa687a399b195cc1
Author: Zdenek Pytela <zpytela>
Date: Tue Feb 9 10:41:13 2021 +0100

    Add integrity lockdown permission into dev_read_raw_memory()

    Since adding the lockdown class to selinux-policy, the integrity permission starts to be checked on each access of (among others) /dev/mem, /dev/kmem, and /dev/port. The permission to read raw memory device is allowed in the dev_read_raw_memory() interface, so the integrity lockdown permission was added to this interface. Example of services requiring this access is biosdecode running in the dmidecode_t domain.

    There are other interfaces to write, rx, wx raw memory, but they either call dev_read_raw_memory() directly or are called from adjacent lines in the policy. The dev_raw_memory_reader() and dev_raw_memory_write() do not seem to have effect in this matter.

    Resolves: rhbz#1926696

commit 432128e4ff3d060069f3cd9f02106839c064317a
Author: Lukas Vrabec <lvrabec>
Date: Wed Oct 25 13:25:55 2017 +0200

    Allow dmidecode to read rhsmcert_log_t files

I'm reassigning this bug to selinux-policy since I do not have any idea what to do about it, neither what an ideal policy should look like, nor how to implement that policy. Hopefully someone on the SELinux team can give us advice.

So, I have the very beginnings of a selinux policy started here that appears to work for my extremely limited testing. It runs as unconfined_t when started from the commandline and runs as nbdkit_t when started from a virtd_t process (e.g. system libvirt). I'm sure there are a lot of things missing and a lot of things that don't adhere to best practices since this is my first time dealing with selinux policies.

One thing that I have not been able to figure out is how to get this to work when I am running a libvirt process from my development environment. During development, I often run libvirtd from the shell to test things out and it therefore runs as unconfined_t. I still want libvirtd to run nbdkit in the nbdkit_t context in this scenario. If it runs it as unconfined_t, it will fail due to various virt selinux policies preventing access to things.

Anyway, I'll attach what I have so far in case it is helpful at all.

Created attachment 1955945
nbdkit.fc
may need to also include plugins here eventually?

Created attachment 1955946
nbdkit.if

Created attachment 1955947
nbdkit.te

It would be greatly appreciated if somebody from the selinux-policy team could help move this forward as it blocks the libvirt nbdkit support.

Hi Jonathon,
I appreciate you taking the time to create a nbdkit module. Could you please attach AVC denials? because these macros are too benevolent:
corenet_tcp_connect_all_ports(nbdkit_t)
init_search_pid_dirs(nbdkit_t)
userdom_read_user_home_content_files(nbdkit_t)
userdom_list_user_home_dirs(nbdkit_t)

Instead of running from the console as unconfined_t, the systemctl utility is used. Further details can be found at:
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/using_selinux/index#creating-and-enforcing-an-selinux-policy-for-a-custom-application_writing-a-custom-selinux-policy

Thank you,
Nikola

(In reply to Nikola Knazekova from comment #12)
> Could you please attach AVC denials? because these macros are too
> benevolent:
> corenet_tcp_connect_all_ports(nbdkit_t)
> init_search_pid_dirs(nbdkit_t)
> userdom_read_user_home_content_files(nbdkit_t)
> userdom_list_user_home_dirs(nbdkit_t)

Yeah, I knew a lot of those were going to be far too broad, but it allowed me to move forward ;)

Regarding the AVC denials: would you like me to remove my nbdkit module and set selinux to permissive mode and then run libvirt and then collect the AVC denials?

> Instead of running from the console as unconfined_t, the systemctl utility
> is used. Further details can be found at:
> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/
> html-single/using_selinux/index#creating-and-enforcing-an-selinux-policy-for-
> a-custom-application_writing-a-custom-selinux-policy

Thanks, that looks useful for some situations. But when developing libvirt, I often need to run it under gdb or enable debugging, or build it in a different build directory using a different compiler, etc. Using a systemctl unit file to launch libvirt limits flexibility significantly and makes development more cumbersome.
And currently libvirt generally works when running from the terminal with selinux. The qemu process is executed with the right selinux context, the disk images are labeled correctly, etc. I would like this to continue to work after adding nbdkit support.

[Just for background, my test vm tries to spawn 2 instances of nbdkit to serve as storage for the vm. One provides access to a disk over ssh, and the other via https.]

So what I actually did was remove the 4 macros that you mentioned as being too broad and re-run my simple test case with selinux in permissive mode. After attempting to start my test vm, I got the following AVC denials:

----
time->Thu Apr 6 14:55:43 2023
type=AVC msg=audit(1680810943.035:12434): avc: denied { name_connect } for pid=3609014 comm="nbdkit" dest=22 scontext=system_u:system_r:nbdkit_t:s0:c224,c343 tcontext=system_u:object_r:ssh_port_t:s0 tclass=tcp_socket permissive=1
----
time->Thu Apr 6 14:55:43 2023
type=AVC msg=audit(1680810943.224:12457): avc: denied { name_connect } for pid=3609074 comm="nbdkit" dest=8888 scontext=system_u:system_r:nbdkit_t:s0:c224,c343 tcontext=system_u:object_r:unreserved_port_t:s0 tclass=tcp_socket permissive=1
----
time->Thu Apr 6 14:55:43 2023
type=AVC msg=audit(1680810943.301:12473): avc: denied { name_connect } for pid=3609014 comm="nbdkit" dest=22 scontext=system_u:system_r:nbdkit_t:s0:c224,c343 tcontext=system_u:object_r:ssh_port_t:s0 tclass=tcp_socket permissive=1
----
time->Thu Apr 6 14:55:43 2023
type=AVC msg=audit(1680810943.461:12489): avc: denied { name_connect } for pid=3609074 comm="nbdkit" dest=8888 scontext=system_u:system_r:nbdkit_t:s0:c224,c343 tcontext=system_u:object_r:unreserved_port_t:s0 tclass=tcp_socket permissive=1

So I replaced the corenet_tcp_connect_all_ports(nbdkit_t) with the following:
corenet_tcp_connect_http_port(nbdkit_t)
corenet_tcp_connect_ssh_port(nbdkit_t)
corenet_tcp_connect_tftp_port(nbdkit_t)

Maybe that's still too broad, I'm not sure.
But that still does not allow me to connect to my little local test http server that I spun up at localhost:8888. So I changed my vm configuration to point at an iso at a standard http port for now.

After installing this new policy, I no longer get any AVC denials, but the guest doesn't work properly. When I switch back to enforcing mode, nbdkit fails to start despite the lack of AVC denials, and gives me an error message about not being able to validate the remote ssh host. This is presumably because the ssh-backed disk passes the parameter "known-hosts=$HOME/tmp/known-hosts" to nbdkit to allow it to validate the remote ssh host. Selinux must be denying nbdkit the ability to read this file, so I disable dontaudit rules with `semodule -DB`, switch back to permissive mode and re-launch my vm. Now I can see the AVC denials that are causing the failure:

time->Thu Apr 6 15:30:25 2023
type=AVC msg=audit(1680813025.736:12981): avc: denied { search } for pid=3617522 comm="nbdkit" name="jjongsma" dev="dm-3" ino=9961473 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_home_dir_t:s0 tclass=dir permissive=1
----
time->Thu Apr 6 15:30:25 2023
type=AVC msg=audit(1680813025.736:12982): avc: denied { search } for pid=3617522 comm="nbdkit" name="work" dev="dm-3" ino=9966373 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=dir permissive=1
----
time->Thu Apr 6 15:30:25 2023
type=AVC msg=audit(1680813025.736:12983): avc: denied { getattr } for pid=3617522 comm="nbdkit" path="/home/jjongsma/work/libvirt/_build/src" dev="dm-3" ino=10707227 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=dir permissive=1
----
time->Thu Apr 6 15:30:25 2023
type=AVC msg=audit(1680813025.751:12984): avc: denied { search } for pid=3617522 comm="nbdkit" name="tmp" dev="dm-3" ino=9974969 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_tmp_t:s0 tclass=dir permissive=1
----
time->Thu Apr 6 15:30:25 2023
type=AVC msg=audit(1680813025.756:12988): avc: denied { search } for pid=3617522 comm="nbdkit" name="tmp" dev="dm-3" ino=9974969 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_tmp_t:s0 tclass=dir permissive=1
----
time->Thu Apr 6 15:30:25 2023
type=AVC msg=audit(1680813025.893:13014): avc: denied { search } for pid=3617579 comm="nbdkit" name="jjongsma" dev="dm-3" ino=9961473 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_home_dir_t:s0 tclass=dir permissive=1
----
time->Thu Apr 6 15:30:25 2023
type=AVC msg=audit(1680813025.893:13015): avc: denied { search } for pid=3617579 comm="nbdkit" name="work" dev="dm-3" ino=9966373 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=dir permissive=1
----
time->Thu Apr 6 15:30:25 2023
type=AVC msg=audit(1680813025.893:13016): avc: denied { getattr } for pid=3617579 comm="nbdkit" path="/home/jjongsma/work/libvirt/_build/src" dev="dm-3" ino=10707227 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=dir permissive=1
----
time->Thu Apr 6 15:30:26 2023
type=AVC msg=audit(1680813026.348:13052): avc: denied { search } for pid=3617522 comm="nbdkit" name="jjongsma" dev="dm-3" ino=9961473 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_home_dir_t:s0 tclass=dir permissive=1
----
time->Thu Apr 6 15:30:26 2023
type=AVC msg=audit(1680813026.348:13053): avc: denied { search } for pid=3617522 comm="nbdkit" name="tmp" dev="dm-3" ino=9974969 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_tmp_t:s0 tclass=dir permissive=1

Those are all related to user_home_dir_t, user_home_t, user_tmp_t, etc. So it seems reasonable that they are related to the known-hosts file mentioned above.
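One way to see at a glance which target types a batch of denials touches is to group the records by source type, target type, class and permission. The following is a minimal sketch in Python, not a tool that was used in this bug; the function names are invented for illustration, and the three sample records are copied from the ausearch output above.

```python
import re
from collections import Counter

# Pulls the denied permissions, both contexts and the object class out of
# one AVC record. Each record is assumed to sit on a single line.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}.*?"
    r"scontext=(?P<scontext>\S+)\s+tcontext=(?P<tcontext>\S+)\s+"
    r"tclass=(?P<tclass>\S+)"
)


def context_type(context: str) -> str:
    """Return the type field (third component) of an SELinux context."""
    return context.split(":")[2]


def summarize_avc(text: str) -> Counter:
    """Count denials by (source type, target type, class, permission)."""
    counts: Counter = Counter()
    for match in AVC_RE.finditer(text):
        for perm in match.group("perms").split():
            key = (
                context_type(match.group("scontext")),
                context_type(match.group("tcontext")),
                match.group("tclass"),
                perm,
            )
            counts[key] += 1
    return counts


# Three records copied from the log above.
SAMPLE = "\n".join([
    'type=AVC msg=audit(1680813025.736:12981): avc: denied { search } for pid=3617522 comm="nbdkit" name="jjongsma" dev="dm-3" ino=9961473 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_home_dir_t:s0 tclass=dir permissive=1',
    'type=AVC msg=audit(1680813025.736:12983): avc: denied { getattr } for pid=3617522 comm="nbdkit" path="/home/jjongsma/work/libvirt/_build/src" dev="dm-3" ino=10707227 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=dir permissive=1',
    'type=AVC msg=audit(1680813025.751:12984): avc: denied { search } for pid=3617522 comm="nbdkit" name="tmp" dev="dm-3" ino=9974969 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_tmp_t:s0 tclass=dir permissive=1',
])

summary = summarize_avc(SAMPLE)
for (source, target, tclass, perm), count in sorted(summary.items()):
    print(f"{count} x {source} -> {target}:{tclass} {{ {perm} }}")
```

Grouping this way makes it easier to tell whether a new denial is a genuinely new access or a repeat of one already seen for the other nbdkit instance.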
If I move the known-hosts file out of the home directory to /tmp and configure libvirt to look in /tmp/known-hosts, then the nbdkit ssh disk runs correctly.

It would be nice to be able to specify an arbitrary path to a known_hosts file, including files located in the home directory. That is why my initial policy allowed overly-broad access to the home directory. I'm not sure what the options are. The same issue applies to specifying the path to a ssh keyfile for authentication, but my test vm did not use that option. Eventually we also want to be able to connect to an ssh-agent socket to use for ssh authentication.

This is a question for the SELinux experts ...

nbdkit is plugin based. There are plugins for, eg. local file access, web server access etc. Obviously different plugins require a vastly different set of capabilities. Some may need almost nothing, and some might have to connect to remote servers. Also the base nbdkit server requires some permissions of its own, such as listening on a socket.

Only one plugin may be loaded, and we know which one it is from the command line. Once a plugin has been loaded, it stays there until the program exits.

Can we adjust the SELinux policy applied (eg. from within nbdkit) once we know which plugin will be loaded?

Hi Nikola, do you have any time to help me push this forward? Is the information in comment 14 helpful?

> For basic testing of the nbdkit socket, I used the following command:
It would be better to use a command like nbdinfo, eg:
$ nbdinfo nbd://localhost
# sestatus
SELinux status: enabled
SELinuxfs mount: /sys/fs/selinux
SELinux root directory: /etc/selinux
Loaded policy name: targeted
Current mode: enforcing
Mode from config file: enforcing
Policy MLS status: enabled
Policy deny_unknown status: allowed
Memory protection checking: actual (secure)
Max kernel policy version: 33
# semodule -lfull | grep nbdkit
400 nbdkit pp
# nbdinfo nbd://localhost
protocol: newstyle-fixed without TLS, using structured packets
export="":
    export-size: 67108864 (64M)
    content: data
    uri: nbd://localhost:10809/
    contexts:
        base:allocation
    is_rotational: false
    is_read_only: false
    can_cache: true
    can_df: true
    can_fast_zero: false
    can_flush: true
    can_fua: true
    can_multi_conn: true
    can_trim: true
    can_zero: true
# ps -efZ | grep nbd
system_u:system_r:nbdkit_t:s0 root 4862 1 0 18:34 ? 00:00:00 /usr/sbin/nbdkit file /tmp/disk.img
unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 root 4912 4601 0 18:35 pts/0 00:00:00 grep --color=auto nbd
# systemctl status nbdkit.service
● nbdkit.service
Loaded: loaded (/usr/lib/systemd/system/nbdkit.service; static)
Active: active (running) since Mon 2023-04-17 18:34:06 CEST; 2min 3s ago
TriggeredBy: ● nbdkit.socket
Main PID: 4862 (nbdkit)
Tasks: 1 (limit: 7785)
Memory: 3.0M
CPU: 22ms
CGroup: /system.slice/nbdkit.service
└─4862 /usr/sbin/nbdkit file /tmp/disk.img

Apr 17 18:34:06 localhost.localdomain systemd[1]: Started nbdkit.service.
#

The /tmp/disk.img file was created using the following command:
# dd if=/dev/zero of=/tmp/disk.img bs=1M count=64

Is it necessary to format the file somehow? mkfs...

It'll serve whatever's in disk.img, be that blank or a filesystem or just some random data.

I'm kind of interested in my question in comment 15. Depending on what plugin you're using (nbdkit-file-plugin in your example) nbdkit will need to do radically different things.
For example if you replace the nbdkit command with:

nbdkit curl --filter=cow --filter=xz https://download.fedoraproject.org/pub/fedora/linux/releases/37/Cloud/x86_64/images/Fedora-Cloud-Base-37-1.7.x86_64.raw.xz

then the nbdkit instance will have to connect out to a remote server, which presumably would require more adjustments to the SELinux policy. Since we know which plugin is loaded, it would be useful to be able to adjust SELinux policy to be more or less confined.

I had to add 3 allow rules into the nbdkit policy module during my experiments with the nbdkit command shown in comment#21:

nbdkit_domtrans(init_t, nbdkit_exec_t, nbdkit_t)
corenet_tcp_bind_generic_port(nbdkit_t)
corenet_tcp_bind_generic_node(nbdkit_t)
allow nbdkit_t self:tcp_socket { bind listen accept };
init_abstract_socket_activation(nbdkit_t)
init_ioctl_stream_sockets(nbdkit_t)
init_rw_stream_sockets(nbdkit_t)
allow nbdkit_t self:udp_socket { read write setopt };
allow nbdkit_t tmp_t:dir { add_name remove_name write };
allow nbdkit_t tmp_t:file { create unlink };

I'm going to write an automated test which also tests other filters/plugins.

Hi,
yes it was helpful, thank you.

It is possible to label specific path like $HOME/tmp/known-hosts and allow nbdkit to access it.

What is inside this path? /home/jjongsma/work/libvirt/_build/src

Can you please enable full auditing for detailed logs?
1) Open the /etc/audit/rules.d/audit.rules file in an editor.
2) Remove the following line if it exists:
-a task,never
3) Add the following line to the end of the file:
-w /etc/shadow -p w
4) Restart the audit daemon:
# service auditd restart
5) Re-run your scenario.
6) Collect AVC denials:
# ausearch -i -m avc,user_avc,selinux_err,user_selinux_err -ts today

(In reply to Nikola Knazekova from comment #25)
> Hi,
> yes it was helpful, thank you.
>
> It is possible to label specific path like $HOME/tmp/known-hosts and allow
> nbdkit to access it.

Sure, but what would we label it?
Don't we need to include that in the policy?

> What is inside this path? /home/jjongsma/work/libvirt/_build/src

This is simply the build directory for my checkout of the libvirt git repository. As part of bug #2016527, I'm adding nbdkit support to libvirt. So after I make changes to libvirt, I rebuild it and then run it from the working directory to do testing.

(In reply to Milos Malik from comment #24)
> Based on what I understood, running the nbdkit program from systemd is not
> very common use-case.
>
> The most frequent use-case is running nbdkit as a subprocess of another
> program:
> * https://libguestfs.org/nbdkit-captive.1.html
>
> From SELinux point of view: unconfined root executes the nbdkit program and
> the nbdkit process then executes another program.
>
> General question: if the other program is confined by SELinux, should it
> transition from the nbdkit_t type to its defined type?

AIUI for this specific bug -- creating a policy for nbdkit when launched from virtqemud -- the libvirt daemon (virtqemud) is running under some confined label, and it wants to run nbdkit. We want to confine nbdkit under a different label because it will be doing separate things from what libvirt can do, which should enhance security. eg. You don't want nbdkit to be able to start qemu processes, and you don't want libvirt to be able to connect to remote SSH servers.

(In reply to Milos Malik from comment #28)
> From a SELinux QE point of view, it's very problematic to create a modular
> policy for the nbdkit program which would be specific for plugins/filters.
>
> There is about 23 plugins and 39 filters in various nbdkit* packages.
>
> Both the filters and the plugins are dynamically linked .so files.
>
> Based on SELinux denials which I encountered, the nbdkit process does NOT
> execute other programs (which would enable an easy change of SELinux context
> on the newly executed process), but the nbdkit process loads the .so files
> and calls functions offered by them.

Yes it's true that nbdkit plugins are *.so files loaded into the same process as nbdkit using dlopen. If the only way to transition to a different SELinux context is to run another program then that answers my question.

Incidentally, some *plugins* definitely do execute other programs, eg.
https://libguestfs.org/nbdkit-linuxdisk-plugin.1.html runs mkfs
https://libguestfs.org/nbdkit-sh-plugin.3.html runs anything
However those plugins are not going to be used by libvirt.

(In reply to Richard W.M. Jones from comment #15)
> This is a question for the SELinux experts ...

snip

> Only one plugin may be loaded, and we know which one it is from the command
> line.
> Once a plugin has been loaded, it stays there until the program exits.
>
> Can we adjust the SELinux policy applied (eg. from within nbdkit) once we
> know
> which plugin will be loaded?

Yes & no.

The SELinux context that a process runs under is determined at time of exec. The default behaviour is that SELinux policy declares an automatic transition rule, eg when a process foo_t execs() a binary with file label bar_exec_t, the process will get 'bar_t' as its runtime context.

It is possible to override this behaviour by calling setexeccon(other_bar_t). The policy still has to be written to permit a transition from foo_t to other_bar_t. We're just skipping the file label based lookup.

IOW, if nbdkit wants to self-adjust its policy it would have to see which plugin it has configured, call setexeccon() and then re-execve() itself to trigger the transition.
So we would end up with two execs + transitions:

libvirtd (virtd_t) ---exec(/usr/bin/nbdkit - nbdkit_exec_t)---> nbdkit (nbdkit_t)
nbdkit (nbdkit_t) ---setexeccon(nbdkit_plugin_curl_t)+exec(/usr/bin/nbdkit)---> nbdkit (nbdkit_plugin_curl_t)

The first is an automatic transition based on file label, the second is a manual transition nbdkit decides to do based on the plugin chosen.

The alternative is to not try to confine nbdkit by default at all and instead just let libvirtd say exactly what to run so we only have 1 exec/transition:

libvirtd (virtd_t) ---setexeccon(nbdkit_plugin_curl_t)+exec(/usr/bin/nbdkit)---> nbdkit (nbdkit_plugin_curl_t)

There are some potential complications.

For curl/ssh I think all the file based resources that need to be accessed are common to every invocation and a finite predictable set. IOW, we can write the plugin policy to allow access and just "do the right thing".

For other plugins though, the files to be accessed are essentially arbitrary paths. eg the 'file' plugin can serve any file at all. Either we write the "file" plugin policy to allow it to serve pretty much any file, or we require the files to be given a label ahead of time (eg an nbdkit_image_t akin to virt_image_t). If we're doing the latter though, we don't want 2 parallel invocations of nbdkit to access each other's disk, so would need to use MCS like we do with svirt. eg foo.qcow2 labelled nbdkit_image_t:c123,c321 and bar.qcow2 labelled nbdkit_image_t:c456,c654. At that point the call to setexeccon() is mandatory, as it needs to specify the MCS to use eg nbdkit_plugin_file_t:c123,c321 to give access to foo.qcow2.

The next question is who sets the MCS labels on the files. Would nbdkit set the MCS label on foo.qcow2 before re-exec'ing itself, or should the thing that calls nbdkit set it beforehand?
In the latter case, there's no point in nbdkit re-exec'ing itself as the caller is better off just using setexeccon(nbdkit_plugin_file_t:c123,c321).

There are a lot of nbdkit plugins and I doubt we're going to write policy for all of them straightaway. If we make nbdkit always start off with a nbdkit_t type and then re-exec itself, then we need to have a fallback "nbdkit_plugin_unconfined_t" that allows pretty much everything. IOW, for curl/ssh it would use the locked down nbdkit_plugin_curl/ssh_t, but for other plugins it would use the unconfined variant.

A more pragmatic (simpler/less disruptive) option would be to leave the entire nbdkit process unconfined indefinitely, and just expect libvirtd to use setexeccon() to transition directly to a nbdkit_plugin_curl_t / plugin_ssh_t type. IOW, only libvirt's usage would benefit from SELinux and other nbdkit users are unaffected.

Thanks, very interesting.

Re-execing within nbdkit is already done for one plugin (not for SELinux reasons, but because it is necessary for adjusting LD_LIBRARY_PATH - https://gitlab.com/nbdkit/nbdkit/-/tree/master/plugins/vddk). It has a number of subtle pitfalls and I'd prefer to avoid having to do it, especially for every invocation.

On the other hand having libvirtd call setexeccon sounds a much better idea. If we had a policy for a few well-known plugins, then we could even reuse the same policies and labels with virt-v2v, so this wouldn't be a one-off thing for libvirt. Virt-v2v also uses ssh and curl plugins and could benefit the same way as libvirt.

We could make the policy names discoverable, eg it would be easy to add an selinux_label=... field to the output of:

$ nbdkit curl --dump-plugin
path=/usr/lib64/nbdkit/plugins/nbdkit-curl-plugin.so
name=curl
version=1.34.0
api_version=1
[etc]

I agree that labelling for nbdkit-file-plugin sounds very complicated, so we could ignore that one for now.
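The "locked down type for known plugins, unconfined fallback for the rest" idea discussed above can be sketched as a simple lookup. This is only an illustration in Python under stated assumptions: the type names are the ones proposed in this thread and are not known to exist in any shipped policy, the `system_u:system_r` prefix is borrowed from the contexts seen in the logs, and the function names are hypothetical. A real caller would pass the resulting string to setexeccon() before exec.

```python
from typing import Optional

# Hypothetical mapping; these type names are proposals from this thread,
# not types that are known to exist in selinux-policy.
PLUGIN_TYPES = {
    "curl": "nbdkit_plugin_curl_t",
    "ssh": "nbdkit_plugin_ssh_t",
}

# Proposed catch-all for plugins that have no dedicated policy yet.
FALLBACK_TYPE = "nbdkit_plugin_unconfined_t"


def exec_type_for_plugin(plugin: str) -> str:
    """Return the dedicated type for a plugin, or the fallback type."""
    return PLUGIN_TYPES.get(plugin, FALLBACK_TYPE)


def exec_context(plugin: str, mcs: Optional[str] = None) -> str:
    """Build the context string a caller would hand to setexeccon().

    `mcs` is an optional category pair such as "c123,c321". The user and
    role fields are assumed here, following the contexts in the logs.
    """
    level = "s0" if mcs is None else f"s0:{mcs}"
    return f"system_u:system_r:{exec_type_for_plugin(plugin)}:{level}"


print(exec_context("curl"))
print(exec_context("ssh", "c123,c321"))
print(exec_context("vddk"))
```

Keeping the mapping outside the caller's source, for example in a file under the SELinux contexts directory or in the `--dump-plugin` output, would avoid hardcoding type names, which is the concern raised later in the thread.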
There is a slight complication that filters may also access resources, but I suspect we can ignore that by having a policy that allows some common operations like writing to /tmp and /var/tmp, in the name of simplicity.

(In reply to Daniel Berrangé from comment #30)
> The alternative is to not try to confine nbdkit by default at all and
> instead just let libvirtd say exactly what to run so we only have 1
> exec/transition:
>
> libvirtd (virtd_t)
> ---setexeccon(nbdkit_plugin_curl_t)+exec(/usr/bin/nbdkit)---> nbdkit
> (nbdkit_plugin_curl_t)

The thing I'm struggling with (as a relative selinux beginner) is where libvirt would get the appropriate value for the context that we need to set here? I don't think we would want to hardcode a literal 'nbdkit_plugin_curl_t' into libvirt source code, would we? That seems a bit fragile since the libvirt source code would need to be updated if the selinux policy was ever changed. In the case of qemu, we don't hardcode 'svirt_t'. We get the appropriate process context (which happens to be svirt_t) by parsing the file returned by selinux_virtual_domain_context_path() and using that label for the qemu process. But what would we do for nbdkit? If we used the automatic transition based on the file label (nbdkit_t), that's fairly easy to determine by looking it up from the file (just like libvirt did for passt). But if we start adding non-automatic transitions like nbdkit_plugin_curl_t, how would we manage those?

(In reply to Jonathon Jongsma from comment #32)
> The thing I'm struggling with (as a relative selinux beginner) is where
> libvirt would get the appropriate value for the context that we need to set
> here? I don't think we would want to hardcode a literal
> 'nbdkit_plugin_curl_t' into libvirt source code, would we? That seems a bit
> fragile since the libvirt source code would need to be updated if the
> selinux policy was ever changed. In the case of qemu, we don't hardcode
> 'svirt_t'. We get the appropriate process context (which happens to be
> svirt_t) by parsing the file returned by
> selinux_virtual_domain_context_path() and using that label for the qemu
> process. But what would we do for nbdkit?

Just use selinux_contexts_path() and concatenate "nbdkit_plugins" and that file can have something like

$ cat nbdkit_plugins
curl:nbdkit_plugin_curl_t
ssh:nbdkit_plugin_ssh_t
....

I also suggested (comment 31) that we could get:

$ nbdkit <plugin> --dump-plugin

to print the right context (or nothing if there is no support for the plugin). That would allow us to adjust nbdkit dynamically as new SELinux support is added without needing to change libvirt. However I don't know if this is better than the method Daniel has suggested.

One problem that I'm currently running into with your draft policy is the following. Because libvirt uses MCS to isolate different guests from each other, we need to set the categories properly when spawning these helper processes. And so for nbdkit, I'm following what is done for the passt utility, which is the following:

When libvirt prepares to launch the child nbdkit process, it uses libselinux APIs to calculate what label we should use for the child process. So, in the function mentioned in [1], we have the following rough steps:

getcon(&currentCon) // should return "virtd_t"
...
getfilecon("/usr/sbin/nbdkit", &binaryCon) // should return "nbdkit_exec_t"
...
security_compute_create(currentCon, binaryCon, string_to_security_class("process"), &naturalLabel) // should return "nbdkit_t"

and then we would append the appropriate MCS range to that label and then spawn the process with that label. When I tested with my draft policy, it succeeded and I ended up with a label such as "nbdkit_t:s0:c228,c724". But when I test this with your policy, the security_compute_create() call fails. I'm not sure why it is failing, and of course that failure doesn't produce any AVC logs.

[1] https://gitlab.com/libvirt/libvirt/-/blob/0324adb647885932efc97eefcfe08f6a8db60ae1/src/security/security_selinux.c#L565 - see function virSecuritySELinuxContextSetFromFile()

Nikola, any ideas about comment #36?

Hi Jonathon, sorry for the late response, I have to discuss this issue with colleagues. For now I updated the copr build (38.13-1.fc39.601), where MCS should be fixed. Can you please test it? Thank you

I still seem to be getting the same errors mentioned in comment 36.

    $ rpm -qa |grep selinux-policy
    selinux-policy-targeted-38.13-1.fc37.601.noarch
    selinux-policy-devel-38.13-1.fc37.601.noarch
    selinux-policy-38.13-1.fc37.601.noarch

I get the following error from libvirt, which is printed because security_compute_create() fails:

    virSecuritySELinuxContextSetFromFile:591 : unable create new SELinux label based on label 'system_u:system_r:svirt_t:s0:c227,c659' and file '/usr/sbin/nbdkit': Permission denied

Jonathon, can you check if there are any selinux errors?

    # ausearch -i -m selinux_err

Oh, I actually do see something here:

    type=PROCTITLE msg=audit(05/10/2023 16:41:09.113:2225) : proctitle=libvirtd --verbose
    type=SYSCALL msg=audit(05/10/2023 16:41:09.113:2225) : arch=x86_64 syscall=write success=no exit=EACCES(Permission denied) a0=0x2a a1=0x7f41a4032f90 a2=0x4d a3=0x7f41bcff9c47 items=0 ppid=1300984 pid=1301044 auid=jjongsma uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=pts1 ses=3 comm=rpc-libvirtd exe=/home/jjongsma/work/libvirt/_build/src/libvirtd subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 key=(null)
    type=SELINUX_ERR msg=audit(05/10/2023 16:41:09.113:2225) : op=security_compute_sid invalid_context=system_u:system_r:nbdkit_t:s0-s0:c0.c1023 scontext=system_u:system_r:virtd_t:s0-s0:c0.c1023 tcontext=system_u:object_r:nbdkit_exec_t:s0 tclass=process

Zdenko, Milos, can you please look at this bug? I think nbdkit_t is not assigned to the system_r role.

Any update on this?
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there. Due to differences in account names between systems, some fields were not replicated. Be sure to add yourself to the Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer. You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like: "Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.