This bug has been migrated to another issue tracking site. It has been closed here and may no longer be monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at Red Hat Issue Tracker.
Bug 2182505 - Create a selinux policy for nbdkit
Summary: Create a selinux policy for nbdkit
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: selinux-policy
Version: 9.2
Hardware: Unspecified
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Nikola Knazekova
QA Contact: Milos Malik
URL:
Whiteboard:
Depends On:
Blocks: 2016527 2176939
Reported: 2023-03-28 21:55 UTC by Jonathon Jongsma
Modified: 2023-10-03 08:13 UTC (History)
CC List: 15 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-09-19 16:59:05 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
nbdkit.fc (69 bytes, text/plain), 2023-04-05 18:07 UTC, Jonathon Jongsma
nbdkit.if (1.72 KB, text/plain), 2023-04-05 18:09 UTC, Jonathon Jongsma
nbdkit.te (2.21 KB, text/plain), 2023-04-05 18:09 UTC, Jonathon Jongsma


Links
System                 ID               Private  Priority  Status    Summary  Last Updated
Red Hat Issue Tracker  RHEL-5174        0        None      Migrated  None     2023-10-03 08:13:11 UTC
Red Hat Issue Tracker  RHELPLAN-153350  0        None      None      None     2023-03-28 21:56:56 UTC

Internal Links: 2182024

Description Jonathon Jongsma 2023-03-28 21:55:43 UTC
In adding nbdkit support to libvirt, I've run into several issues. First of all, libvirt is unable to spawn nbdkit right now due to virt selinux policies. This was filed as Bug 2176939.

Since the selinux context of nbdkit is currently system_u:object_r:bin_t:s0, libvirt is not permitted to spawn nbdkit. In order to craft a policy that would allow libvirt to spawn nbdkit, we'll presumably need to assign it a context that could be distinguished from other binaries (perhaps introducing something like nbdkit_exec_t/nbdkit_t).

But libvirt will also want to isolate nbdkit from other guests and the rest of the filesystem while allowing it access to things like ssh-agent socket, etc.

So this bug is about creating a policy for nbdkit. Bug 2176939 will be about updating the virt policy to interact with the nbdkit policy.

In discussing this with Daniel Berrange, he suggested that we might basically need two different policies for nbdkit since a policy that is suitable for libvirt's needs will be too strict for other uses of nbdkit. He pointed to qemu as an example of a binary that has a slightly analogous scenario with the different svirt_t and svirt_tcg_t policies for KVM vs TCG emulation.

See also Bug 2172268 for a very similar situation with passt, which maintains its own selinux policy.

Comment 1 Richard W.M. Jones 2023-04-04 14:54:31 UTC
I have no idea about this.  But I will say that we want to be able to run nbdkit
directly from (a) the command line and (b) virt-v2v.

Are there other programs (apart from qemu) that libvirt spawns that have
similar problems?

Comment 2 Jonathon Jongsma 2023-04-04 20:05:48 UTC
I'm not sure that there's any exact analog, but passt (mentioned above) seems similar in some ways. There are also utilities like dnsmasq and dbus that are executed by libvirt that might have some similarities. 

As I said, my knowledge of selinux is pretty shallow (though very slowly getting deeper), so please excuse any misunderstandings and imprecise terminology in this summary. 

Currently, the nbdkit binary just gets the default label for files within the /usr/sbin/ directory: system_u:object_r:bin_t:s0. In order for libvirt to be able to execute nbdkit, libvirt would thus need to be able to execute any file with this bin_t label. But the virt selinux policy does not allow executing arbitrary binaries for good reason. So if we want libvirt to be able to execute nbdkit, the minimum thing that we need is to assign a non-default file context to the nbdkit binary so that we can write a policy that will allow libvirt to execute it without allowing other binaries. The typical way to do that is to introduce a file context like nbdkit_exec_t for the binary (see policies for the executables mentioned above). We can then write policies which will enable it to transition to a context like nbdkit_t when launched from a certain selinux context (e.g. from libvirt running as svirt_t) but not when launched from a different context (e.g. from the console running as unconfined_t). And once we can get the nbdkit process to transition to an nbdkit-specific selinux context, we can write selinux policies for this context which govern what it can or can't do. That is obviously the most difficult part.
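
The labelling and transition scheme described above can be pictured with a toy lookup table. This is only a sketch of the logic, not how the kernel or the policy compiler implements it. The type names nbdkit_exec_t and nbdkit_t are the ones proposed in this bug, virtd_t is assumed to be the domain of the system libvirt daemon, and the function name is made up for illustration.

```python
# Toy model of SELinux automatic domain transitions (illustration only).
# Key:   (domain of the calling process, type of the executed file)
# Value: domain that the new process ends up in
TYPE_TRANSITIONS = {
    ("virtd_t", "nbdkit_exec_t"): "nbdkit_t",
}


def domain_after_exec(caller_domain: str, file_type: str) -> str:
    """Return the domain of the process after exec().

    If no transition rule matches, the process keeps the caller's domain.
    The model assumes that the policy lets the caller execute the file at all.
    """
    return TYPE_TRANSITIONS.get((caller_domain, file_type), caller_domain)


# Launched by libvirt: transitions into the nbdkit-specific domain.
print(domain_after_exec("virtd_t", "nbdkit_exec_t"))
# Launched from an unconfined shell: no rule matches, so the domain is kept.
print(domain_after_exec("unconfined_t", "nbdkit_exec_t"))
```

The point of giving the binary its own file type is that the table can then name nbdkit specifically, instead of granting the caller the right to execute everything labelled bin_t.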

Comment 3 Richard W.M. Jones 2023-04-05 08:14:06 UTC
It all sounds pretty reasonable.  I had a look at other binaries in /usr/sbin
and many of them have system_u:object_r:<binaryname>_exec_t labels:

$ ll -Z /usr/sbin/ | grep -v system_u:object_r:bin_t
total 102164
-rwxr-xr-x. 1 root root system_u:object_r:abrt_exec_t:s0               41640 Jul 20  2022 abrtd
-rwxr-xr-x. 1 root root system_u:object_r:abrt_exec_t:s0              129232 Jul 20  2022 abrt-dbus
-rwxr-xr-x. 1 root root system_u:object_r:abrt_exec_t:s0                1349 Jul 20  2022 abrt-harvest-pstoreoops
-rwxr-xr-x. 1 root root system_u:object_r:abrt_exec_t:s0                8798 Jul 20  2022 abrt-harvest-vmcore
-rwxr-xr-x. 1 root root system_u:object_r:acct_exec_t:s0               16392 Jul 22  2022 accton
-rwxr-xr-x. 1 root root system_u:object_r:getty_exec_t:s0              58832 Aug  4  2022 agetty
-rwxr-xr-x. 1 root root system_u:object_r:alsa_exec_t:s0              133808 Jul 20  2022 alsactl
-rwxr-xr-x. 1 root root system_u:object_r:install_exec_t:s0            22378 Sep 13  2022 anaconda
-rwxr-xr-x. 1 root root system_u:object_r:anacron_exec_t:s0            41664 Jul 21  2022 anacron
-rwxr-xr-x. 1 root root system_u:object_r:crond_exec_t:s0              32760 Jul 20  2022 atd
-rwxr-xr-x. 1 root root system_u:object_r:auditctl_exec_t:s0           49496 Aug 29  2022 auditctl
-rwxr-xr-x. 1 root root system_u:object_r:auditd_exec_t:s0            145832 Aug 29  2022 auditd
-rwxr-xr-x. 1 root root system_u:object_r:avahi_exec_t:s0             153672 Aug  5  2022 avahi-daemon
-rwxr-xr-x. 1 root root system_u:object_r:dmidecode_exec_t:s0          29488 Jul 21  2022 biosdecode
[etc]

So I suppose we should do something like that as a first step.

As to actually *how* that is done (through selinux-policy?) I don't know.
I looked at the spec for dmidecode which is one which seems to use these
labels.  The spec itself does not label anything, and it happens through
file_contexts in the targeted policy:

https://src.fedoraproject.org/rpms/dmidecode/blob/rawhide/f/dmidecode.spec

$ grep dmidecode /etc/selinux/targeted/contexts/files/file_contexts
/usr/sbin/dmidecode	--	system_u:object_r:dmidecode_exec_t:s0
/usr/sbin/ownership	--	system_u:object_r:dmidecode_exec_t:s0
/usr/sbin/vpddecode	--	system_u:object_r:dmidecode_exec_t:s0
/usr/sbin/biosdecode	--	system_u:object_r:dmidecode_exec_t:s0

I couldn't find any bug that was filed to make this change.

Anyway since dmidecode acts like a regular program (eg you can still run it
from the command line even though it has this label) I suppose there is no
problem making a similar change to nbdkit.
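
For reference, the file_contexts entries shown above can be split into their fields with a few lines of Python. This is a minimal sketch that assumes the usual "path regex, optional file type flag, context" layout of those lines; the class and function names are invented for this example, and the sample text is copied from the grep output in this comment.

```python
from typing import NamedTuple, Optional


class FileContextEntry(NamedTuple):
    path_regex: str
    file_type: Optional[str]  # e.g. "--" for regular files, None if omitted
    context: str


def parse_file_context_line(line: str) -> Optional[FileContextEntry]:
    """Parse one file_contexts line; return None for blanks and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = line.split()
    if len(fields) == 2:
        return FileContextEntry(fields[0], None, fields[1])
    if len(fields) == 3:
        return FileContextEntry(fields[0], fields[1], fields[2])
    raise ValueError(f"unexpected file_contexts line: {line!r}")


def context_type(context: str) -> str:
    """Extract the type from user:role:type:level.

    The level can contain ':' itself (e.g. s0:c1,c2), so limit the split.
    """
    return context.split(":", 3)[2]


SAMPLE = """\
/usr/sbin/dmidecode   --   system_u:object_r:dmidecode_exec_t:s0
/usr/sbin/ownership   --   system_u:object_r:dmidecode_exec_t:s0
/usr/sbin/vpddecode   --   system_u:object_r:dmidecode_exec_t:s0
/usr/sbin/biosdecode  --   system_u:object_r:dmidecode_exec_t:s0
"""

entries = [e for e in map(parse_file_context_line, SAMPLE.splitlines()) if e]
for entry in entries:
    print(entry.path_regex, context_type(entry.context))
```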

Comment 4 Laszlo Ersek 2023-04-05 09:21:00 UTC
The upstream selinux-policy repository has several commits on dmidecode. They are not well explained however :(

The commit that introduces a policy for dmidecode seems to be 20e306e2de6e ("add dmidecode", 2005-09-27). dmidecode parses SMBIOS information from physical RAM ("/dev/mem"), hence a particular policy for it I guess. The rest of the dmidecode-related policy commits seem to be about what other programs are allowed to run dmidecode for this purpose. Three recent(-ish) examples:

commit c95cf07cea17f548daee8f47b9580120b63dedc2
Author: Zdenek Pytela <zpytela>
Date:   Thu Aug 11 16:47:57 2022 +0200

    Allow sysadm_t read raw memory devices
    
    In particular, this permission is requested when subscription-manager
    wants to read SMBIOS/DMI details of the system because the implementation
    in the python-dmidecode library reads /dev/mem directly.
    
    Addresses the following AVC denial:
    [...]
    Resolves: rhbz#2101341

commit 9a2846b5ec06620f5246bf24fa687a399b195cc1
Author: Zdenek Pytela <zpytela>
Date:   Tue Feb 9 10:41:13 2021 +0100

    Add integrity lockdown permission into dev_read_raw_memory()
    
    Since adding the lockdown class to selinux-policy, the integrity
    permission starts to be checked on each access of (among others)
    /dev/mem, /dev/kmem, and /dev/port.
    The permission to read raw memory device is allowed in the
    dev_read_raw_memory() interface, so the the integrity lockdown
    permission was added to this interface.
    Example of services requiring this access is biosdecode running
    in the dmidecode_t domain.
    
    There are other interfaces to write, rx, wx raw memory, but they
    either call dev_read_raw_memory() directly or are called from adjacent
    lines in the policy.
    
    The dev_raw_memory_reader() and dev_raw_memory_write() do not seem
    to have effect in this matter.
    
    Resolves: rhbz#1926696

commit 432128e4ff3d060069f3cd9f02106839c064317a
Author: Lukas Vrabec <lvrabec>
Date:   Wed Oct 25 13:25:55 2017 +0200

    Allow dmidecode to read rhsmcert_log_t files

Comment 6 Richard W.M. Jones 2023-04-05 11:35:18 UTC
I'm reassigning this bug to selinux-policy since I do not have any idea
what to do about it, neither what an ideal policy should look like, nor
how to implement that policy.  Hopefully someone on the SELinux team
can give us advice.

Comment 7 Jonathon Jongsma 2023-04-05 15:35:41 UTC
So, I have the very beginnings of a selinux policy started here that appears to work for my extremely limited testing. It runs as unconfined_t when started from the commandline and runs as nbdkit_t when started from a virtd_t process (e.g. system libvirt). I'm sure there are a lot of things missing and a lot of things that don't adhere to best practices since this is my first time dealing with selinux policies. 

One thing that I have not been able to figure out is how to get this to work when I am running a libvirt process from my development environment. During development, I often run libvirtd from the shell to test things out and it therefore runs as unconfined_t. I still want libvirtd to run nbdkit in the nbdkit_t context in this scenario. If it runs it as unconfined_t, it will fail due to various virt selinux policies preventing access to things. 

Anyway, I'll attach what I have so far in case it is helpful at all.

Comment 8 Jonathon Jongsma 2023-04-05 18:07:54 UTC
Created attachment 1955945 [details]
nbdkit.fc

may need to also include plugins here eventually?

Comment 9 Jonathon Jongsma 2023-04-05 18:09:06 UTC
Created attachment 1955946 [details]
nbdkit.if

Comment 10 Jonathon Jongsma 2023-04-05 18:09:52 UTC
Created attachment 1955947 [details]
nbdkit.te

Comment 11 Jonathon Jongsma 2023-04-05 18:13:54 UTC
It would be greatly appreciated if somebody from the selinux-policy team could help move this forward as it blocks the libvirt nbdkit support.

Comment 12 Nikola Knazekova 2023-04-06 13:34:07 UTC
Hi Jonathon, 

I appreciate you taking the time to create a nbdkit module.

Could you please attach the AVC denials? These macros are too permissive:
   corenet_tcp_connect_all_ports(nbdkit_t)
   init_search_pid_dirs(nbdkit_t)
   userdom_read_user_home_content_files(nbdkit_t)
   userdom_list_user_home_dirs(nbdkit_t)

Instead of running it from the console as unconfined_t, use the systemctl utility. Further details can be found at: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/using_selinux/index#creating-and-enforcing-an-selinux-policy-for-a-custom-application_writing-a-custom-selinux-policy

Thank you,
Nikola

Comment 13 Jonathon Jongsma 2023-04-06 14:24:42 UTC
(In reply to Nikola Knazekova from comment #12)
> Could you please attach AVC denials? because these macros are too
> benevolent: 
>    corenet_tcp_connect_all_ports(nbdkit_t)
>    init_search_pid_dirs(nbdkit_t)
>    userdom_read_user_home_content_files(nbdkit_t)
>    userdom_list_user_home_dirs(nbdkit_t)

Yeah, I knew a lot of those were going to be far too broad, but it allowed me to move forward ;)

Regarding the AVC denials: would you like me to remove my nbdkit module and set selinux to permissive mode and then run libvirt and then collect the AVC denials?

> Instead of running from the console as unconfined_t, the systemctl utility
> is used. Further details can be found at:
> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/
> html-single/using_selinux/index#creating-and-enforcing-an-selinux-policy-for-
> a-custom-application_writing-a-custom-selinux-policy

Thanks, that looks useful for some situations. But when developing libvirt, I often need to run it under gdb or enable debugging, or build it in a different build directory using a different compiler, etc. Using a systemctl unit file to launch libvirt limits flexibility significantly and makes development more cumbersome. And currently libvirt generally works when running from the terminal with selinux. The qemu process is executed with the right selinux context, the disk images are labeled correctly, etc. I would like this to continue to work after adding nbdkit support.

Comment 14 Jonathon Jongsma 2023-04-06 22:07:21 UTC
[Just for background, my test vm tries to spawn 2 instances of nbdkit to serve as storage for the vm. One provides access to a disk over ssh, and the other via https.]

What I actually did was remove the 4 macros that you mentioned as being too broad and re-run my simple test case with selinux in permissive mode. After attempting to start my test vm, I got the following AVC denials:

----
time->Thu Apr  6 14:55:43 2023
type=AVC msg=audit(1680810943.035:12434): avc:  denied  { name_connect } for  pid=3609014 comm="nbdkit" dest=22 scontext=system_u:system_r:nbdkit_t:s0:c224,c343 tcontext=system_u:object_r:ssh_port_t:s0 tclass=tcp_socket permissive=1
----
time->Thu Apr  6 14:55:43 2023
type=AVC msg=audit(1680810943.224:12457): avc:  denied  { name_connect } for  pid=3609074 comm="nbdkit" dest=8888 scontext=system_u:system_r:nbdkit_t:s0:c224,c343 tcontext=system_u:object_r:unreserved_port_t:s0 tclass=tcp_socket permissive=1
----
time->Thu Apr  6 14:55:43 2023
type=AVC msg=audit(1680810943.301:12473): avc:  denied  { name_connect } for  pid=3609014 comm="nbdkit" dest=22 scontext=system_u:system_r:nbdkit_t:s0:c224,c343 tcontext=system_u:object_r:ssh_port_t:s0 tclass=tcp_socket permissive=1
----
time->Thu Apr  6 14:55:43 2023
type=AVC msg=audit(1680810943.461:12489): avc:  denied  { name_connect } for  pid=3609074 comm="nbdkit" dest=8888 scontext=system_u:system_r:nbdkit_t:s0:c224,c343 tcontext=system_u:object_r:unreserved_port_t:s0 tclass=tcp_socket permissive=1


So I replaced the corenet_tcp_connect_all_ports(nbdkit_t) with the following:
  corenet_tcp_connect_http_port(nbdkit_t)
  corenet_tcp_connect_ssh_port(nbdkit_t)
  corenet_tcp_connect_tftp_port(nbdkit_t)

Maybe that's still too broad, I'm not sure. But that still does not allow me to connect to my little local test http server that I spun up at localhost:8888. So I changed my vm configuration to point at an iso at a standard http port for now.

After installing this new policy, I no longer get any AVC denials, but the guest doesn't work properly. When I switch back to enforcing mode, nbdkit fails to start despite the lack of AVC denials, and gives me an error message about not being able to validate the remote ssh host. This is presumably because the ssh-backed disk passes the parameter "known-hosts=$HOME/tmp/known-hosts" to nbdkit to allow it to validate the remote ssh host. Selinux must be denying nbdkit the ability to read this file, so I disable dontaudits with `semodule -DB`, switch back to permissive mode and re-launch my vm. Now I can see the AVC denials that are causing the failure:

time->Thu Apr  6 15:30:25 2023
type=AVC msg=audit(1680813025.736:12981): avc:  denied  { search } for  pid=3617522 comm="nbdkit" name="jjongsma" dev="dm-3" ino=9961473 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_home_dir_t:s0 tclass=dir permissive=1
----
time->Thu Apr  6 15:30:25 2023
type=AVC msg=audit(1680813025.736:12982): avc:  denied  { search } for  pid=3617522 comm="nbdkit" name="work" dev="dm-3" ino=9966373 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=dir permissive=1
----
time->Thu Apr  6 15:30:25 2023
type=AVC msg=audit(1680813025.736:12983): avc:  denied  { getattr } for  pid=3617522 comm="nbdkit" path="/home/jjongsma/work/libvirt/_build/src" dev="dm-3" ino=10707227 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=dir permissive=1
----
time->Thu Apr  6 15:30:25 2023
type=AVC msg=audit(1680813025.751:12984): avc:  denied  { search } for  pid=3617522 comm="nbdkit" name="tmp" dev="dm-3" ino=9974969 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_tmp_t:s0 tclass=dir permissive=1
----
time->Thu Apr  6 15:30:25 2023
type=AVC msg=audit(1680813025.756:12988): avc:  denied  { search } for  pid=3617522 comm="nbdkit" name="tmp" dev="dm-3" ino=9974969 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_tmp_t:s0 tclass=dir permissive=1
----
time->Thu Apr  6 15:30:25 2023
type=AVC msg=audit(1680813025.893:13014): avc:  denied  { search } for  pid=3617579 comm="nbdkit" name="jjongsma" dev="dm-3" ino=9961473 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_home_dir_t:s0 tclass=dir permissive=1
----
time->Thu Apr  6 15:30:25 2023
type=AVC msg=audit(1680813025.893:13015): avc:  denied  { search } for  pid=3617579 comm="nbdkit" name="work" dev="dm-3" ino=9966373 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=dir permissive=1
----
time->Thu Apr  6 15:30:25 2023
type=AVC msg=audit(1680813025.893:13016): avc:  denied  { getattr } for  pid=3617579 comm="nbdkit" path="/home/jjongsma/work/libvirt/_build/src" dev="dm-3" ino=10707227 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=dir permissive=1
----
time->Thu Apr  6 15:30:26 2023
type=AVC msg=audit(1680813026.348:13052): avc:  denied  { search } for  pid=3617522 comm="nbdkit" name="jjongsma" dev="dm-3" ino=9961473 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_home_dir_t:s0 tclass=dir permissive=1
----
time->Thu Apr  6 15:30:26 2023
type=AVC msg=audit(1680813026.348:13053): avc:  denied  { search } for  pid=3617522 comm="nbdkit" name="tmp" dev="dm-3" ino=9974969 scontext=system_u:system_r:nbdkit_t:s0:c237,c597 tcontext=unconfined_u:object_r:user_tmp_t:s0 tclass=dir permissive=1


Those are all related to user_home_dir_t, user_home_t, user_tmp_t, etc. So it seems reasonable to assume that they are related to the known-hosts file mentioned above. If I move the known-hosts file out of the home directory to /tmp and configure libvirt to look in /tmp/known-hosts, then SELinux allows the nbdkit ssh disk to run correctly.

It would be nice to be able to specify an arbitrary path to a known_hosts file, including files located in the home directory. That is why my initial policy allowed overly-broad access to the home directory. I'm not sure what the options are. The same issue applies to specifying the path to a ssh keyfile for authentication, but my test vm did not use that option. Eventually we also want to be able to connect to an ssh-agent socket to use for ssh authentication.
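
When there are many denials like the ones above, it can help to group them by target type, object class, and permission before deciding which policy rules are needed. The following is a rough sketch, not a replacement for ausearch or audit2allow; the regular expressions are an assumption based only on the log lines quoted in this comment, and all function names are invented for the example.

```python
import re
from collections import Counter

# Matches the "avc: denied { perms } for key=value ..." part of an audit line.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}\s+for\s+(?P<fields>.*)"
)
# key=value pairs; values are either quoted strings or runs of non-spaces.
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_avc(line):
    """Return a dict of fields for an AVC denial line, or None otherwise."""
    match = AVC_RE.search(line)
    if match is None:
        return None
    record = {k: v.strip('"') for k, v in FIELD_RE.findall(match.group("fields"))}
    record["perms"] = match.group("perms").split()
    return record


def target_type(record):
    """Extract the type from the target context (user:role:type:level)."""
    return record["tcontext"].split(":", 3)[2]


def summarize(lines):
    """Count denials per (target type, object class, permission)."""
    counts = Counter()
    for line in lines:
        record = parse_avc(line)
        if record is None:
            continue  # separators such as "----" or "time->..." lines
        for perm in record["perms"]:
            counts[(target_type(record), record["tclass"], perm)] += 1
    return counts


SAMPLE = [
    "----",
    'type=AVC msg=audit(1680810943.035:12434): avc:  denied  { name_connect } '
    'for  pid=3609014 comm="nbdkit" dest=22 '
    'scontext=system_u:system_r:nbdkit_t:s0:c224,c343 '
    'tcontext=system_u:object_r:ssh_port_t:s0 tclass=tcp_socket permissive=1',
    'type=AVC msg=audit(1680813025.736:12981): avc:  denied  { search } '
    'for  pid=3617522 comm="nbdkit" name="jjongsma" dev="dm-3" ino=9961473 '
    'scontext=system_u:system_r:nbdkit_t:s0:c237,c597 '
    'tcontext=unconfined_u:object_r:user_home_dir_t:s0 tclass=dir permissive=1',
]

for key, count in sorted(summarize(SAMPLE).items()):
    print(count, *key)
```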

Comment 15 Richard W.M. Jones 2023-04-11 08:17:38 UTC
This is a question for the SELinux experts ...

nbdkit is plugin based.  There are plugins for, eg. local file access, web server access etc.
Obviously different plugins require a vastly different set of capabilities.  Like
some may need almost nothing, and some might have to connect to remote servers.
Also the base nbdkit server requires some permissions of its own, such as listening
on a socket.

Only one plugin may be loaded, and we know which one it is from the command line.
Once a plugin has been loaded, it stays there until the program exits.

Can we adjust the SELinux policy applied (eg. from within nbdkit) once we know
which plugin will be loaded?

Comment 17 Jonathon Jongsma 2023-04-17 14:45:06 UTC
Hi Nikola, do you have any time to help me push this forward? Is the information in comment 14 helpful?

Comment 19 Richard W.M. Jones 2023-04-17 16:19:50 UTC
> For basic testing of the nbdkit socket, I used the following command:

It would be better to use a command like nbdinfo, eg:

$ nbdinfo nbd://localhost

Comment 20 Milos Malik 2023-04-17 16:40:31 UTC
# sestatus 
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   enforcing
Mode from config file:          enforcing
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Memory protection checking:     actual (secure)
Max kernel policy version:      33
# semodule -lfull | grep nbdkit
400 nbdkit            pp          
# nbdinfo nbd://localhost
protocol: newstyle-fixed without TLS, using structured packets
export="":
	export-size: 67108864 (64M)
	content: data
	uri: nbd://localhost:10809/
	contexts:
		base:allocation
	is_rotational: false
	is_read_only: false
	can_cache: true
	can_df: true
	can_fast_zero: false
	can_flush: true
	can_fua: true
	can_multi_conn: true
	can_trim: true
	can_zero: true
# ps -efZ | grep nbd
system_u:system_r:nbdkit_t:s0   root        4862       1  0 18:34 ?        00:00:00 /usr/sbin/nbdkit file /tmp/disk.img
unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 root 4912 4601  0 18:35 pts/0 00:00:00 grep --color=auto nbd
# systemctl status nbdkit.service
● nbdkit.service
     Loaded: loaded (/usr/lib/systemd/system/nbdkit.service; static)
     Active: active (running) since Mon 2023-04-17 18:34:06 CEST; 2min 3s ago
TriggeredBy: ● nbdkit.socket
   Main PID: 4862 (nbdkit)
      Tasks: 1 (limit: 7785)
     Memory: 3.0M
        CPU: 22ms
     CGroup: /system.slice/nbdkit.service
             └─4862 /usr/sbin/nbdkit file /tmp/disk.img

Apr 17 18:34:06 localhost.localdomain systemd[1]: Started nbdkit.service.
#

The /tmp/disk.img file was created using the following command:

# dd if=/dev/zero of=/tmp/disk.img bs=1M count=64

Is it necessary to format the file somehow? mkfs...

Comment 21 Richard W.M. Jones 2023-04-17 17:09:20 UTC
It'll serve whatever's in disk.img, be that blank or a filesystem or just some
random data.

I'm kind of interested in my question in comment 15.

Depending on what plugin you're using (nbdkit-file-plugin in your example) nbdkit
will need to do radically different things.

For example if you replace the nbdkit command with:

  nbdkit curl --filter=cow --filter=xz https://download.fedoraproject.org/pub/fedora/linux/releases/37/Cloud/x86_64/images/Fedora-Cloud-Base-37-1.7.x86_64.raw.xz

then the nbdkit instance will have to connect out to a remote server, which
presumably would require more adjustments to the SELinux policy.  Since we
know which plugin is loaded, it would be useful to be able to adjust
SELinux policy to be more or less confined.

Comment 22 Milos Malik 2023-04-18 10:08:41 UTC
I had to add 3 allow rules into the nbdkit policy module during my experiments with the nbdkit command shown in comment#21:

nbdkit_domtrans(init_t, nbdkit_exec_t, nbdkit_t)
corenet_tcp_bind_generic_port(nbdkit_t)
corenet_tcp_bind_generic_node(nbdkit_t)
allow nbdkit_t self:tcp_socket { bind listen accept };
init_abstract_socket_activation(nbdkit_t)
init_ioctl_stream_sockets(nbdkit_t)
init_rw_stream_sockets(nbdkit_t)
allow nbdkit_t self:udp_socket { read write setopt };
allow nbdkit_t tmp_t:dir { add_name remove_name write };
allow nbdkit_t tmp_t:file { create unlink };

I'm going to write an automated test which also tests other filters/plugins.

Comment 25 Nikola Knazekova 2023-04-18 16:12:09 UTC
Hi, 
yes it was helpful, thank you.

It is possible to label specific path like $HOME/tmp/known-hosts and allow nbdkit to access it.

What is inside this path? /home/jjongsma/work/libvirt/_build/src 

Can you please enable full auditing for detailed logs?

1) Open the /etc/audit/rules.d/audit.rules file in an editor.
2) Remove the following line if it exists:
-a task,never
3) Add the following line to the end of the file:
-w /etc/shadow -p w
4) Restart the audit daemon:
  # service auditd restart
5) Re-run your scenario.
6) Collect AVC denials:
  # ausearch -i -m avc,user_avc,selinux_err,user_selinux_err -ts today

Comment 27 Jonathon Jongsma 2023-04-18 18:13:06 UTC
(In reply to Nikola Knazekova from comment #25)
> Hi, 
> yes it was helpful, thank you.
> 
> It is possible to label specific path like $HOME/tmp/known-hosts and allow
> nbdkit to access it.

Sure, but what would we label it? Don't we need to include that in the policy?
 
> What is inside this path? /home/jjongsma/work/libvirt/_build/src 

This is simply the build directory for my checkout of the libvirt git repository. As part of bug #2016527, I'm adding nbdkit support to libvirt. So after I make changes to libvirt, I rebuild it and then run it from the working directory to do testing.

Comment 29 Richard W.M. Jones 2023-04-19 10:00:07 UTC
(In reply to Milos Malik from comment #24)
> Based on what I understood, running the nbdkit program from systemd is not
> very common use-case.
> 
> The most frequent use-case is running nbdkit as a subprocess of another
> program:
>  * https://libguestfs.org/nbdkit-captive.1.html
> 
> From SELinux point of view: unconfined root executes the nbdkit program and
> the nbdkit process then executes another program.
> 
> General question: if the other program is confined by SELinux, should it
> transition from the nbdkit_t type to its defined type?

AIUI for this specific bug -- creating a policy for nbdkit when launched from
virtqemud -- the libvirt daemon (virtqemud) is running under some confined
label, and it wants to run nbdkit.  We want to confine nbdkit under a different
label because it will be doing separate things from what libvirt can do,
which should enhance security.

eg. You don't want nbdkit to be able to start qemu processes, and you don't want
libvirt to be able to connect to remote SSH servers.

(In reply to Milos Malik from comment #28)
> From a SELinux QE point of view, it's very problematic to create a modular
> policy for the nbdkit program which would be specific for plugins/filters.
> 
> There is about 23 plugins and 39 filters in various nbdkit* packages.
> 
> Both the filters and the plugins are dynamically linked .so files.
> 
> Based on SELinux denials which I encountered, the nbdkit process does NOT
> execute other programs (which would enable an easy change of SELinux context
> on the newly executed process), but the nbdkit process loads the .so files
> and calls functions offered by them.

Yes it's true that nbdkit plugins are *.so files loaded into the same
process as nbdkit using dlopen.

If the only way to transition to a different SELinux context is to
run another program then that answers my question.

Incidentally, some *plugins* definitely do execute other programs, eg.
https://libguestfs.org/nbdkit-linuxdisk-plugin.1.html runs mkfs
https://libguestfs.org/nbdkit-sh-plugin.3.html runs anything
However those plugins are not going to be used by libvirt.

Comment 30 Daniel Berrangé 2023-04-19 12:13:34 UTC
(In reply to Richard W.M. Jones from comment #15)
> This is a question for the SELinux experts ...

snip

> Only one plugin may be loaded, and we know which one it is from the command
> line.
> Once a plugin has been loaded, it stays there until the program exits.
> 
> Can we adjust the SELinux policy applied (eg. from within nbdkit) once we
> know
> which plugin will be loaded?

Yes & no.

The SELinux context that a process runs under is determined at time of exec.

The default behaviour is that SELinux policy declares an automatic transition rule, eg when a process foo_t exec()s a binary with file label bar_exec_t, the process will get 'bar_t' as its runtime context.

It is possible to override this behaviour by calling setexeccon(other_bar_t). The policy still has to be written to permit a transition from foo_t  to other_bar_t. We're just skipping the file label based lookup.

IOW, if nbdkit wants to self-adjust its policy it would have to see which plugin it has configured, call setexeccon() and then re-execve() itself to trigger the transition.

So we would end up with two execs + transitions:

  libvirtd (virtd_t)  ---exec(/usr/bin/nbdkit - nbdkit_exec_t)---> nbdkit (nbdkit_t)

  nbdkit (nbdkit_t)   ---setexeccon(nbdkit_plugin_curl_t)+exec(/usr/bin/nbdkit)---> nbdkit (nbdkit_plugin_curl_t)

The first is an automatic transition based on file label, the second is a manual transition nbdkit decides to do based on the plugin chosen.

The alternative is to not try to confine nbdkit by default at all and instead just let libvirtd say exactly what to run so we only have 1 exec/transition:

  libvirtd (virtd_t)   ---setexeccon(nbdkit_plugin_curl_t)+exec(/usr/bin/nbdkit)---> nbdkit (nbdkit_plugin_curl_t)
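
As a rough illustration of the single-exec variant, the caller would call setexeccon() and then exec nbdkit. The sketch below does this from Python through ctypes against libselinux, whose C prototype is int setexeccon(const char *context). The type nbdkit_plugin_curl_t is the hypothetical name used in this comment and does not exist in any shipped policy; the system_u:system_r prefix and the s0 level are assumptions, and the helper names are invented for the example.

```python
import ctypes
import ctypes.util
import os


def build_context(domain_type, level="s0", user="system_u", role="system_r"):
    """Assemble a full SELinux context string (user:role:type:level)."""
    return f"{user}:{role}:{domain_type}:{level}"


def load_libselinux():
    """Return a ctypes handle to libselinux, or None if it is unavailable."""
    name = ctypes.util.find_library("selinux")
    if name is None:
        return None
    try:
        lib = ctypes.CDLL(name, use_errno=True)
    except OSError:
        return None
    # int setexeccon(const char *context);
    lib.setexeccon.argtypes = [ctypes.c_char_p]
    lib.setexeccon.restype = ctypes.c_int
    return lib


def exec_in_context(argv, context):
    """Request `context` for the next exec, then replace this process.

    This only works if the loaded policy defines the type and permits the
    transition from the caller's domain; otherwise the exec fails.
    """
    lib = load_libselinux()
    if lib is None:
        raise RuntimeError("libselinux is not available on this system")
    if lib.setexeccon(context.encode()) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    os.execv(argv[0], argv)  # does not return on success


# Hypothetical type name taken from the discussion above.
print(build_context("nbdkit_plugin_curl_t"))
```

A real caller such as libvirtd would do the equivalent in C between fork() and exec(), so that the exec context request only affects the child process.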


There are some potential complications.

For curl/ssh I think all the file based resources that need to be accessed are common to every invocation and a finite predictable set. IOW, we can write the plugin policy to allow access and just "do the right thing".

For other plugins though, the files to be accessed are essentially arbitrary paths. eg the 'file' plugin can serve any file at all. Either we write the "file" plugin policy to allow it to serve pretty much any file, or we require the files to be given a label ahead of time (eg an nbdkit_image_t akin to virt_image_t). If we're doing the latter though, we don't want 2 parallel invocations of nbdkit to access each other's disk, so we would need to use MCS like we do with svirt, eg foo.qcow2 labelled nbdkit_image_t:c123,c321 and bar.qcow2 labelled nbdkit_image_t:c456,c654. At that point the call to setexeccon() is mandatory, as it needs to specify the MCS to use, eg nbdkit_plugin_file_t:c123,c321 to give access to foo.qcow2. The next question is who sets the MCS labels on the files. Would nbdkit set the MCS label on foo.qcow2 before re-exec'ing itself, or should the thing that calls nbdkit set it beforehand? In the latter case, there's no point in nbdkit re-exec'ing itself, as the caller is better off just using setexeccon(nbdkit_plugin_file_t:c123,c321).
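
To make the MCS idea concrete, here is a small sketch of how a caller could hand out a distinct category pair per nbdkit instance and turn it into a level string such as s0:c123,c321. It is similar in spirit to what is described for svirt above, but it is not libvirt's actual code; the category range c0..c1023 is an assumption based on the s0-s0:c0.c1023 range seen earlier in this bug, and the function names are invented.

```python
import random

MAX_CATEGORY = 1023  # categories c0..c1023 (assumed range)


def pick_mcs_pair(in_use, rng=random):
    """Pick a pair of distinct categories that no other instance is using.

    `in_use` is a set of (low, high) tuples and is updated in place.
    Note: this loops forever if every possible pair is already taken.
    """
    while True:
        low, high = sorted(rng.sample(range(MAX_CATEGORY + 1), 2))
        pair = (low, high)
        if pair not in in_use:
            in_use.add(pair)
            return pair


def mcs_level(pair, sensitivity="s0"):
    """Format a category pair as an MLS/MCS level, e.g. s0:c123,c321."""
    return f"{sensitivity}:c{pair[0]},c{pair[1]}"


allocated = set()
for image in ("foo.qcow2", "bar.qcow2"):
    print(image, mcs_level(pick_mcs_pair(allocated)))
```

The same level would then be applied both to the image file label and to the context passed to setexeccon(), so that each nbdkit instance can only reach its own image.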

There are a lot of nbdkit plugins and I doubt we're going to write policy for all of them straight away.

If we make nbdkit always start off with an nbdkit_t type and then re-exec itself, then we need to have a fallback "nbdkit_plugin_unconfined_t" that allows pretty much everything. IOW, for curl/ssh it would use the locked-down nbdkit_plugin_curl_t / nbdkit_plugin_ssh_t, but for other plugins it would use the unconfined variant.

A more pragmatic (simpler/less disruptive) option would be to leave the entire nbdkit process unconfined indefinitely, and just expect libvirtd to use setexeccon() to transition directly to an nbdkit_plugin_curl_t / nbdkit_plugin_ssh_t type. IOW, only libvirt's usage would benefit from SELinux and other nbdkit users are unaffected.

Comment 31 Richard W.M. Jones 2023-04-19 13:07:19 UTC
Thanks, very interesting.  Re-execing within nbdkit is already done for
one plugin (not for SELinux reasons, but because it is necessary for adjusting
LD_LIBRARY_PATH - https://gitlab.com/nbdkit/nbdkit/-/tree/master/plugins/vddk).
It has a number of subtle pitfalls and I'd prefer to avoid having to do it,
especially for every invocation.

On the other hand having libvirtd call setexeccon sounds a much better idea.

If we had a policy for a few well-known plugins, then we could even reuse
the same policies and labels with virt-v2v, so this wouldn't be a one-off
thing for libvirt.  Virt-v2v also uses ssh and curl plugins and could
benefit the same way as libvirt.

We could make the policy names discoverable, eg it would be easy to add
an selinux_label=... field to the output of:

$ nbdkit curl --dump-plugin
path=/usr/lib64/nbdkit/plugins/nbdkit-curl-plugin.so
name=curl
version=1.34.0
api_version=1
[etc]

I agree that labelling for nbdkit-file-plugin sounds very complicated, so
we could ignore that one for now.

There is a slight complication that filters may also access resources, but
I suspect we can ignore that by having a policy that allows some common
operations like writing to /tmp and /var/tmp, in the name of simplicity.

Comment 32 Jonathon Jongsma 2023-04-19 20:15:45 UTC
(In reply to Daniel Berrangé from comment #30)
> The alternative is to not try to confine nbdkit by default at all and
> instead just let libvirtd say exactly what to run so we only have 1
> exec/transition:
> 
>   libvirtd (virtd_t)  
> ---setexeccon(nbdkit_plugin_curl_t)+exec(/usr/bin/nbdkit)---> nbdkit
> (nbdkit_plugin_curl_t)

The thing I'm struggling with (as a relative selinux beginner) is where libvirt would get the appropriate value for the context that we need to set here. I don't think we would want to hardcode a literal 'nbdkit_plugin_curl_t' into libvirt source code, would we? That seems a bit fragile since the libvirt source code would need to be updated if the selinux policy was ever changed. In the case of qemu, we don't hardcode 'svirt_t'. We get the appropriate process context (which happens to be svirt_t) by parsing the file returned by selinux_virtual_domain_context_path() and using that label for the qemu process. But what would we do for nbdkit? If we used the automatic transition based on the file label (nbdkit_t), that's fairly easy to determine by looking it up from the file (just like libvirt did for passt). But if we start adding non-automatic transitions like nbdkit_plugin_curl_t, how would we manage those?

Comment 33 Daniel Berrangé 2023-04-20 07:06:53 UTC
(In reply to Jonathon Jongsma from comment #32)
> The thing I'm struggling with (as a relative selinux beginner) is where
> libvirt would get the appropriate value for the context that we need to set
> here? I don't think we would want to hardcode a literal
> 'nbdkit_plugin_curl_t' into libvirt source code, would we?. That seems a bit
> fragile since the libvirt source code would need to be updated if the
> selinux policy was ever changed. In the case of qemu, we don't hardcode
> 'svirt_t'. We get the appropriate process context (which happens to be
> svirt_t) by parsing the file returned by
> selinux_virtual_domain_context_path() and using that label for the qemu
> process. But what would we do for nbdkit? 

Just use selinux_contexts_path() and append "nbdkit_plugins" to it; that file can contain something like:

$ cat nbdkit_plugins
curl:nbdkit_plugin_curl_t
ssh:nbdkit_plugin_ssh_t
....
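
A minimal Python sketch of how a caller might read such a file. The "nbdkit_plugins" file and its "plugin:type" format are only the proposal above, not an existing part of the policy; the function name is hypothetical, and skipping "#" comment lines is an added assumption. The directory lookup via selinux_contexts_path() is left out so the parsing stands alone.

```python
def parse_plugin_contexts(text: str) -> dict:
    """Map plugin name -> SELinux type from 'plugin:type' lines."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' comments are ignored.
        if not line or line.startswith("#"):
            continue
        plugin, sep, selinux_type = line.partition(":")
        if sep and plugin and selinux_type:
            mapping[plugin.strip()] = selinux_type.strip()
    return mapping


SAMPLE = "curl:nbdkit_plugin_curl_t\nssh:nbdkit_plugin_ssh_t\n"
contexts = parse_plugin_contexts(SAMPLE)
print(contexts.get("curl"))
print(contexts.get("file"))  # no entry: the plugin has no dedicated type
```

A plugin with no entry would leave the caller to decide whether to run nbdkit unconfined or refuse to start it.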

Comment 34 Richard W.M. Jones 2023-04-20 07:50:01 UTC
I also suggested (comment 31) that we could get:

$ nbdkit <plugin> --dump-plugin

to print the right context (or nothing if there is no support for the plugin).  That would
allow us to adjust nbdkit dynamically as new SELinux support is added without needing
to change libvirt.  However I don't know if this is better than the method Daniel has suggested.
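
A minimal Python sketch of the consumer side of this idea, assuming the key=value output format shown in comment 31. The selinux_label field does not exist in nbdkit today; it is the hypothetical field being proposed, and the function name is likewise made up for illustration.

```python
def parse_dump_plugin(output: str) -> dict:
    """Parse 'key=value' lines as printed by 'nbdkit <plugin> --dump-plugin'."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


# The last line is the hypothetical field; the rest comes from comment 31.
SAMPLE_OUTPUT = """\
path=/usr/lib64/nbdkit/plugins/nbdkit-curl-plugin.so
name=curl
version=1.34.0
api_version=1
selinux_label=nbdkit_plugin_curl_t
"""

label = parse_dump_plugin(SAMPLE_OUTPUT).get("selinux_label")
print(label)  # None would mean the plugin advertises no SELinux support
```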

Comment 36 Jonathon Jongsma 2023-04-27 20:05:04 UTC
One problem that I'm currently running into with your draft policy is the following. Because libvirt uses MCS to isolate different guests from each other, we need to set the categories properly when spawning these helper processes. And so for nbdkit, I'm following what is done for the passt utility, which is the following:

When libvirt prepares to launch the child nbdkit process, it uses libselinux APIs to calculate what label we should use for the child process. So, in the function mentioned in [1], we have the following rough steps:
    getcon(&currentCon) // should return "virtd_t"
    ...
    getfilecon("/usr/sbin/nbdkit", &binaryCon) // should return "nbdkit_exec_t"
    ...
    security_compute_create(currentCon, binaryCon, string_to_security_class("process"), &naturalLabel) // should return "nbdkit_t"    

and then we would append the appropriate MCS range to that label and then spawn the process with that label. 
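
The final step (appending the MCS range) can be sketched in Python as plain string handling. This is an illustration only, not the libvirt code: the function name is hypothetical and the fixed sensitivity s0 is an assumption. In C, libselinux provides context_range_set() for the same purpose.

```python
def with_mcs(context: str, mcs: str) -> str:
    """Replace the level of 'user:role:type[:level]' with 's0:<mcs>'."""
    user, role, setype = context.split(":")[:3]
    # Assumption: a single sensitivity level s0, as in the targeted policy.
    return f"{user}:{role}:{setype}:s0:{mcs}"


natural_label = "system_u:system_r:nbdkit_t:s0"
print(with_mcs(natural_label, "c228,c724"))
```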

When I tested with my draft policy, it succeeded and I ended up with a label such as "nbdkit_t:s0:c228,c724". But when I test this with your policy, the security_compute_create() call fails. I'm not sure why it is failing, and of course that failure doesn't produce any AVC logs.

[1] https://gitlab.com/libvirt/libvirt/-/blob/0324adb647885932efc97eefcfe08f6a8db60ae1/src/security/security_selinux.c#L565 - see function virSecuritySELinuxContextSetFromFile()

Comment 37 Jonathon Jongsma 2023-05-08 16:07:49 UTC
Nikola, any ideas about comment #36?

Comment 38 Nikola Knazekova 2023-05-09 13:25:46 UTC
Hi Jonathon, 

Sorry for the late response; I had to discuss this issue with colleagues.

For now I updated copr build (38.13-1.fc39.601), where MCS should be fixed. 

Can you please test it? 

Thank you

Comment 39 Jonathon Jongsma 2023-05-10 21:47:49 UTC
I still seem to be getting the same errors mentioned in comment 36. 

$ rpm -qa |grep selinux-policy
selinux-policy-targeted-38.13-1.fc37.601.noarch
selinux-policy-devel-38.13-1.fc37.601.noarch
selinux-policy-38.13-1.fc37.601.noarch

I get the following error from libvirt, which is printed because security_compute_create() fails:

  virSecuritySELinuxContextSetFromFile:591 : unable create new SELinux label based on label 'system_u:system_r:svirt_t:s0:c227,c659' and file '/usr/sbin/nbdkit': Permission denied

Comment 41 Nikola Knazekova 2023-05-23 12:42:22 UTC
Jonathon, can you check if there are any selinux errors?

# ausearch -i -m selinux_err

Comment 42 Jonathon Jongsma 2023-05-23 14:03:30 UTC
Oh, I actually do see something here:

type=PROCTITLE msg=audit(05/10/2023 16:41:09.113:2225) : proctitle=libvirtd --verbose 
type=SYSCALL msg=audit(05/10/2023 16:41:09.113:2225) : arch=x86_64 syscall=write success=no exit=EACCES(Permission denied) a0=0x2a a1=0x7f41a4032f90 a2=0x4d a3=0x7f41bcff9c47 items=0 ppid=1300984 pid=1301044 auid=jjongsma uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=pts1 ses=3 comm=rpc-libvirtd exe=/home/jjongsma/work/libvirt/_build/src/libvirtd subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 key=(null) 
type=SELINUX_ERR msg=audit(05/10/2023 16:41:09.113:2225) : op=security_compute_sid invalid_context=system_u:system_r:nbdkit_t:s0-s0:c0.c1023 scontext=system_u:system_r:virtd_t:s0-s0:c0.c1023 tcontext=system_u:object_r:nbdkit_exec_t:s0 tclass=process

Comment 43 Nikola Knazekova 2023-05-23 15:31:38 UTC
Zdenko, Milos,

can you please look at this bug?

Comment 44 Zdenek Pytela 2023-05-23 16:51:45 UTC
I think nbdkit_t is not assigned to the system_r role.

Comment 45 Jonathon Jongsma 2023-06-02 15:37:40 UTC
Any update on this?

Comment 50 RHEL Program Management 2023-09-19 16:58:38 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 51 RHEL Program Management 2023-09-19 16:59:05 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated.  Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.

