Bug 2250930 - SELinux is preventing systemd-userdbd from map_read, map_write access on the bpf labeled init_t.
Summary: SELinux is preventing systemd-userdbd from map_read, map_write access on the ...
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: rawhide
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: abrt_hash:2ea2f7a20eb6d9e1adf4ee9d482...
Duplicates: 2250932 2250933 2250947 2251042 2251302 2252117
Depends On:
Blocks:
Reported: 2023-11-21 20:56 UTC by Mikhail
Modified: 2023-12-02 02:23 UTC (History)
CC List: 27 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-12-02 02:23:14 UTC
Type: ---
Embargoed:


Attachments
File: description (1.96 KB, text/plain)
2023-11-21 20:56 UTC, Mikhail
File: os_info (770 bytes, text/plain)
2023-11-21 20:56 UTC, Mikhail


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | cockpit-project bots pull 5595 | 0 | None | open | Image refresh for fedora-rawhide | 2023-11-27 05:43:21 UTC
Github | systemd systemd pull 30170 | 0 | None | Merged | core: pass bpf_outer_map_fd to sd-executor only if RestrictFileSystems was set | 2023-12-01 16:48:42 UTC
Github | systemd systemd pull 30266 | 0 | None | Merged | Make sure we close bpf outer map fd in systemd-executor | 2023-12-01 16:48:42 UTC

Description Mikhail 2023-11-21 20:56:36 UTC
Description of problem:
Updated systemd from version 254.5-2.fc40 to 255~rc2-1.fc40.
SELinux is preventing systemd-userdbd from map_read, map_write access on the bpf labeled init_t.

*****  Plugin catchall (100. confidence) suggests   **************************

If you believe that systemd-userdbd should be allowed map_read map_write access on bpf labeled init_t by default.
Then you should report this as a bug.
You can generate a local policy module to allow this access.
Do
allow this access for now by executing:
# ausearch -c 'systemd-userdbd' --raw | audit2allow -M my-systemduserdbd
# semodule -X 300 -i my-systemduserdbd.pp

Additional Information:
Source Context                system_u:system_r:systemd_userdbd_t:s0
Target Context                system_u:system_r:init_t:s0
Target Objects                Unknown [ bpf ]
Source                        systemd-userdbd
Source Path                   systemd-userdbd
Port                          <Unknown>
Host                          (removed)
Source RPM Packages           
Target RPM Packages           
SELinux Policy RPM            selinux-policy-targeted-40.5-1.fc40.noarch
Local Policy RPM              selinux-policy-targeted-40.5-1.fc40.noarch
Selinux Enabled               True
Policy Type                   targeted
Enforcing Mode                Permissive
Host Name                     (removed)
Platform                      Linux (removed) 6.7.0-0.rc2.22.fc40.x86_64+debug
                              #1 SMP PREEMPT_DYNAMIC Mon Nov 20 14:05:16 UTC
                              2023 x86_64
Alert Count                   1
First Seen                    2023-11-22 01:55:20 +05
Last Seen                     2023-11-22 01:55:20 +05
Local ID                      29583649-9c34-4a40-baf9-6e29e99bfee3

Raw Audit Messages
type=AVC msg=audit(1700600120.484:1623): avc:  denied  { map_read map_write } for  pid=540775 comm="systemd-userdbd" scontext=system_u:system_r:systemd_userdbd_t:s0 tcontext=system_u:system_r:init_t:s0 tclass=bpf permissive=1


Hash: systemd-userdbd,systemd_userdbd_t,init_t,bpf,map_read,map_write

Version-Release number of selected component:
selinux-policy-targeted-40.5-1.fc40.noarch

Additional info:
reporter:       libreport-2.17.11
reason:         SELinux is preventing systemd-userdbd from map_read, map_write access on the bpf labeled init_t.
package:        selinux-policy-targeted-40.5-1.fc40.noarch
component:      selinux-policy
hashmarkername: setroubleshoot
type:           libreport
kernel:         6.7.0-0.rc2.22.fc40.x86_64+debug
comment:        Update systemd from 254.5-2.fc40 to 255~rc2-1.fc40 version
component:      selinux-policy
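
For anyone triaging the duplicates, the raw AVC record above can be split into its fields mechanically. The following is a minimal Python sketch, not part of setroubleshoot or the audit tools: the function name `parse_avc` and the regular expression are illustrative assumptions, and it assumes no field value contains spaces.

```python
import re

# Matches the "{ perm perm ... } for key=value ..." part of an AVC denial.
AVC_RE = re.compile(r"avc:\s+denied\s+\{(?P<perms>[^}]*)\}\s+for\s+(?P<rest>.*)")

def parse_avc(line):
    """Split one AVC denial record into its permissions and key=value fields."""
    m = AVC_RE.search(line)
    if m is None:
        raise ValueError("not an AVC denial record")
    fields = {"perms": m.group("perms").split()}
    for token in m.group("rest").split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value.strip('"')
    return fields

record = ('type=AVC msg=audit(1700600120.484:1623): avc:  denied  '
          '{ map_read map_write } for  pid=540775 comm="systemd-userdbd" '
          'scontext=system_u:system_r:systemd_userdbd_t:s0 '
          'tcontext=system_u:system_r:init_t:s0 tclass=bpf permissive=1')
info = parse_avc(record)
print(info["perms"], info["scontext"], info["tcontext"], info["tclass"])
```

The source context is the confined service, while the target context (init_t) belongs to the process that created the BPF object, which is what points at an fd inherited from systemd.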

Comment 1 Mikhail 2023-11-21 20:56:39 UTC
Created attachment 2000749 [details]
File: description

Comment 2 Mikhail 2023-11-21 20:56:41 UTC
Created attachment 2000750 [details]
File: os_info

Comment 3 Zdenek Pytela 2023-11-22 07:43:23 UTC
Looks like every service now requires bpf:

----
type=PROCTITLE msg=audit(11/22/2023 02:38:53.547:100) : proctitle=/usr/sbin/sshd -D 
type=PATH msg=audit(11/22/2023 02:38:53.547:100) : item=1 name=/lib64/ld-linux-x86-64.so.2 inode=139475 dev=fc:02 mode=file,755 ouid=root ogid=root rdev=00:00 obj=system_u:object_r:ld_so_t:s0 nametype=NORMAL cap_fp=none cap_fi=none cap_fe=0 cap_fver=0 cap_frootid=0 
type=PATH msg=audit(11/22/2023 02:38:53.547:100) : item=0 name=/usr/sbin/sshd inode=162518 dev=fc:02 mode=file,755 ouid=root ogid=root rdev=00:00 obj=system_u:object_r:sshd_exec_t:s0 nametype=NORMAL cap_fp=none cap_fi=none cap_fe=0 cap_fver=0 cap_frootid=0 
type=CWD msg=audit(11/22/2023 02:38:53.547:100) : cwd=/ 
type=EXECVE msg=audit(11/22/2023 02:38:53.547:100) : argc=2 a0=/usr/sbin/sshd a1=-D 
type=SYSCALL msg=audit(11/22/2023 02:38:53.547:100) : arch=x86_64 syscall=execve success=yes exit=0 a0=0x55e1896b9a90 a1=0x55e1896b9b30 a2=0x55e1896b98b0 a3=0x55e1896b9bc0 items=2 ppid=1 pid=726 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=sshd exe=/usr/sbin/sshd subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 key=(null) 
type=AVC msg=audit(11/22/2023 02:38:53.547:100) : avc:  denied  { map_read map_write } for  pid=726 comm=sshd scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 tcontext=system_u:system_r:init_t:s0 tclass=bpf permissive=0

Comment 4 Zdenek Pytela 2023-11-22 07:44:12 UTC
*** Bug 2250947 has been marked as a duplicate of this bug. ***

Comment 5 Zdenek Pytela 2023-11-22 07:44:29 UTC
*** Bug 2250933 has been marked as a duplicate of this bug. ***

Comment 6 Zdenek Pytela 2023-11-22 07:53:04 UTC
*** Bug 2250932 has been marked as a duplicate of this bug. ***

Comment 7 Zdenek Pytela 2023-11-22 16:12:04 UTC
*** Bug 2251042 has been marked as a duplicate of this bug. ***

Comment 8 Ondrej Mosnáček 2023-11-23 09:05:21 UTC
This looks like systemd is failing to close some BPF map/prog file descriptor(s) before executing services (O_CLOEXEC?).

See also: https://pagure.io/fedora-ci/general/issue/447
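
To illustrate the close-on-exec behaviour being suspected here, the following Python sketch spawns a child process and checks whether an fd from the parent is still open in it. A pipe stands in for the BPF map fd, and the helper name `fd_visible_after_exec` is made up for this example; it says nothing about what systemd actually does.

```python
import os
import subprocess
import sys

# Child program: report whether the fd number given in argv[1] is open.
CHILD = (
    "import os, sys\n"
    "try:\n"
    "    os.fstat(int(sys.argv[1]))\n"
    "    print('open')\n"
    "except OSError:\n"
    "    print('closed')\n"
)

def fd_visible_after_exec(inheritable):
    """Spawn a child and report whether it still sees the parent's fd."""
    r, w = os.pipe()                      # stand-in for the BPF map fd
    os.set_inheritable(r, inheritable)    # False means FD_CLOEXEC is set
    try:
        out = subprocess.run(
            [sys.executable, "-c", CHILD, str(r)],
            close_fds=False,              # do not let subprocess close it for us
            capture_output=True, text=True, check=True,
        )
        return out.stdout.strip()
    finally:
        os.close(r)
        os.close(w)

print(fd_visible_after_exec(True), fd_visible_after_exec(False))
```

With the flag cleared the child inherits the descriptor, which is the situation in which SELinux gets to check the new domain's access to it.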

Comment 9 Zdenek Pytela 2023-11-23 10:32:16 UTC
 
----
type=PROCTITLE msg=audit(11/23/2023 05:28:41.166:95) : proctitle=/usr/sbin/sshd -D 
type=PATH msg=audit(11/23/2023 05:28:41.166:95) : item=1 name=/lib64/ld-linux-x86-64.so.2 inode=139475 dev=fc:02 mode=file,755 ouid=root ogid=root rdev=00:00 obj=system_u:object_r:ld_so_t:s0 nametype=NORMAL cap_fp=none cap_fi=none cap_fe=0 cap_fver=0 cap_frootid=0 
type=PATH msg=audit(11/23/2023 05:28:41.166:95) : item=0 name=/usr/sbin/sshd inode=162518 dev=fc:02 mode=file,755 ouid=root ogid=root rdev=00:00 obj=system_u:object_r:sshd_exec_t:s0 nametype=NORMAL cap_fp=none cap_fi=none cap_fe=0 cap_fver=0 cap_frootid=0 
type=CWD msg=audit(11/23/2023 05:28:41.166:95) : cwd=/ 
type=EXECVE msg=audit(11/23/2023 05:28:41.166:95) : argc=2 a0=/usr/sbin/sshd a1=-D 
type=SYSCALL msg=audit(11/23/2023 05:28:41.166:95) : arch=x86_64 syscall=execve success=yes exit=0 a0=0x55ec16ae5b10 a1=0x55ec16ae5bb0 a2=0x55ec16ae58b0 a3=0x55ec16ae5c40 items=2 ppid=1 pid=742 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=sshd exe=/usr/sbin/sshd subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 key=(null) 
type=AVC msg=audit(11/23/2023 05:28:41.166:95) : avc:  denied  { map_read map_write } for  pid=742 comm=sshd scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 tcontext=system_u:system_r:init_t:s0 tclass=bpf permissive=0 
----

----
type=PROCTITLE msg=audit(11/23/2023 05:27:02.228:381) : proctitle=/usr/bin/mandb -q 
type=PATH msg=audit(11/23/2023 05:27:02.228:381) : item=1 name=/lib64/ld-linux-x86-64.so.2 inode=139475 dev=fc:02 mode=file,755 ouid=root ogid=root rdev=00:00 obj=system_u:object_r:ld_so_t:s0 nametype=NORMAL cap_fp=none cap_fi=none cap_fe=0 cap_fver=0 cap_frootid=0 
type=PATH msg=audit(11/23/2023 05:27:02.228:381) : item=0 name=/usr/bin/mandb inode=162560 dev=fc:02 mode=file,755 ouid=root ogid=root rdev=00:00 obj=system_u:object_r:mandb_exec_t:s0 nametype=NORMAL cap_fp=none cap_fi=none cap_fe=0 cap_fver=0 cap_frootid=0 
type=CWD msg=audit(11/23/2023 05:27:02.228:381) : cwd=/ 
type=EXECVE msg=audit(11/23/2023 05:27:02.228:381) : argc=2 a0=/usr/bin/mandb a1=-q 
type=SYSCALL msg=audit(11/23/2023 05:27:02.228:381) : arch=x86_64 syscall=execve success=yes exit=0 a0=0x55f5f879b990 a1=0x55f5f8791900 a2=0x55f5f87872c0 a3=0x55f5f8791900 items=2 ppid=2036 pid=2039 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=mandb exe=/usr/bin/mandb subj=system_u:system_r:mandb_t:s0 key=(null) 
type=AVC msg=audit(11/23/2023 05:27:02.228:381) : avc:  denied  { map_read map_write } for  pid=2039 comm=mandb scontext=system_u:system_r:mandb_t:s0 tcontext=system_u:system_r:init_t:s0 tclass=bpf permissive=0 


mandb    2039 [000]  5054.237000: avc:selinux_audited: requested=0x6 denied=0x6 audited=0x6 resul>
        ffffffff8e70ad35 avc_audit_post_callback+0x205 ([kernel.kallsyms])
        ffffffff8e70ad35 avc_audit_post_callback+0x205 ([kernel.kallsyms])
        ffffffff8e733d6f common_lsm_audit+0x2af ([kernel.kallsyms])
        ffffffff8e70bf4c slow_avc_audit+0xbc ([kernel.kallsyms])
        ffffffff8e70c7c1 avc_has_perm+0xc1 ([kernel.kallsyms])
        ffffffff8e70e1b8 file_has_perm+0xa8 ([kernel.kallsyms])
        ffffffff8e7120b4 match_file+0x34 ([kernel.kallsyms])
        ffffffff8e49b621 iterate_fd+0x61 ([kernel.kallsyms])
        ffffffff8e7101b9 selinux_bprm_committing_creds+0xf9 ([kernel.kallsyms])
        ffffffff8e705833 security_bprm_committing_creds+0x23 ([kernel.kallsyms])
        ffffffff8e47cda5 begin_new_exec+0x6b5 ([kernel.kallsyms])
        ffffffff8e4fef3d load_elf_binary+0x2bd ([kernel.kallsyms])
        ffffffff8e47ac34 bprm_execve+0x294 ([kernel.kallsyms])
        ffffffff8e47c34d do_execveat_common.isra.0+0x1ad ([kernel.kallsyms])
        ffffffff8e47d236 __x64_sys_execve+0x36 ([kernel.kallsyms])
        ffffffff8eff0461 do_syscall_64+0x61 ([kernel.kallsyms])
        ffffffff8f2000ea entry_SYSCALL_64_after_hwframe+0x6e ([kernel.kallsyms])
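
The trace shows the check being made from selinux_bprm_committing_creds while iterating over open file descriptors during execve, with requested=0x6. A small Python sketch for decoding that access vector follows. The permission order for the bpf class is an assumption based on my reading of the kernel's SELinux classmap and should be verified against the running kernel.

```python
# Permission names for the "bpf" object class, lowest bit first.
# ASSUMPTION: this order mirrors the kernel's SELinux classmap.
BPF_PERMS = ["map_create", "map_read", "map_write", "prog_load", "prog_run"]

def decode_perms(mask, names=BPF_PERMS):
    """Return the permission names whose bit is set in an access vector."""
    return [name for bit, name in enumerate(names) if mask & (1 << bit)]

print(decode_perms(0x6))  # ['map_read', 'map_write']
```

Under that assumption, 0x6 corresponds to the { map_read map_write } pair reported in the AVC records.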

Comment 10 Luca Boccassi 2023-11-23 11:30:33 UTC
The BPF FD 'bpf_outer_map_fd' is 'special' in some inscrutable way, and cannot be closed without raising asserts. It needs to be looked at by somebody with an advanced understanding of the kernel's BPF internals. Until that happens, yes, the access will need to be allowed.

Comment 11 Andrei Stepanov 2023-11-23 11:52:20 UTC
@asavkov do you have ideas if this is an issue on the BPF side?  Thank you in advance.

Comment 12 Luca Boccassi 2023-11-23 13:23:49 UTC
A possible option to reduce the impact could be to serialize the bpf_outer_map_fd over to sd-executor only when some other option is enabled. However, I don't really use any of those BPF filtering options, so I would feel more comfortable if someone who understands them tested that everything still works after such a change.

Comment 13 Artem Savkov 2023-11-23 13:55:51 UTC
(In reply to Andrei Stepanov from comment #11)
> @asavkov do you have ideas if this is an issue on the BPF side? 

it is not

> Thank you in advance.

you are welcome

Comment 14 Ondrej Mosnáček 2023-11-23 14:15:12 UTC
(In reply to Luca Boccassi from comment #10)
> The BPF FD 'bpf_outer_map_fd' is 'special' in some inscrutable way, and
> cannot be closed without raising asserts.

What asserts? Can't systemd do fcntl(bpf_outer_map_fd, F_SETFD, FD_CLOEXEC) before executing the service binary? (I presume it has to temporarily unset the flag in order to pass the fd from the daemon to the executor.) Sorry if I'm asking dumb questions - I don't know much about systemd internals, just trying to understand the problem...
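
The "unset the flag to pass the fd, then set it again" idea can be sketched in Python as below. Note that F_SETFD takes the descriptor flag FD_CLOEXEC; O_CLOEXEC is the corresponding open(2) flag. The helper name `set_cloexec` is hypothetical and the pipe is only a stand-in for bpf_outer_map_fd.

```python
import fcntl
import os

def set_cloexec(fd, enable):
    """Set or clear FD_CLOEXEC on fd, leaving other descriptor flags alone."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    new = flags | fcntl.FD_CLOEXEC if enable else flags & ~fcntl.FD_CLOEXEC
    if new != flags:
        fcntl.fcntl(fd, fcntl.F_SETFD, new)

r, w = os.pipe()                    # stand-in for bpf_outer_map_fd
set_cloexec(r, False)               # let it survive the exec into the executor
survives = os.get_inheritable(r)
set_cloexec(r, True)                # re-arm before executing the service binary
rearmed = not os.get_inheritable(r)
os.close(r)
os.close(w)
print(survives, rearmed)
```

Setting the flag does not close the descriptor in the current process; it only asks the kernel to close it at the next successful execve.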

Comment 15 Luca Boccassi 2023-11-23 19:18:17 UTC
The BPF FD is somehow 'special', and closing it in the child broke the parent's copy too, so things start to fail left and right. It's especially difficult as this feature has no tests.
This is a purely speculative change that makes it pass the FD over only if the feature is actually enabled for the service, which should help reduce the impact: https://github.com/systemd/systemd/pull/30170
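
The shape of that change, as described here and in the PR title, is roughly "hand the fd to the executor only when RestrictFileSystems= is in use". A purely illustrative Python sketch follows; the dictionary keys and serialized key names are hypothetical and do not reflect systemd's actual serialization format.

```python
def serialize_exec_params(params):
    """Build key=value lines for the executor (all names are hypothetical)."""
    lines = ["notify-socket=%s" % params["notify_socket"]]
    # Only hand over the BPF outer map fd when the unit actually uses
    # RestrictFileSystems=; otherwise the child never receives it.
    uses_filter = bool(params.get("restrict_filesystems"))
    fd = params.get("bpf_outer_map_fd", -1)
    if uses_filter and fd >= 0:
        lines.append("bpf-outer-map-fd=%d" % fd)
    return lines

plain = serialize_exec_params(
    {"notify_socket": "/run/x", "bpf_outer_map_fd": 7})
filtered = serialize_exec_params(
    {"notify_socket": "/run/x", "bpf_outer_map_fd": 7,
     "restrict_filesystems": ["ext4"]})
print(plain, filtered)
```

This narrows the set of services that receive the fd but does not by itself close it in the ones that still do.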

Comment 16 Zdenek Pytela 2023-11-24 08:21:59 UTC
*** Bug 2251302 has been marked as a duplicate of this bug. ***

Comment 17 Ondrej Mosnáček 2023-11-24 12:43:24 UTC
(In reply to Luca Boccassi from comment #15)
> The BPF FD is somehow 'special', and closing it in the child broke the
> parent's copy too, so things start to fail left and right. It's especially
> difficult as this feature has no tests.

That smells of a kernel bug that you are just papering over... (Maybe it's more tricky and you really need to hold that fd, but so far I don't understand why.)

I just tried building systemd with the patch below, and it made the SELinux denials go away without producing any visible errors. I even tried adding RestrictFileSystems= to one of the services (and restarting it a couple of times) and still saw no problems (and the filesystem blocking worked when I specified too narrow a set). Is there something else needed to reproduce the problem? Maybe it has been fixed on the kernel side in the meantime?

diff --git a/src/core/execute-serialize.c b/src/core/execute-serialize.c
index 342883994a..5bc903082a 100644
--- a/src/core/execute-serialize.c
+++ b/src/core/execute-serialize.c
@@ -1637,6 +1637,11 @@ static int exec_parameters_deserialize(ExecParameters *p, FILE *f, FDSet *fds) {
                         if (fd < 0)
                                 continue;
 
+                        /* DEBUG */
+                        r = fd_cloexec(fd, true);
+                        if (r < 0)
+                                return r;
+
                         p->bpf_outer_map_fd = fd;
                 } else if ((val = startswith(l, "exec-parameters-notify-socket="))) {
                         r = free_and_strdup(&p->notify_socket, val);
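
For context on why a broken parent copy would be surprising: with ordinary descriptors, a child closing its inherited copy leaves the parent's copy usable. The Python sketch below shows this with a pipe; whether the BPF outer map fd behaves the same way is exactly the open question in this thread, so treat it as a statement about ordinary fds only. The function name is made up for the example.

```python
import os

def parent_fd_survives_child_close():
    """Fork, close the fd in the child, then use it in the parent."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(w)          # child closes only its own copy
        os._exit(0)
    os.waitpid(pid, 0)       # make sure the child has finished
    os.write(w, b"x")        # parent's copy is still valid
    ok = os.read(r, 1) == b"x"
    os.close(r)
    os.close(w)
    return ok

print(parent_fd_survives_child_close())
```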

Comment 18 Luca Boccassi 2023-11-24 12:52:24 UTC
I have only vague memories, as it was a few months ago, and as mentioned I don't really use BPF anywhere, so I cannot say for sure. If you are confident that the BPF features still function after that change, then please do send a PR on GitHub. Please bear in mind there are no tests for those BPF features, so it has to be validated manually.

Comment 19 Lukas Javorsky 2023-11-30 09:09:52 UTC
*** Bug 2252117 has been marked as a duplicate of this bug. ***

Comment 20 Adam Williamson 2023-12-01 16:35:14 UTC
ping, any progress on this? the failing 'installability' CI jobs are inconveniencing quite a few folks.

Since the kernel was mentioned, let's CC jforbes...

Comment 21 Luca Boccassi 2023-12-01 16:37:23 UTC
Solved in main, will be in RC4 as soon as we tag it

Comment 22 Adam Williamson 2023-12-01 16:40:55 UTC
well hey, that sounds like backport time to me! thanks.

Comment 23 Adam Williamson 2023-12-01 17:52:30 UTC
Should be fixed by https://bodhi.fedoraproject.org/updates/FEDORA-2023-32e53ae9b1 , please test.

Comment 24 Ian Laurie 2023-12-01 23:40:18 UTC
systemd-255~rc3-4.fc40 fixes the issue for me on one bare-metal and 4 virtual systems.

Comment 25 Adam Williamson 2023-12-02 02:23:14 UTC
Great, let's call it fixed. If anyone still has issues, yell and we can reopen. Build will be in the next Rawhide compose.

