Bug 2334015 - Since systemd-257.1-1.fc42 several services fail to start on Cloud images, including resolved (so name resolution fails)
Summary: Since systemd-257.1-1.fc42 several services fail to start on Cloud images, in...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: selinux-policy
Version: 42
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Zdenek Pytela
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: openqa
Duplicates: 2333848 (view as bug list)
Depends On:
Blocks: 2333743
Reported: 2024-12-24 17:06 UTC by Adam Williamson
Modified: 2025-10-09 11:24 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2025-10-08 12:10:39 UTC
Type: Bug
Embargoed:
zpytela: mirror+



Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FC-1859 0 None None None 2025-07-24 07:44:00 UTC

Description Adam Williamson 2024-12-24 17:06:59 UTC
Since systemd-257.1-1.fc42 appeared in Fedora-Rawhide-20241221.n.1, three units fail to start on boot of the Cloud image (booting in a local VM using a cloud-init ISO image): systemd-oomd.service, systemd-oomd.socket and systemd-resolved.service. In the journal we see this for systemd-oomd.service:

[adamw@xps13a tmp]$ journalctl --file var/log/journal/174340b9a8494b6cad40972b7451ba6f/system.journal -u systemd-oomd.service
Dec 23 23:25:05 localhost systemd[1]: systemd-oomd.service: Main process exited, code=exited, status=226/NAMESPACE
Dec 23 23:25:05 localhost systemd[1]: systemd-oomd.service: Failed with result 'exit-code'.
Dec 23 23:25:05 localhost systemd[1]: Failed to start systemd-oomd.service - Userspace Out-Of-Memory (OOM) Killer.
Dec 23 23:25:05 localhost systemd[1]: systemd-oomd.service: Scheduled restart job, restart counter is at 1.
Dec 23 23:25:05 localhost systemd[1]: Starting systemd-oomd.service - Userspace Out-Of-Memory (OOM) Killer...
Dec 23 23:25:05 localhost (emd-oomd)[639]: Failed to create destination mount point node '/run/systemd/mount-rootfs/var/tmp', ignoring: Permission denied
Dec 23 23:25:05 localhost (emd-oomd)[639]: Failed to mount /run/systemd/unit-private-tmp/var-tmp to /run/systemd/mount-rootfs/var/tmp: No such file or directory
Dec 23 23:25:05 localhost (emd-oomd)[639]: systemd-oomd.service: Failed to set up mount namespacing: /var/tmp: No such file or directory
Dec 23 23:25:05 localhost (emd-oomd)[639]: systemd-oomd.service: Failed at step NAMESPACE spawning /usr/lib/systemd/systemd-oomd: No such file or directory
Dec 23 23:25:05 localhost systemd[1]: systemd-oomd.service: Main process exited, code=exited, status=226/NAMESPACE
Dec 23 23:25:05 localhost systemd[1]: systemd-oomd.service: Failed with result 'exit-code'.
Dec 23 23:25:05 localhost systemd[1]: Failed to start systemd-oomd.service - Userspace Out-Of-Memory (OOM) Killer.

It then retries five times with the same result. For resolved, we see:

Dec 23 23:25:05 localhost systemd[1]: Starting systemd-resolved.service - Network Name Resolution...
Dec 23 23:25:05 localhost (resolved)[641]: Failed to create destination mount point node '/run/systemd/mount-rootfs/var/tmp', ignoring: Permission denied
Dec 23 23:25:05 localhost (resolved)[641]: Failed to mount /run/systemd/unit-private-tmp/var-tmp to /run/systemd/mount-rootfs/var/tmp: No such file or directory
Dec 23 23:25:05 localhost (resolved)[641]: systemd-resolved.service: Failed to set up mount namespacing: /var/tmp: No such file or directory
Dec 23 23:25:05 localhost (resolved)[641]: systemd-resolved.service: Failed at step NAMESPACE spawning /usr/lib/systemd/systemd-resolved: No such file or directory
Dec 23 23:25:05 localhost systemd[1]: systemd-resolved.service: Main process exited, code=exited, status=226/NAMESPACE
Dec 23 23:25:05 localhost systemd[1]: systemd-resolved.service: Failed with result 'exit-code'.
Dec 23 23:25:05 localhost systemd[1]: Failed to start systemd-resolved.service - Network Name Resolution.

...and similarly it retries five times.
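
For anyone wanting to tally these failures across a saved journal dump, here is a minimal Python sketch. It assumes the journal has been exported as plain text in the same format as the excerpts above; the names `EXIT_RE` and `count_exit_failures` are made up for illustration and are not part of any systemd tooling.

```python
import re
from collections import Counter

# Matches systemd's "Main process exited" line as it appears in the excerpts above.
EXIT_RE = re.compile(
    r"systemd\[1\]: (?P<unit>\S+): Main process exited, "
    r"code=exited, status=(?P<code>\d+)/(?P<name>\w+)"
)


def count_exit_failures(lines, status_name="NAMESPACE"):
    """Count 'Main process exited' events per unit for one exit status name."""
    counts = Counter()
    for line in lines:
        m = EXIT_RE.search(line)
        if m and m.group("name") == status_name:
            counts[m.group("unit")] += 1
    return counts


# Sample lines copied from the journal excerpts in this report.
sample = [
    "Dec 23 23:25:05 localhost systemd[1]: systemd-oomd.service: Main process exited, code=exited, status=226/NAMESPACE",
    "Dec 23 23:25:05 localhost systemd[1]: systemd-oomd.service: Failed with result 'exit-code'.",
    "Dec 23 23:25:05 localhost systemd[1]: systemd-resolved.service: Main process exited, code=exited, status=226/NAMESPACE",
]
failures = count_exit_failures(sample)
```

Feeding it the whole text of `journalctl --file ... ` output (one line per list item) should give one count per unit and retry.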

Since resolved.service fails to start, network name resolution does not work, which is obviously a big problem.

Proposing as a Beta blocker per Basic criterion "Standard network functions such as address resolution and connections with common protocols such as ping, HTTP and ssh must work as expected" - https://fedoraproject.org/wiki/Basic_Release_Criteria#Basic_networking

Comment 1 Adam Williamson 2024-12-24 17:10:04 UTC
ah, with a slightly different lens I see this:

[adamw@xps13a tmp]$ journalctl --file var/log/journal/174340b9a8494b6cad40972b7451ba6f/system.journal | grep -5 var/tmp
Dec 23 23:25:05 localhost systemd[1]: Starting systemd-resolved.service - Network Name Resolution...
Dec 23 23:25:05 localhost systemd[1]: Starting systemd-tmpfiles-setup-dev.service - Create Static Device Nodes in /dev...
Dec 23 23:25:05 localhost audit[639]: AVC avc:  denied  { add_name } for  pid=639 comm="(emd-oomd)" name="tmp" scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=dir permissive=0
Dec 23 23:25:05 localhost audit[639]: SYSCALL arch=c000003e syscall=258 success=no exit=-13 a0=ffffff9c a1=55a74cede800 a2=1ed a3=0 items=0 ppid=1 pid=639 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="(emd-oomd)" exe="/usr/lib/systemd/systemd-executor" subj=system_u:system_r:init_t:s0 key=(null)
Dec 23 23:25:05 localhost audit: PROCTITLE proctitle="(emd-oomd)"
Dec 23 23:25:05 localhost (emd-oomd)[639]: Failed to create destination mount point node '/run/systemd/mount-rootfs/var/tmp', ignoring: Permission denied
Dec 23 23:25:05 localhost (emd-oomd)[639]: Failed to mount /run/systemd/unit-private-tmp/var-tmp to /run/systemd/mount-rootfs/var/tmp: No such file or directory
Dec 23 23:25:05 localhost (emd-oomd)[639]: systemd-oomd.service: Failed to set up mount namespacing: /var/tmp: No such file or directory
Dec 23 23:25:05 localhost (emd-oomd)[639]: systemd-oomd.service: Failed at step NAMESPACE spawning /usr/lib/systemd/systemd-oomd: No such file or directory
Dec 23 23:25:05 localhost systemd[1]: systemd-oomd.service: Main process exited, code=exited, status=226/NAMESPACE
Dec 23 23:25:05 localhost systemd[1]: systemd-oomd.service: Failed with result 'exit-code'.
Dec 23 23:25:05 localhost systemd[1]: Failed to start systemd-oomd.service - Userspace Out-Of-Memory (OOM) Killer.
Dec 23 23:25:05 localhost audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-oomd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Dec 23 23:25:05 localhost audit: BPF prog-id=52 op=UNLOAD
Dec 23 23:25:05 localhost audit[641]: AVC avc:  denied  { add_name } for  pid=641 comm="(resolved)" name="tmp" scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=dir permissive=0
Dec 23 23:25:05 localhost audit[641]: SYSCALL arch=c000003e syscall=258 success=no exit=-13 a0=ffffff9c a1=55c7b82de420 a2=1ed a3=0 items=0 ppid=1 pid=641 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="(resolved)" exe="/usr/lib/systemd/systemd-executor" subj=system_u:system_r:init_t:s0 key=(null)
Dec 23 23:25:05 localhost audit: PROCTITLE proctitle="(resolved)"
Dec 23 23:25:05 localhost (resolved)[641]: Failed to create destination mount point node '/run/systemd/mount-rootfs/var/tmp', ignoring: Permission denied
Dec 23 23:25:05 localhost (resolved)[641]: Failed to mount /run/systemd/unit-private-tmp/var-tmp to /run/systemd/mount-rootfs/var/tmp: No such file or directory
Dec 23 23:25:05 localhost (resolved)[641]: systemd-resolved.service: Failed to set up mount namespacing: /var/tmp: No such file or directory
Dec 23 23:25:05 localhost (resolved)[641]: systemd-resolved.service: Failed at step NAMESPACE spawning /usr/lib/systemd/systemd-resolved: No such file or directory
Dec 23 23:25:05 localhost systemd[1]: systemd-resolved.service: Main process exited, code=exited, status=226/NAMESPACE
Dec 23 23:25:05 localhost systemd[1]: systemd-resolved.service: Failed with result 'exit-code'.
Dec 23 23:25:05 localhost systemd[1]: Failed to start systemd-resolved.service - Network Name Resolution.

So it looks like SELinux is denying us here. But selinux-policy didn't change in the affected compose, so I guess systemd's behaviour changed to something SELinux does not allow. CCing the selinux-policy maintainer.
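
To make the audit records above easier to read, here is a rough Python sketch that pulls the interesting fields out of the AVC and SYSCALL lines. The helper names are hypothetical, the regexes only target the record layout seen in this report, and the one-entry syscall table is an assumption limited to x86_64 (arch=c000003e), where 258 is mkdirat.

```python
import errno
import re

# key=value pairs; values are either double-quoted or run up to the next space.
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')
# The denied permission set, e.g. "{ add_name }".
PERM_RE = re.compile(r"denied\s+\{ ([^}]*) \}")
# Assumption: x86_64 syscall numbering only; extend as needed.
X86_64_SYSCALLS = {258: "mkdirat"}


def parse_audit_fields(record):
    """Split one audit record into a dict of its key=value fields."""
    return {k: v.strip('"') for k, v in FIELD_RE.findall(record)}


def parse_avc(record):
    """Summarise an AVC denial: permissions, command, source/target types, class."""
    fields = parse_audit_fields(record)
    m = PERM_RE.search(record)
    return {
        "perms": m.group(1).split() if m else [],
        "comm": fields.get("comm"),
        # Contexts are user:role:type:level, so the type is the third part.
        "source_type": fields["scontext"].split(":")[2],
        # For add_name, the target is the directory receiving the new entry.
        "target_type": fields["tcontext"].split(":")[2],
        "tclass": fields.get("tclass"),
    }


def parse_syscall(record):
    """Decode the syscall name, errno name and mode argument of a SYSCALL record."""
    fields = parse_audit_fields(record)
    number = int(fields["syscall"])
    code = -int(fields["exit"])
    return {
        "syscall": X86_64_SYSCALLS.get(number, str(number)),
        "errno": errno.errorcode.get(code, str(code)),
        "mode": oct(int(fields["a2"], 16)),  # a2 is the mode argument for mkdirat
    }


avc = parse_avc(
    'audit[639]: AVC avc:  denied  { add_name } for  pid=639 comm="(emd-oomd)" '
    'name="tmp" scontext=system_u:system_r:init_t:s0 '
    "tcontext=system_u:object_r:unlabeled_t:s0 tclass=dir permissive=0"
)
call = parse_syscall(
    "audit[639]: SYSCALL arch=c000003e syscall=258 success=no exit=-13 "
    "a0=ffffff9c a1=55a74cede800 a2=1ed a3=0 items=0 ppid=1 pid=639"
)
```

Read together, the two records say that an init_t process tried to create a directory entry named "tmp" with mode 0755 inside a directory labeled unlabeled_t, and got EACCES.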

Comment 2 Luca Boccassi 2024-12-26 11:11:35 UTC
To fix some bugs, those services now use a private namespaced tmpfs instance for /tmp and /var/tmp, instead of the one linked from /tmp/systemd-private-xxxx using the global tmpfs. The policy needs to be updated accordingly.

Comment 3 Fedora Update System 2025-01-11 09:31:19 UTC
FEDORA-2025-c146da0204 (systemd-257.2-4.fc42) has been submitted as an update to Fedora 42.
https://bodhi.fedoraproject.org/updates/FEDORA-2025-c146da0204

Comment 4 Fedora Update System 2025-01-11 11:04:57 UTC
FEDORA-2025-c146da0204 (systemd-257.2-4.fc42) has been pushed to the Fedora 42 stable repository.
If the problem still persists, please make note of it in this bug report.

Comment 5 Zbigniew Jędrzejewski-Szmek 2025-01-11 11:11:56 UTC
We have the work-around in place in systemd, so the problem is suppressed for now.
But we still want to have the selinux policy updated, so I'll reopen the bug.

Comment 6 Adam Williamson 2025-01-11 17:26:25 UTC
Let's drop the blocker nomination since it's worked around now.

Comment 7 Zdenek Pytela 2025-01-21 12:11:42 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #5)
> We have the work-around in place in systemd, so the problem is suppressed
> for now.
> But we still want to have the selinux policy updated, so I'll reopen the bug.

By this, do you mean once a different systemd fix is in place?
Currently I cannot see any related problem with:

selinux-policy-41.29-1.fc42.noarch
systemd-257.2-6.fc42.x86_64

Comment 8 Zbigniew Jędrzejewski-Szmek 2025-01-21 12:27:57 UTC
systemd-257.2-4.fc42 reverted the change, or more precisely, dropped the use of the new directive in systemd units.
Presumably, if we reverted the revert, the AVC would show up again.

This should be easy to test by adding '[Service] PrivateTmp=disconnected' to e.g. systemd-resolved.service.
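
As a sketch of that test setup, the Python below writes such a drop-in for a unit. The drop-in file name `50-private-tmp.conf` and the helper name are arbitrary choices for illustration; the demo writes under a temporary directory rather than the real /etc/systemd/system so it is safe to run anywhere.

```python
import tempfile
from pathlib import Path


def write_private_tmp_dropin(unit, etc_root="/etc/systemd/system", value="disconnected"):
    """Write <etc_root>/<unit>.d/50-private-tmp.conf and return its path."""
    dropin_dir = Path(etc_root) / f"{unit}.d"
    dropin_dir.mkdir(parents=True, exist_ok=True)
    path = dropin_dir / "50-private-tmp.conf"  # file name is an arbitrary choice
    path.write_text(f"[Service]\nPrivateTmp={value}\n")
    return path


# Demo against a throwaway directory instead of the real configuration tree.
with tempfile.TemporaryDirectory() as root:
    dropin = write_private_tmp_dropin("systemd-resolved.service", etc_root=root)
    content = dropin.read_text()
    relative = dropin.relative_to(root).as_posix()
```

On a real test system the drop-in would go under /etc/systemd/system, followed by `systemctl daemon-reload` and `systemctl restart systemd-resolved.service`, and then a check of the audit log for new AVC records.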

Comment 9 Aoife Moloney 2025-02-26 13:20:37 UTC
This bug appears to have been reported against 'rawhide' during the Fedora Linux 42 development cycle.
Changing version to 42.

Comment 10 Zbigniew Jędrzejewski-Szmek 2025-07-23 20:40:30 UTC
Hmm, I wanted to drop the revert patch for systemd-258-rc1, but it seems that the selinux policy hasn't been updated yet. So I'll keep it for now. But we don't want to carry it indefinitely.

Comment 11 Adam Williamson 2025-07-23 20:52:44 UTC
Ping Zdenek, can we fix this? Thanks.

Comment 12 Zdenek Pytela 2025-07-24 07:38:30 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #8)
> systemd-257.2-4.fc42 reverted the change, or more precisely, dropped the use
> of the new directive in systemd units.
> Presumably, if we reverted the revert, the AVC would show up again.
> 
> This should be easy to test by adding '[Service] PrivateTmp=disconnected'
> to e.g. systemd-resolved.service.

How can I get a better picture, and in particular see the AVC denials?
The journal is of almost no use.

What is the current reproducer? Setting PrivateTmp to disconnected does not make any difference.

Comment 13 Zbigniew Jędrzejewski-Szmek 2025-07-24 08:56:26 UTC
https://github.com/coreos/fedora-coreos-tracker/issues/1857 has more information.
> the bug was specific to certain environments (Cloud and CoreOS, not sure what the common attribute is?)

> Dec 23 23:25:05 localhost audit[639]: AVC avc:  denied  { add_name } for  pid=639 comm="(emd-oomd)" name="tmp" scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=dir permissive=0

This indicates that /var/tmp is unlabelled, which doesn't seem right. So maybe there's some
problem with the image construction.

(I now tried in a "normal" VM, and indeed the issue does not reproduce.)
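
One way to check for unlabeled paths on a suspect image is to read the `security.selinux` extended attribute directly. The Python sketch below is Linux-only and makes some assumptions: the helper names are hypothetical, and depending on the kernel and filesystem a file with no stored label may either raise an error (treated here as "no label") or be reported with the unlabeled_t type, so both cases are counted as unlabeled.

```python
import os


def read_selinux_context(path):
    """Return the SELinux context string of path, or None if none can be read."""
    try:
        raw = os.getxattr(path, "security.selinux", follow_symlinks=False)
    except OSError:
        return None  # no xattr, unsupported filesystem, or path missing
    return raw.rstrip(b"\0").decode()


def selinux_type(context):
    """Extract the type from a user:role:type:level context string."""
    parts = context.split(":")
    return parts[2] if len(parts) >= 3 else None


def looks_unlabeled(context):
    """True if there is no context at all or its type is unlabeled_t."""
    return context is None or selinux_type(context) == "unlabeled_t"


# Example use on a mounted image (the path is a placeholder):
#   looks_unlabeled(read_selinux_context("/var/tmp"))
```

Running this over /var and /var/tmp of a freshly built Cloud image, before first boot, would show whether the labels are already missing at image construction time.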

Comment 14 Zdenek Pytela 2025-07-24 09:09:26 UTC
I'd like to move this issue forward as soon as there is some data.
Audit logs are far better than the journal, but I know the service is usually not available on CoreOS.
So a clear reproducer is needed.

Certainly an unlabeled file on a normal filesystem does not feel right; such things just should not happen.

Comment 15 Adam Williamson 2025-07-24 18:01:47 UTC
I could get the logs easily enough, but the affected images have been garbage-collected for a long time now. The issue was definitely specific to Cloud images, other images were not affected.

If you can do a systemd scratch build with the workaround omitted, I might be able to get a Cloud image built somehow or other and test it.

Comment 16 Zbigniew Jędrzejewski-Szmek 2025-07-24 18:12:54 UTC
https://src.fedoraproject.org/rpms/systemd/pull-request/210 has the patch dropped.
I just pushed it, so the builds should be done in a few hours.

Comment 18 Zdenek Pytela 2025-09-02 17:57:13 UTC
I cannot reproduce this issue on F42 with systemd-257.7-1.fc42.x86_64,
nor on F44 with systemd-258~rc3-2.fc44.x86_64.

This commit has probably fixed the issue:
https://github.com/systemd/systemd/pull/36100/commits/c1f6c1f238f121ff70c442323f1541d1115b6e1c

and I can confirm it is both in F42 and F44.

Please check if anything related still appears, otherwise I will close this bz as done.

Comment 19 Zdenek Pytela 2025-09-02 17:57:36 UTC
*** Bug 2333848 has been marked as a duplicate of this bug. ***

Comment 20 Adam Williamson 2025-09-02 18:33:34 UTC
Oh, sorry, I never got around to it. I think the systemd-level workaround is probably still in place, so testing on plain Fedora is not sufficient. The PR linked above is still open. I'll *try* and get around to this but I have a zillion other things rn :/

Comment 21 Zdenek Pytela 2025-09-02 19:57:05 UTC
I am not sure if I understand.
If I take all current sources, edit systemd-resolved to contain

[Service]
PrivateTmp=disconnected

and restart the service, no denial appears, no related journal entries, no unlabeled_t anywhere.
In the sources I can see

systemd-257.7/src/core/namespace.c (lines 1802-1813):
        /* Since mount() will always follow symlinks we chase the symlinks on our own first. Note
         * that bind mount source paths are always relative to the host root, hence we pass NULL as
         * root directory to chase() here. */

        /* When we create implicit mounts, we might need to create the path ourselves as it is on a
         * just-created tmpfs, for example. */
        if (m->create_source_dir) {
                r = mkdir_p(mount_entry_source(m), m->source_dir_mode);
                if (r < 0)
                        return log_debug_errno(r, "Failed to create source directory %s: %m", mount_entry_source(m));

                r = label_fix_full(AT_FDCWD, mount_entry_source(m), mount_entry_unprefixed_path(m), /* flags= */ 0);

so I suppose the issue cannot manifest now, with https://src.fedoraproject.org/rpms/systemd/pull-request/210 merged or not.

see also
(In reply to Zbigniew Jędrzejewski-Szmek from comment #8)
> systemd-257.2-4.fc42 reverted the change, or more precisely, dropped the use
> of the new directive in systemd units.
> Presumably, if we reverted the revert, the AVC would show up again.
> 
> This should be easy to test by adding '[Service] PrivateTmp=disconnected'
> to e.g. systemd-resolved.service.

Comment 22 Adam Williamson 2025-09-04 01:25:51 UTC
Zbyszek, are you still expecting an SELinux policy change here?

Comment 23 Zdenek Pytela 2025-10-08 12:10:39 UTC
Closing, if you think the issue is still in place, please reopen or create a new bug.

Comment 24 Zbigniew Jędrzejewski-Szmek 2025-10-09 11:24:49 UTC
Considering what Zdenek wrote, I think it's all good.
I'll try dropping the work-around patch in one of the builds when we're in a known-good state.

