Bug 2334015
| Summary: | Since systemd-257.1-1.fc42 several services fail to start on Cloud images, including resolved (so name resolution fails) | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Adam Williamson <awilliam> |
| Component: | selinux-policy | Assignee: | Zdenek Pytela <zpytela> |
| Status: | CLOSED NOTABUG | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | ||
| Version: | 42 | CC: | daan.j.demeyer, dwalsh, fedoraproject, lnykryn, luca.boccassi, lvrabec, mmalik, msekleta, omosnacek, pkoncity, robatino, ryncsn, suraj.ghimire7, systemd-maint, vmojzis, xiaofwan, yuwatana, zbyszek, zpytela |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | --- | Flags: | zpytela: mirror+ |
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | openqa | ||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2025-10-08 12:10:39 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 2333743 | ||
|
Description
Adam Williamson
2024-12-24 17:06:59 UTC
ah, with a slightly different lens I see this:
[adamw@xps13a tmp]$ journalctl --file var/log/journal/174340b9a8494b6cad40972b7451ba6f/system.journal | grep -5 var/tmp
Dec 23 23:25:05 localhost systemd[1]: Starting systemd-resolved.service - Network Name Resolution...
Dec 23 23:25:05 localhost systemd[1]: Starting systemd-tmpfiles-setup-dev.service - Create Static Device Nodes in /dev...
Dec 23 23:25:05 localhost audit[639]: AVC avc: denied { add_name } for pid=639 comm="(emd-oomd)" name="tmp" scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=dir permissive=0
Dec 23 23:25:05 localhost audit[639]: SYSCALL arch=c000003e syscall=258 success=no exit=-13 a0=ffffff9c a1=55a74cede800 a2=1ed a3=0 items=0 ppid=1 pid=639 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="(emd-oomd)" exe="/usr/lib/systemd/systemd-executor" subj=system_u:system_r:init_t:s0 key=(null)
Dec 23 23:25:05 localhost audit: PROCTITLE proctitle="(emd-oomd)"
Dec 23 23:25:05 localhost (emd-oomd)[639]: Failed to create destination mount point node '/run/systemd/mount-rootfs/var/tmp', ignoring: Permission denied
Dec 23 23:25:05 localhost (emd-oomd)[639]: Failed to mount /run/systemd/unit-private-tmp/var-tmp to /run/systemd/mount-rootfs/var/tmp: No such file or directory
Dec 23 23:25:05 localhost (emd-oomd)[639]: systemd-oomd.service: Failed to set up mount namespacing: /var/tmp: No such file or directory
Dec 23 23:25:05 localhost (emd-oomd)[639]: systemd-oomd.service: Failed at step NAMESPACE spawning /usr/lib/systemd/systemd-oomd: No such file or directory
Dec 23 23:25:05 localhost systemd[1]: systemd-oomd.service: Main process exited, code=exited, status=226/NAMESPACE
Dec 23 23:25:05 localhost systemd[1]: systemd-oomd.service: Failed with result 'exit-code'.
Dec 23 23:25:05 localhost systemd[1]: Failed to start systemd-oomd.service - Userspace Out-Of-Memory (OOM) Killer.
Dec 23 23:25:05 localhost audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-oomd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Dec 23 23:25:05 localhost audit: BPF prog-id=52 op=UNLOAD
Dec 23 23:25:05 localhost audit[641]: AVC avc: denied { add_name } for pid=641 comm="(resolved)" name="tmp" scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=dir permissive=0
Dec 23 23:25:05 localhost audit[641]: SYSCALL arch=c000003e syscall=258 success=no exit=-13 a0=ffffff9c a1=55c7b82de420 a2=1ed a3=0 items=0 ppid=1 pid=641 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="(resolved)" exe="/usr/lib/systemd/systemd-executor" subj=system_u:system_r:init_t:s0 key=(null)
Dec 23 23:25:05 localhost audit: PROCTITLE proctitle="(resolved)"
Dec 23 23:25:05 localhost (resolved)[641]: Failed to create destination mount point node '/run/systemd/mount-rootfs/var/tmp', ignoring: Permission denied
Dec 23 23:25:05 localhost (resolved)[641]: Failed to mount /run/systemd/unit-private-tmp/var-tmp to /run/systemd/mount-rootfs/var/tmp: No such file or directory
Dec 23 23:25:05 localhost (resolved)[641]: systemd-resolved.service: Failed to set up mount namespacing: /var/tmp: No such file or directory
Dec 23 23:25:05 localhost (resolved)[641]: systemd-resolved.service: Failed at step NAMESPACE spawning /usr/lib/systemd/systemd-resolved: No such file or directory
Dec 23 23:25:05 localhost systemd[1]: systemd-resolved.service: Main process exited, code=exited, status=226/NAMESPACE
Dec 23 23:25:05 localhost systemd[1]: systemd-resolved.service: Failed with result 'exit-code'.
Dec 23 23:25:05 localhost systemd[1]: Failed to start systemd-resolved.service - Network Name Resolution.
So it looks like SELinux is denying us here. But selinux-policy didn't change in the affected compose, so I guess systemd's behaviour changed to something SELinux does not allow. CCing the selinux-policy maintainer.
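For anyone decoding the audit records above, here is a minimal Python sketch (not part of the original report) that pulls out the relevant fields. The two record strings are copied from the systemd-oomd entries in the journal excerpt, with the SYSCALL record cut off after pid=639. The helper names (`fields`, `describe_avc`, `describe_syscall`) are made up for illustration. It assumes x86_64 (arch=c000003e), where syscall 258 is mkdirat and a first argument of -100 is AT_FDCWD.

```python
import errno
import re

# Audit records copied from the journal excerpt (systemd-oomd, pid 639).
AVC = ('AVC avc: denied { add_name } for pid=639 comm="(emd-oomd)" name="tmp" '
       'scontext=system_u:system_r:init_t:s0 '
       'tcontext=system_u:object_r:unlabeled_t:s0 tclass=dir permissive=0')
SYSCALL = ('SYSCALL arch=c000003e syscall=258 success=no exit=-13 a0=ffffff9c '
           'a1=55a74cede800 a2=1ed a3=0 items=0 ppid=1 pid=639')

# Assumption: only the one x86_64 syscall number seen in this report is mapped.
X86_64_SYSCALLS = {258: "mkdirat"}


def fields(record):
    """Return the key=value pairs of an audit record as a dict (quotes stripped)."""
    return {k: v.strip('"') for k, v in re.findall(r'(\w+)=("[^"]*"|\S+)', record)}


def describe_avc(record):
    """Summarise an AVC denial: permissions, source/target types, object class."""
    f = fields(record)
    perms = re.search(r'\{ ([^}]*) \}', record).group(1).split()
    # SELinux contexts are user:role:type:level; the type is the third field.
    return {
        "permissions": perms,
        "source_type": f["scontext"].split(":")[2],
        "target_type": f["tcontext"].split(":")[2],
        "class": f["tclass"],
        "name": f["name"],
        "enforcing": f["permissive"] == "0",
    }


def describe_syscall(record):
    """Summarise a SYSCALL record for an *at() call with a mode in a2."""
    f = fields(record)
    dirfd = int(f["a0"], 16)
    if dirfd >= 2**31:          # reinterpret as a signed 32-bit int
        dirfd -= 2**32
    number = int(f["syscall"])
    return {
        "syscall": X86_64_SYSCALLS.get(number, str(number)),
        "errno": errno.errorcode[-int(f["exit"])],
        "dirfd": dirfd,          # -100 is AT_FDCWD on Linux
        "mode": oct(int(f["a2"], 16)),
    }


if __name__ == "__main__":
    print(describe_avc(AVC))
    print(describe_syscall(SYSCALL))
```

Read this way, the pair of records says that a mkdirat() with mode 0o755, issued by systemd-executor running as init_t, was refused with EACCES because it tried to add the entry "tmp" to a directory labeled unlabeled_t. That lines up with the "Failed to create destination mount point node '/run/systemd/mount-rootfs/var/tmp'" message that follows it in the journal.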
To fix some bugs, those services now use a private namespaced tmpfs instance for /tmp and /var/tmp instead of the one linked from /tmp/systemd-private-xxxx using the global tmpfs, so the policy needs to be updated accordingly.

FEDORA-2025-c146da0204 (systemd-257.2-4.fc42) has been submitted as an update to Fedora 42.
https://bodhi.fedoraproject.org/updates/FEDORA-2025-c146da0204

FEDORA-2025-c146da0204 (systemd-257.2-4.fc42) has been pushed to the Fedora 42 stable repository.
If the problem still persists, please make note of it in this bug report.

We have the work-around in place in systemd, so the problem is suppressed for now.
But we still want to have the selinux policy updated, so I'll reopen the bug.

Let's drop the blocker nomination since it's worked around now.

(In reply to Zbigniew Jędrzejewski-Szmek from comment #5)
> We have the work-around in place in systemd, so the problem is suppressed
> for now.
> But we still want to have the selinux policy updated, so I'll reopen the bug.
By this, do you mean once a different systemd fix is in place? Currently I cannot see any related problem with
selinux-policy-41.29-1.fc42.noarch
systemd-257.2-6.fc42.x86_64

systemd-257.2-4.fc42 reverted the change, or more precisely, dropped the use of the new directive in systemd units.
Presumably, if reverted the revert, the AVC would show up again.

This should be easy to test by adding '[Service] PrivateTmp=disconnected' e.g. systemd-resolved.service.

This bug appears to have been reported against 'rawhide' during the Fedora Linux 42 development cycle.
Changing version to 42.

Hmm, I wanted to drop the revert patch for systemd-258-rc1, but it seems that the selinux policy hasn't been updated yet. So I'll keep it for now. But we don't want to carry it indefinitely.
Ping Zdenek, can we fix this? Thanks.

(In reply to Zbigniew Jędrzejewski-Szmek from comment #8)
> systemd-257.2-4.fc42 reverted the change, or more precisely, dropped the use
> of the new directive in systemd units.
> Presumably, if reverted the revert, the AVC would show up again.
>
> This should be easy to test by adding '[Service] PrivateTmp=disconnected'
> e.g. systemd-resolved.service.
How can I get a better picture, and especially see the AVC denials? The journal is almost of no use.

What is the current reproducer? Setting PrivateTmp to disconnected does not make any difference.

https://github.com/coreos/fedora-coreos-tracker/issues/1857 has more information.
> the bug was specific to certain environments (Cloud and CoreOS, not sure what the common attribute is?)
> Dec 23 23:25:05 localhost audit[639]: AVC avc: denied { add_name } for pid=639 comm="(emd-oomd)" name="tmp" scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=dir permissive=0
This indicates that /var/tmp is unlabelled, which doesn't seem right. So maybe there's some problem with the image construction. (I now tried in a "normal" VM, and indeed the issue does not reproduce.)

I'd like to move this issue forward as soon as there is some data. Audit logs are way better than the journal, but I know the audit service is usually not available on CoreOS. So a clear reproducer is needed.
Certainly an unlabeled file on a normal filesystem does not feel right; such things just should not happen.

I could get the logs easily enough, but the affected images have been garbage-collected for a long time now. The issue was definitely specific to Cloud images; other images were not affected. If you can do a systemd scratch build with the workaround omitted, I might be able to get a Cloud image built somehow or other and test it.

https://src.fedoraproject.org/rpms/systemd/pull-request/210 has the patch dropped.
I just pushed it, so the builds should be done in a few hours.

So I updated and again no denial appears on a common VM, so it seems a special setup is really needed. PrivateTmp is disconnected.
wget \
  https://kojipkgs.fedoraproject.org//work/tasks/1594/135211594/systemd-258~rc1-2.fc43.x86_64.rpm \
  https://kojipkgs.fedoraproject.org//work/tasks/1594/135211594/systemd-libs-258~rc1-2.fc43.x86_64.rpm \
  https://kojipkgs.fedoraproject.org//work/tasks/1594/135211594/systemd-shared-258~rc1-2.fc43.x86_64.rpm \
  https://kojipkgs.fedoraproject.org//work/tasks/1594/135211594/systemd-sysusers-258~rc1-2.fc43.x86_64.rpm \
  https://kojipkgs.fedoraproject.org//work/tasks/1594/135211594/systemd-pam-258~rc1-2.fc43.x86_64.rpm \
  https://kojipkgs.fedoraproject.org//work/tasks/1594/135211594/systemd-udev-258~rc1-2.fc43.x86_64.rpm \
  https://kojipkgs.fedoraproject.org//work/tasks/1594/135211594/systemd-networkd-258~rc1-2.fc43.x86_64.rpm \
  https://kojipkgs.fedoraproject.org//work/tasks/1594/135211594/systemd-resolved-258~rc1-2.fc43.x86_64.rpm

I cannot reproduce this issue on F42 with systemd-257.7-1.fc42.x86_64, nor on F44 with systemd-258~rc3-2.fc44.x86_64.
This commit has probably fixed the issue:
https://github.com/systemd/systemd/pull/36100/commits/c1f6c1f238f121ff70c442323f1541d1115b6e1c
and I can confirm it is in both F42 and F44.
Please check if anything related still appears, otherwise I will close this bz as done.

*** Bug 2333848 has been marked as a duplicate of this bug. ***

Oh, sorry, I never got around to it. I think the systemd-level workaround is probably still in place, so testing on plain Fedora is not sufficient. The PR linked above is still open. I'll *try* and get around to this but I have a zillion other things rn :/

I am not sure if I understand.
If I take all current sources, edit systemd-resolved to contain
[Service]
PrivateTmp=disconnected
and restart the service, no denial appears, no related journal entries, no unlabeled_t anywhere.
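One way to double-check the "no unlabeled_t anywhere" observation is to read the labels directly. The following is only a sketch under stated assumptions: it reads the `security.selinux` extended attribute with Python's `os.getxattr` (Linux only), and the function names and the list of paths are chosen here for illustration. It only sees paths visible in the caller's mount namespace, so it cannot inspect /run/systemd/mount-rootfs inside a service's private namespace.

```python
import os


def selinux_type(context):
    """Return the type field of a user:role:type[:level] SELinux context."""
    parts = context.split(":")
    return parts[2] if len(parts) >= 3 else None


def read_context(path):
    """Read the security.selinux xattr of path; None if it cannot be read."""
    try:
        raw = os.getxattr(path, "security.selinux")
    except (OSError, AttributeError):
        # No SELinux, missing path, no permission, or a non-Linux platform.
        return None
    return raw.rstrip(b"\0").decode()


def unlabeled_paths(paths, reader=read_context):
    """Return the subset of paths whose SELinux type is unlabeled_t."""
    bad = []
    for path in paths:
        context = reader(path)
        if context is not None and selinux_type(context) == "unlabeled_t":
            bad.append(path)
    return bad


if __name__ == "__main__":
    # Hypothetical selection of host paths related to this report.
    print(unlabeled_paths(["/var", "/var/tmp", "/tmp"]))
```

The `reader` parameter exists only so the check can be exercised with canned contexts on a machine without SELinux.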
In the sources I can see
systemd-257.7/src/core/namespace.c
        /* Since mount() will always follow symlinks we chase the symlinks on our own first. Note
         * that bind mount source paths are always relative to the host root, hence we pass NULL as
         * root directory to chase() here. */

        /* When we create implicit mounts, we might need to create the path ourselves as it is on a
         * just-created tmpfs, for example. */
        if (m->create_source_dir) {
                r = mkdir_p(mount_entry_source(m), m->source_dir_mode);
                if (r < 0)
                        return log_debug_errno(r, "Failed to create source directory %s: %m", mount_entry_source(m));

                r = label_fix_full(AT_FDCWD, mount_entry_source(m), mount_entry_unprefixed_path(m), /* flags= */ 0);
so I suppose the issue cannot manifest now, whether https://src.fedoraproject.org/rpms/systemd/pull-request/210 is merged or not.
see also
(In reply to Zbigniew Jędrzejewski-Szmek from comment #8)
> systemd-257.2-4.fc42 reverted the change, or more precisely, dropped the use
> of the new directive in systemd units.
> Presumably, if reverted the revert, the AVC would show up again.
>
> This should be easy to test by adding '[Service] PrivateTmp=disconnected'
> e.g. systemd-resolved.service.
Zbyszek, are you still expecting an SELinux policy change here?

Closing. If you think the issue is still in place, please reopen or create a new bug.

Considering what Zdenek wrote, I think it's all good. I'll try dropping the work-around patch in one of the builds when we're in a known-good state.