As of F41 (I double-upgraded from F37 -> F39 -> F41 in a couple of days so I don't actually know when this started), podman appears to default to pasta for networking, which has a bunch of AVCs on my (unconfined-disabled) config. Here's what happens when I do a podman build command; nothing comparable happens if I add "--network=slirp4netns" to the build command:

type=AVC msg=audit(1733378448.443:31171): avc: denied { setgid } for pid=650400 comm="pasta.avx2" capability=6 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tclass=cap_userns permissive=1
type=AVC msg=audit(1733378448.444:31172): avc: denied { setuid } for pid=650400 comm="pasta.avx2" capability=7 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tclass=cap_userns permissive=1
type=AVC msg=audit(1733378448.444:31173): avc: denied { read } for pid=650400 comm="pasta.avx2" name="ns" dev="proc" ino=2908438 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1733378448.444:31173): avc: denied { open } for pid=650400 comm="pasta.avx2" path="/proc/650397/ns" dev="proc" ino=2908438 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1733378449.444:31179): avc: denied { read } for pid=650408 comm="pasta.avx2" name="net" dev="proc" ino=2908439 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c340,c874 tclass=lnk_file permissive=1
type=AVC msg=audit(1733378457.444:31195): avc: denied { read } for pid=650408 comm="pasta.avx2" name="net" dev="proc" ino=2908439 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c340,c874 tclass=lnk_file permissive=1
type=AVC msg=audit(1733378462.444:31212): avc: denied { read } for pid=650408 comm="pasta.avx2" name="net" dev="proc" ino=2908439 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c340,c874 tclass=lnk_file permissive=1
type=AVC msg=audit(1733378471.444:31230): avc: denied { read } for pid=650408 comm="pasta.avx2" name="net" dev="proc" ino=2908439 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c340,c874 tclass=lnk_file permissive=1
type=AVC msg=audit(1733378475.756:31236): avc: denied { setgid } for pid=651281 comm="pasta.avx2" capability=6 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tclass=cap_userns permissive=1
type=AVC msg=audit(1733378475.756:31237): avc: denied { setuid } for pid=651281 comm="pasta.avx2" capability=7 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tclass=cap_userns permissive=1
type=AVC msg=audit(1733378475.757:31238): avc: denied { read } for pid=651281 comm="pasta.avx2" name="ns" dev="proc" ino=2902721 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1733378475.757:31238): avc: denied { open } for pid=651281 comm="pasta.avx2" path="/proc/651261/ns" dev="proc" ino=2902721 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1733378476.444:31239): avc: denied { read } for pid=650408 comm="pasta.avx2" name="net" dev="proc" ino=2908439 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c340,c874 tclass=lnk_file permissive=1
type=AVC msg=audit(1733378476.757:31240): avc: denied { read } for pid=651292 comm="pasta.avx2" name="net" dev="proc" ino=2902722 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c383,c810 tclass=lnk_file permissive=1
type=AVC msg=audit(1733378478.196:31246): avc: denied { setgid } for pid=651523 comm="pasta.avx2" capability=6 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tclass=cap_userns permissive=1
type=AVC msg=audit(1733378478.196:31247): avc: denied { setuid } for pid=651523 comm="pasta.avx2" capability=7 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tclass=cap_userns permissive=1
type=AVC msg=audit(1733378479.196:31250): avc: denied { read } for pid=651530 comm="pasta.avx2" name="net" dev="proc" ino=2909444 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c383,c810 tclass=lnk_file permissive=1
type=AVC msg=audit(1733378479.444:31251): avc: denied { read } for pid=650408 comm="pasta.avx2" name="net" dev="proc" ino=2908439 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c340,c874 tclass=lnk_file permissive=1
type=AVC msg=audit(1733378481.634:31252): avc: denied { read } for pid=651770 comm="pasta.avx2" name="ns" dev="proc" ino=2909598 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1733378481.634:31252): avc: denied { open } for pid=651770 comm="pasta.avx2" path="/proc/651769/ns" dev="proc" ino=2909598 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1733378482.320:31256): avc: denied { setgid } for pid=651955 comm="pasta.avx2" capability=6 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tclass=cap_userns permissive=1
type=AVC msg=audit(1733378482.320:31257): avc: denied { setuid } for pid=651955 comm="pasta.avx2" capability=7 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tclass=cap_userns permissive=1
type=AVC msg=audit(1733378482.320:31258): avc: denied { read } for pid=651955 comm="pasta.avx2" name="ns" dev="proc" ino=2904841 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1733378482.320:31258): avc: denied { open } for pid=651955 comm="pasta.avx2" path="/proc/651954/ns" dev="proc" ino=2904841 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1733378482.444:31259): avc: denied { read } for pid=650408 comm="pasta.avx2" name="net" dev="proc" ino=2908439 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c340,c874 tclass=lnk_file permissive=1
type=AVC msg=audit(1733378483.444:31275): avc: denied { read } for pid=650408 comm="pasta.avx2" name="net" dev="proc" ino=2908439 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c340,c874 tclass=lnk_file permissive=1

Reproducible: Always

Steps to Reproduce:
See details

Actual Results:
I have to use `setenforce 0` to run `podman build`

Expected Results:
Should work when enforcing
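A log like the one above is easier to triage once the repeated records are collapsed. The following is only a sketch of one way to do that, not a replacement for tools such as audit2allow: it counts denials per (permission, source type, target type, class). The names AVC_RE, selinux_type and summarize_avcs are made up for this example, and it assumes each record has the scontext/tcontext/tclass layout shown in the report.

```python
import re
from collections import Counter

# Pulls the permission list, both contexts and the object class out of one
# AVC record. DOTALL lets a record that was wrapped across lines still match.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}.*?"
    r"scontext=(?P<scontext>\S+)\s+tcontext=(?P<tcontext>\S+)\s+"
    r"tclass=(?P<tclass>\S+)",
    re.DOTALL,
)


def selinux_type(context):
    """Return the type, i.e. the third colon-separated field of a context."""
    return context.split(":")[2]


def summarize_avcs(text):
    """Count denials per (permission, source type, target type, class)."""
    counts = Counter()
    for match in AVC_RE.finditer(text):
        source = selinux_type(match.group("scontext"))
        target = selinux_type(match.group("tcontext"))
        for perm in match.group("perms").split():
            counts[(perm, source, target, match.group("tclass"))] += 1
    return counts
```

Feeding it the saved audit output and printing `summarize_avcs(text).most_common()` reduces the log to a handful of distinct denials, which makes it easier to see which ones concern pasta_t itself and which concern the target process's type.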
Really no idea if podman or passt was the right component here.
Do you have container-selinux installed? What is the label on podman (ls -Z /usr/bin/podman)? When started from podman, pasta does not run under its own profile AFAIK; it uses the podman one (container_runtime_t) and not pasta_t, so this should not cause such problems. Problems with pasta_t would be an issue with pasta (passt rpm). Ideally the default pasta_t profile would work when run from podman and we could stop running pasta with container_runtime_t. cc @sbrivio
rlpowell@stodi> rpm -qa | grep -iP '(podman|passt|container)' | sort
container-selinux-2.234.2-1.fc41.noarch
container-storage-setup-0.11.0-16.dev.git413b408.fc41.noarch
containernetworking-plugins-1.6.0-1.fc41.x86_64
containers-common-0.61.0-1.fc41.noarch
containers-common-extra-0.61.0-1.fc41.noarch
passt-0^20241127.gc0fbc7e-1.fc41.x86_64
passt-selinux-0^20241127.gc0fbc7e-1.fc41.noarch
plexus-containers-component-annotations-2.2.0-4.fc41.noarch
podman-5.3.1-1.fc41.x86_64
podman-plugins-4.9.4-1.fc39.x86_64
systemd-container-256.9-2.fc41.x86_64

rlpowell@stodi> ls -Z /usr/bin/podman
system_u:object_r:container_runtime_exec_t:s0 /usr/bin/podman

I already tried re-installing container-selinux. I haven't forced a relabel, because all the relevant labels look good, and also it only breaks deep in the process.

I have been running like this (unconfined disabled) for some time now and am reasonably experienced; feel free to ask me to try complex stuff if it helps.
Do y'all need any further information?
(In reply to Robin Powell from comment #4)
> Do y'all need any further information?

Sorry, I don't know yet, I still need to look into this, that should happen soon.
Robin, I'm finally looking into this and I have a question. As Paul mentioned, running pasta as pasta_t will generally not work for Podman usage (we'll have to take care of https://bugs.passt.top/show_bug.cgi?id=81 first -- patches welcome, of course), but the current trick (and reason why that issue is not *that* critical) is that pasta will run as container_runtime_t when used by Podman, and things work. This is not happening in your case, but I'm failing to reproduce it, hence the question, finally: how did you set up your unconfined disabled policy? Is there something in particular you touched in the container-selinux policy, or base policy only...? And if yes, what exactly?
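To check which domain a running pasta or crun process actually ended up in (container_runtime_t, pasta_t or unconfined_t), `ps -Z` is the usual tool. For scripting the same check, a rough sketch could read the context from /proc/PID/attr/current. The helper names selinux_type and process_domain are invented for this example, and it assumes SELinux is the active LSM so that the file holds a user:role:type:range context.

```python
from pathlib import Path


def selinux_type(context):
    """Extract the type (third field) from a user:role:type[:range] context."""
    fields = context.strip("\x00 \t\n").split(":")
    if len(fields) < 3:
        raise ValueError(f"not an SELinux context: {context!r}")
    return fields[2]


def process_domain(pid):
    """Return the SELinux type a process is running as.

    Assumes /proc/PID/attr/current is provided by SELinux; on hosts without
    SELinux the read may fail or the content may not be a context at all.
    """
    raw = Path(f"/proc/{pid}/attr/current").read_text()
    return selinux_type(raw)
```

For example, `process_domain(pid_of_pasta)` returning "pasta_t" rather than "container_runtime_t" would match the behaviour reported in this bug.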
The basic step is `semodule -d unconfined`. I then do a bunch of stuff to paper over various issues that result from that. Let me see about getting you a minimal repro that has a minimum of system changes.
(In reply to Robin Powell from comment #7)
> The basic step is `semodule -d unconfined`. I then do a bunch of stuff to
> paper over various issues that result from that. Let me see about getting
> you a minimal repro that has a minimum of system changes.

Oh, actually, no need! I was able to reproduce this with `semodule -d unconfined` and then running (not building) a simple container.
Oh, yay! :D
Observations (element of surprise at 5.):

1. /usr/bin/slirp4netns has type bin_t, /usr/bin/pasta has type pasta_exec_t (both expected: slirp4netns doesn't have its own SELinux policy, pasta does)

2. with the "unconfined" policy module loaded, both slirp4netns and pasta run as type container_runtime_t. This is expected for both slirp4netns and for pasta (but we want to change it for pasta because it's unnecessarily loose, see https://bugs.passt.top/show_bug.cgi?id=81)

3. the "unconfined" policy module uses, as optional_policy:

     container_runtime_domtrans(unconfined_service_t)

   which calls, in turn:

     domtrans_pattern($1, container_runtime_exec_t, container_runtime_t)

4. container-selinux ships an optional policy (in container.te) with:

     optional_policy(`
         gen_require(`
             role unconfined_r;
             type unconfined_service_t;
             type unconfined_service_exec_t;
         ')
         [...]
         container_runtime_domtrans(unconfined_service_t)
         [...]

   but with the "unconfined" module not loaded, as far as I can tell, unconfined_service_t is not provided. As a result of this and 3., if the unconfined module is not loaded, there's no defined domain transition to container_runtime_t

5. as a result of 4., with the "unconfined" policy module *not* loaded, slirp4netns runs with type unconfined_t (!), while pasta runs as pasta_t (not expected, but it's something we want to fix anyway, https://bugs.passt.top/show_bug.cgi?id=81).

   It's not only slirp4netns: this also happens with crun. Example with unconfined module loaded:

     # ps -auxZ|grep crun
     unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 sbrivio 861368 0.0 0.0 5668 1972 ? Ss 23:36 0:00 /usr/bin/crun --systemd-cgroup create --bundle /var/tmp/buildah469275804 --pid-file /var/tmp/buildah469275804/pid --no-new-keyring buildah-buildah469275804

   and with the unconfined module *not* loaded:

     # ps -auxZ|grep crun
     unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 sbrivio 857431 0.0 0.0 5668 1908 ? Ss 23:34 0:00 /usr/bin/crun --systemd-cgroup create --bundle /var/tmp/buildah1225367596 --pid-file /var/tmp/buildah1225367596/pid --no-new-keyring buildah-buildah1225367596

   ...no, I didn't swap these two examples.

6. as reported by Robin, similar to what Paul reported at https://bugs.passt.top/show_bug.cgi?id=81#c0, if we build a container without the unconfined module, we get (now with pasta's debug output):

     $ podman build --net=pasta:-d -t test .
     [...]
     0.0027: netns dir open: Permission denied, exiting

   and:

     type=AVC msg=audit(1734995687.692:66388): avc: denied { read } for pid=853340 comm="pasta.avx2" name="ns" dev="proc" ino=3930918 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=0

   with an option like --netns /proc/853339/ns/net, where 853339 is the PID of crun, and tcontext is unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 (note the unconfined_t type) because crun runs as unconfined now, see 5.

   We get further denials (reported by Robin): those are all related to accessing the network namespace directory at /proc/853339/ns/net.

7. as reported by Robin, same as reported by Paul at https://bugs.passt.top/show_bug.cgi?id=81#c0, we get further denials for setgid() and setuid(), which are actually harmless. Example in enforcing mode:

     type=AVC msg=audit(1734958422.982:66249): avc: denied { setgid } for pid=840077 comm="passt.avx2" capability=6 scontext=unconfined_u:unconfined_r:passt_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:passt_t:s0-s0:c0.c1023 tclass=cap_userns permissive=0
     type=AVC msg=audit(1734958422.983:66250): avc: denied { setgid } for pid=840077 comm="passt.avx2" capability=6 scontext=unconfined_u:unconfined_r:passt_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:passt_t:s0-s0:c0.c1023 tclass=cap_userns permissive=0
     type=AVC msg=audit(1734958422.983:66251): avc: denied { setuid } for pid=840077 comm="passt.avx2" capability=7 scontext=unconfined_u:unconfined_r:passt_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:passt_t:s0-s0:c0.c1023 tclass=cap_userns permissive=0

   ...note that pasta happily continues after this. That's because, for simplicity, we call setgroups(), setgid(), and setuid() using the current UID and GID, in pasta's isolate_user():

     https://passt.top/passt/tree/isolation.c?id=e5ba8adef71ec53e192373ed1267dc338719dda0#n218

   ...we could skip those calls if we're already not running as root and no --runas option is given, but, POSIXly (1003.1-2024):

     The setgid() function shall fail if:

     [EINVAL] The value of the gid argument is invalid and is not supported by the implementation.

     [EPERM] The process does not have appropriate privileges and gid does not match the real group ID or the saved set-group-ID.

   note the "and gid does not match [...]" part. Same for setuid(). That's why we unconditionally call those. This is different for setgroups() (not POSIX), and that's why we explicitly handle EPERM for it.

So all in all I see three issues:

a. disabling the "unconfined" module shouldn't lead to processes started by Podman running as unconfined (!) instead of container_runtime_t. This is the case for slirp4netns and crun (see observations 4. and 5. above). It also shouldn't lead to pasta running as pasta_t instead of container_runtime_t.

   This is the actual issue in this ticket and, to solve it, we should ensure the transition to container_runtime_t even if the unconfined module is not loaded. I can send a patch (for container-selinux I suppose) if desired

b. if pasta runs as type pasta_t, it won't be able to open /proc/PID/ns/net, where PID is the PID of crun. This is already tracked at https://bugs.passt.top/show_bug.cgi?id=81.

   I can't fix this entirely in pasta's policy, because pasta's policy doesn't know where the network namespace reference to open is stored. I'll have to add an interface that Podman could use to allow pasta to open the network namespace.

   I can prioritise fixing this, but probably we should fix a. anyway, and with a. fixed this is somewhat less important

c. SELinux logs denials for setuid() and setgid() operations which are not actually failing in their POSIX sense, and those are quite confusing.

   While we could simply drop the gratuitous setuid() and setgid() calls from pasta, I would argue that the proper fix is to partially revert (kernel) commit 5626d3e86141 ("selinux: remove hooks which simply defer to capabilities"), and modify the return value of the (old/new) selinux_task_setuid() and selinux_task_setgid() to also factor in the check performed by __is_setuid() / __is_setgid().

   I'll send a patch for this unless there are objections.
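As a small illustration of the POSIX semantics behind issue c., the sketch below re-applies the current real GID and UID from an unprivileged process. This is not pasta's code (that is C, in isolate_user()); it just uses Python's thin wrappers around the same system calls, and the function name reapply_current_ids is made up for the example. The assumption is that the IDs are valid in the current user namespace, in which case neither call should fail with EPERM even without CAP_SETGID or CAP_SETUID.

```python
import os


def reapply_current_ids():
    """Set the GID and UID to their current real values.

    Per POSIX, setgid()/setuid() only fail with EPERM when the caller lacks
    privileges *and* the requested ID differs from the real or saved ID, so
    these calls are expected to succeed for an unprivileged caller.
    """
    gid = os.getgid()
    uid = os.getuid()
    os.setgid(gid)  # group first, as is conventional when changing both
    os.setuid(uid)
    return uid, gid
```

Since both calls succeed, an AVC record for setuid or setgid logged around them does not correspond to any error the process sees, which matches pasta continuing normally in the enforcing-mode example above.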
@lsm5 @dwalsh Thoughts on a), the container_runtime_t transition issue?

b) of course would be nice to fix eventually so we can run pasta more locked down.

c) I don't have enough context for the kernel parts. But while it might be simpler in your code to make those calls, skipping them would save two syscalls. Given they are only made once at pasta start-up, the difference is most likely not noticeable, so it should not matter much.
I would advise SELinux-policy to add:

$ cat mypol.te

policy_module(mypol, 1.0)

require {
	type passt_t;
	type pasta_t;
}

#============= passt_t ==============
allow passt_t self:cap_userns { setgid setuid };

#============= pasta_t ==============
domain_getattr_all_entry_files(pasta_t)
allow pasta_t self:cap_userns { setgid setuid };
unconfined_list_dirs(pasta_t)
(In reply to Daniel Walsh from comment #12)
> I would advise SELinux-policy to add:
>
> $ cat mypol.te
>
> policy_module(mypol, 1.0)
>
> require {
> 	type passt_t;
> 	type pasta_t;
> }
>
> #============= passt_t ==============
> allow passt_t self:cap_userns { setgid setuid };
>
> #============= pasta_t ==============
> domain_getattr_all_entry_files(pasta_t)
> allow pasta_t self:cap_userns { setgid setuid };
> unconfined_list_dirs(pasta_t)

Hmm, why? Note that passt and pasta ship their own SELinux policies.

As to setgid and setuid, see my point c. in comment #10: I don't think they should be granted.