Bug 2330512 - podman build causes AVCs with default pasta networking [NEEDINFO]
Summary: podman build causes AVCs with default pasta networking
Keywords:
Status: ASSIGNED
Alias: None
Product: Fedora
Classification: Fedora
Component: selinux-policy
Version: 41
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Zdenek Pytela
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2024-12-05 06:12 UTC by Robin Powell
Modified: 2025-01-07 12:34 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed:
Type: ---
Embargoed:
sbrivio: needinfo? (sbrivio)
pholzing: needinfo? (lsm5)
sbrivio: needinfo? (dwalsh)


Description Robin Powell 2024-12-05 06:12:46 UTC
As of F41 (I double-upgraded from F37 -> F39 -> F41 in a couple of days so I don't actually know when this started), podman appears to default to pasta for networking, which has a bunch of AVCs on my (unconfined-disabled) config.  Here's what happens when I do a podman build command; nothing comparable happens if I add "--network=slirp4netns" to the build command:

type=AVC msg=audit(1733378448.443:31171): avc:  denied  { setgid } for  pid=650400 comm="pasta.avx2" capability=6  scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tclass=cap_userns permissive=1
type=AVC msg=audit(1733378448.444:31172): avc:  denied  { setuid } for  pid=650400 comm="pasta.avx2" capability=7  scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tclass=cap_userns permissive=1
type=AVC msg=audit(1733378448.444:31173): avc:  denied  { read } for  pid=650400 comm="pasta.avx2" name="ns" dev="proc" ino=2908438 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1733378448.444:31173): avc:  denied  { open } for  pid=650400 comm="pasta.avx2" path="/proc/650397/ns" dev="proc" ino=2908438 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1733378449.444:31179): avc:  denied  { read } for  pid=650408 comm="pasta.avx2" name="net" dev="proc" ino=2908439 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c340,c874 tclass=lnk_file permissive=1
type=AVC msg=audit(1733378457.444:31195): avc:  denied  { read } for  pid=650408 comm="pasta.avx2" name="net" dev="proc" ino=2908439 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c340,c874 tclass=lnk_file permissive=1
type=AVC msg=audit(1733378462.444:31212): avc:  denied  { read } for  pid=650408 comm="pasta.avx2" name="net" dev="proc" ino=2908439 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c340,c874 tclass=lnk_file permissive=1
type=AVC msg=audit(1733378471.444:31230): avc:  denied  { read } for  pid=650408 comm="pasta.avx2" name="net" dev="proc" ino=2908439 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c340,c874 tclass=lnk_file permissive=1
type=AVC msg=audit(1733378475.756:31236): avc:  denied  { setgid } for  pid=651281 comm="pasta.avx2" capability=6  scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tclass=cap_userns permissive=1
type=AVC msg=audit(1733378475.756:31237): avc:  denied  { setuid } for  pid=651281 comm="pasta.avx2" capability=7  scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tclass=cap_userns permissive=1
type=AVC msg=audit(1733378475.757:31238): avc:  denied  { read } for  pid=651281 comm="pasta.avx2" name="ns" dev="proc" ino=2902721 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1733378475.757:31238): avc:  denied  { open } for  pid=651281 comm="pasta.avx2" path="/proc/651261/ns" dev="proc" ino=2902721 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1733378476.444:31239): avc:  denied  { read } for  pid=650408 comm="pasta.avx2" name="net" dev="proc" ino=2908439 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c340,c874 tclass=lnk_file permissive=1
type=AVC msg=audit(1733378476.757:31240): avc:  denied  { read } for  pid=651292 comm="pasta.avx2" name="net" dev="proc" ino=2902722 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c383,c810 tclass=lnk_file permissive=1
type=AVC msg=audit(1733378478.196:31246): avc:  denied  { setgid } for  pid=651523 comm="pasta.avx2" capability=6  scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tclass=cap_userns permissive=1
type=AVC msg=audit(1733378478.196:31247): avc:  denied  { setuid } for  pid=651523 comm="pasta.avx2" capability=7  scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tclass=cap_userns permissive=1
type=AVC msg=audit(1733378479.196:31250): avc:  denied  { read } for  pid=651530 comm="pasta.avx2" name="net" dev="proc" ino=2909444 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c383,c810 tclass=lnk_file permissive=1
type=AVC msg=audit(1733378479.444:31251): avc:  denied  { read } for  pid=650408 comm="pasta.avx2" name="net" dev="proc" ino=2908439 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c340,c874 tclass=lnk_file permissive=1
type=AVC msg=audit(1733378481.634:31252): avc:  denied  { read } for  pid=651770 comm="pasta.avx2" name="ns" dev="proc" ino=2909598 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1733378481.634:31252): avc:  denied  { open } for  pid=651770 comm="pasta.avx2" path="/proc/651769/ns" dev="proc" ino=2909598 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1733378482.320:31256): avc:  denied  { setgid } for  pid=651955 comm="pasta.avx2" capability=6  scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tclass=cap_userns permissive=1
type=AVC msg=audit(1733378482.320:31257): avc:  denied  { setuid } for  pid=651955 comm="pasta.avx2" capability=7  scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tclass=cap_userns permissive=1
type=AVC msg=audit(1733378482.320:31258): avc:  denied  { read } for  pid=651955 comm="pasta.avx2" name="ns" dev="proc" ino=2904841 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1733378482.320:31258): avc:  denied  { open } for  pid=651955 comm="pasta.avx2" path="/proc/651954/ns" dev="proc" ino=2904841 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1733378482.444:31259): avc:  denied  { read } for  pid=650408 comm="pasta.avx2" name="net" dev="proc" ino=2908439 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c340,c874 tclass=lnk_file permissive=1
type=AVC msg=audit(1733378483.444:31275): avc:  denied  { read } for  pid=650408 comm="pasta.avx2" name="net" dev="proc" ino=2908439 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c340,c874 tclass=lnk_file permissive=1
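
The log above repeats a small number of distinct denials many times. As a reading aid, here is a minimal Python sketch that collapses AVC records into counts per (source type, target type, class, permission). It is not part of the original report: the names `AVC_RE`, `selinux_type` and `summarize_avcs` are made up for illustration, and the regular expression only assumes the field layout visible in the records quoted here.

```python
import re
from collections import Counter

# Fields of an AVC record that identify a denial (layout as in the log above).
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]+?)\s*\}.*?"
    r"scontext=(?P<scontext>\S+)\s+tcontext=(?P<tcontext>\S+)\s+tclass=(?P<tclass>\S+)"
)


def selinux_type(context):
    """Return the type, i.e. the third colon-separated field of a context."""
    return context.split(":")[2]


def summarize_avcs(lines):
    """Count denials per (source type, target type, class, permission)."""
    counts = Counter()
    for line in lines:
        match = AVC_RE.search(line)
        if match is None:
            continue  # not an AVC denial record
        for perm in match.group("perms").split():
            counts[(
                selinux_type(match.group("scontext")),
                selinux_type(match.group("tcontext")),
                match.group("tclass"),
                perm,
            )] += 1
    return counts


# Two records copied from the log above.
SAMPLE = [
    'type=AVC msg=audit(1733378448.443:31171): avc:  denied  { setgid } for  pid=650400 comm="pasta.avx2" capability=6  scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tclass=cap_userns permissive=1',
    'type=AVC msg=audit(1733378449.444:31179): avc:  denied  { read } for  pid=650408 comm="pasta.avx2" name="net" dev="proc" ino=2908439 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c340,c874 tclass=lnk_file permissive=1',
]

for key, count in summarize_avcs(SAMPLE).most_common():
    print(count, *key)
```

Feeding it the whole log should leave only a handful of distinct tuples, all with pasta_t as the source type.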


Reproducible: Always

Steps to Reproduce:
See details
Actual Results:  
I have to use `setenforce 0` to run `podman build`

Expected Results:  
Should work when enforcing

Comment 1 Robin Powell 2024-12-05 06:13:16 UTC
Really no idea if podman or passt was the right component here.

Comment 2 Paul Holzinger 2024-12-05 11:15:48 UTC
Do you have container-selinux installed? What is the label on podman (ls -Z /usr/bin/podman)?
When started from podman, pasta does not run under its own profile AFAIK; it uses the podman one (container_runtime_t) rather than pasta_t, so this should not cause such problems.

Problems with pasta_t would be an issue with pasta (passt rpm). Ideally, the default pasta_t profile would work when run from podman and we could stop running pasta with container_runtime_t.
cc @sbrivio

Comment 3 Robin Powell 2024-12-05 16:24:45 UTC
rlpowell@stodi> rpm -qa | grep -iP '(podman|passt|container)' | sort
container-selinux-2.234.2-1.fc41.noarch
container-storage-setup-0.11.0-16.dev.git413b408.fc41.noarch
containernetworking-plugins-1.6.0-1.fc41.x86_64
containers-common-0.61.0-1.fc41.noarch
containers-common-extra-0.61.0-1.fc41.noarch
passt-0^20241127.gc0fbc7e-1.fc41.x86_64
passt-selinux-0^20241127.gc0fbc7e-1.fc41.noarch
plexus-containers-component-annotations-2.2.0-4.fc41.noarch
podman-5.3.1-1.fc41.x86_64
podman-plugins-4.9.4-1.fc39.x86_64
systemd-container-256.9-2.fc41.x86_64

rlpowell@stodi> ls -Z /usr/bin/podman
system_u:object_r:container_runtime_exec_t:s0 /usr/bin/podman

I already tried re-installing container-selinux.  I haven't forced a relabel, because all the relevant labels look good, and also it only breaks deep in the process.

I have been running like this (unconfined disabled) for some time now and am reasonably experienced; feel free to ask me to try complex stuff if it helps.

Comment 4 Robin Powell 2024-12-11 02:04:35 UTC
Do y'all need any further information?

Comment 5 Stefano Brivio 2024-12-11 10:29:32 UTC
(In reply to Robin Powell from comment #4)
> Do y'all need any further information?

Sorry, I don't know yet, I still need to look into this, that should happen soon.

Comment 6 Stefano Brivio 2024-12-19 18:30:49 UTC
Robin, I'm finally looking into this and I have a question. As Paul mentioned, running pasta as pasta_t will generally not work for Podman usage (we'll have to take care of https://bugs.passt.top/show_bug.cgi?id=81 first -- patches welcome, of course), but the current trick (and reason why that issue is not *that* critical) is that pasta will run as container_runtime_t when used by Podman, and things work.

This is not happening in your case, but I'm failing to reproduce it, hence the question, finally: how did you set up your unconfined disabled policy? Is there something in particular you touched in the container-selinux policy, or base policy only...? And if yes, what exactly?

Comment 7 Robin Powell 2024-12-19 19:13:45 UTC
The basic step is `semodule -d unconfined`.  I then do a bunch of stuff to paper over various issues that result from that.  Let me see about getting you a minimal repro that has a minimum of system changes.

Comment 8 Stefano Brivio 2024-12-20 14:03:10 UTC
(In reply to Robin Powell from comment #7)
> The basic step is `semodule -d unconfined`.  I then do a bunch of stuff to
> paper over various issues that result from that.  Let me see about getting
> you a minimal repro that has a minimum of system changes.

Oh, actually, no need! I was able to reproduce this with `semodule -d unconfined` and then running (not building) a simple container.

Comment 9 Robin Powell 2024-12-20 19:20:31 UTC
Oh, yay!  :D

Comment 10 Stefano Brivio 2024-12-24 14:23:28 UTC
Observations (element of surprise at 5.):

1. /usr/bin/slirp4netns has type bin_t, /usr/bin/pasta has type pasta_exec_t
   (both expected: slirp4netns doesn't have its own SELinux policy, pasta does)

2. with the "unconfined" policy module loaded, both slirp4netns
   and pasta run as type container_runtime_t. This is expected for both
   slirp4netns and for pasta (but we want to change it for pasta because it's
   unnecessarily loose, see https://bugs.passt.top/show_bug.cgi?id=81)

3. the "unconfined" policy module uses, as optional_policy:

       container_runtime_domtrans(unconfined_service_t)

   which calls, in turn:

       domtrans_pattern($1, container_runtime_exec_t, container_runtime_t)

4. container-selinux ships an optional policy (in container.te) with:

       optional_policy(`
               gen_require(`
                       role unconfined_r;
                       type unconfined_service_t;
                       type unconfined_service_exec_t;
               ')

       [...]

       container_runtime_domtrans(unconfined_service_t)

       [...]

   but with the "unconfined" module not loaded, as far as I can tell,
   unconfined_service_t is not provided. As a result of this and 3., if the
   unconfined module is not loaded, there's no defined domain transition to
   container_runtime_t

5. as a result of 4., with the "unconfined" policy module *not* loaded,
   slirp4netns runs with type unconfined_t (!), while pasta runs as pasta_t (not
   expected, but it's something we want to fix anyway,
   https://bugs.passt.top/show_bug.cgi?id=81).

   It's not only slirp4netns: this also happens with crun. Example with
   unconfined module loaded:

       # ps -auxZ|grep crun
       unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 sbrivio 861368 0.0  0.0 5668 1972 ? Ss 23:36   0:00 /usr/bin/crun --systemd-cgroup create --bundle /var/tmp/buildah469275804 --pid-file /var/tmp/buildah469275804/pid --no-new-keyring buildah-buildah469275804

   and with the unconfined module *not* loaded:

       # ps -auxZ|grep crun
       unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 sbrivio 857431 0.0  0.0 5668 1908 ? Ss 23:34   0:00 /usr/bin/crun --systemd-cgroup create --bundle /var/tmp/buildah1225367596 --pid-file /var/tmp/buildah1225367596/pid --no-new-keyring buildah-buildah1225367596

   ...no, I didn't swap these two examples.

6. as reported by Robin, similar to what Paul reported at
   https://bugs.passt.top/show_bug.cgi?id=81#c0, if we build a container without
   the unconfined module, we get (now with pasta's debug output):

       $ podman build --net=pasta:-d -t test .
       [...]
       0.0027: netns dir open: Permission denied, exiting

   and:

       type=AVC msg=audit(1734995687.692:66388): avc:  denied  { read } for  pid=853340 comm="pasta.avx2" name="ns" dev="proc" ino=3930918 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=0

   with an option like --netns /proc/853339/ns/net, where 853339 is the PID of
   crun, and tcontext is unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
   (note the unconfined_t type) because crun runs as unconfined now, see 5.

   We get further denials (reported by Robin): those are all related to
   accessing the network namespace directory at /proc/853339/ns/net.

7. as reported by Robin, same as reported by Paul at
   https://bugs.passt.top/show_bug.cgi?id=81#c0, we get further denials for
   setgid() and setuid(), which are actually harmless. Example in enforcing
   mode:

      type=AVC msg=audit(1734958422.982:66249): avc:  denied  { setgid } for  pid=840077 comm="passt.avx2" capability=6  scontext=unconfined_u:unconfined_r:passt_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:passt_t:s0-s0:c0.c1023 tclass=cap_userns permissive=0
      type=AVC msg=audit(1734958422.983:66250): avc:  denied  { setgid } for  pid=840077 comm="passt.avx2" capability=6  scontext=unconfined_u:unconfined_r:passt_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:passt_t:s0-s0:c0.c1023 tclass=cap_userns permissive=0
      type=AVC msg=audit(1734958422.983:66251): avc:  denied  { setuid } for  pid=840077 comm="passt.avx2" capability=7  scontext=unconfined_u:unconfined_r:passt_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:passt_t:s0-s0:c0.c1023 tclass=cap_userns permissive=0

   ...note that pasta happily continues after this. That's because, for
   simplicity, we call setgroups(), setgid(), and setuid() using the current UID
   and GID, in pasta's isolate_user():

      https://passt.top/passt/tree/isolation.c?id=e5ba8adef71ec53e192373ed1267dc338719dda0#n218

   ...we could skip those calls if we're already not running as root and no --runas
   option is given, but, POSIXly (1003.1-2024):

       The setgid() function shall fail if:

       [EINVAL]
           The value of the gid argument is invalid and is not supported by the implementation.
       [EPERM]
           The process does not have appropriate privileges and gid does not match the real group ID or the saved set-group-ID.

   note the "and gid does not match [...]" part. Same for setuid(). That's why
   we unconditionally call those. This is different for setgroups() (not POSIX),
   and that's why we explicitly handle EPERM for it.
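
The POSIX behaviour described in observation 7. can be illustrated with a short Python sketch. It is not taken from pasta's code: `reassert_current_ids` is a hypothetical helper that calls setgid() and setuid() with the IDs the process already has, which an unprivileged process is allowed to do. It does not reproduce the AVC itself, and on an unusual setup (for example, unmapped IDs in a user namespace) the calls could still fail, so errors are collected rather than raised.

```python
import os


def reassert_current_ids():
    """Call setgid()/setuid() with the IDs the process already has.

    Returns a dict mapping the call name to None on success, or to the
    OSError that was raised.
    """
    results = {}
    # Group ID first, then user ID, the same order as mentioned above.
    for name, setter, getter in (
        ("setgid", os.setgid, os.getgid),
        ("setuid", os.setuid, os.getuid),
    ):
        try:
            setter(getter())
            results[name] = None
        except OSError as err:
            results[name] = err
    return results


for call, error in reassert_current_ids().items():
    print(call, "ok" if error is None else f"failed: {error}")
```

Since the IDs passed in match the current ones, the process credentials are the same before and after the calls.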

So all in all I see three issues:

a. disabling the "unconfined" module shouldn't cause processes started by
   Podman to run as unconfined (!) instead of container_runtime_t. This is the
   case for slirp4netns and crun (see observations 4. and 5. above).
   
   It also shouldn't cause pasta to run as pasta_t instead of
   container_runtime_t.

   This is the actual issue in this ticket and, to solve it, we should ensure
   the transition to container_runtime_t even if the unconfined module is not
   loaded.

   I can send a patch (for container-selinux I suppose) if desired

b. if pasta runs as type pasta_t, it won't be able to open /proc/PID/ns/net,
   where PID is the PID of crun. This is already tracked at
   https://bugs.passt.top/show_bug.cgi?id=81.

   I can't fix this entirely in pasta's policy, because pasta's policy doesn't
   know where the network namespace reference to open is stored. I'll have to
   add an interface that Podman could use to allow pasta to open the network
   namespace.

   I can prioritise fixing this, but probably we should fix a. anyway, and with
   a. fixed this is somewhat less important

c. SELinux logs denials for setuid() and setgid() operations which are not
   actually failing in their POSIX sense, and those are quite confusing.
   
   While we could simply drop the gratuitous setuid() and setgid() calls from
   pasta, I would argue that the proper fix is to partially revert (kernel)
   commit 5626d3e86141 ("selinux: remove hooks which simply defer to
   capabilities"), and modify the return value of the (old/new)
   selinux_task_setuid() and selinux_task_setgid() to also factor in the check
   performed by __is_setuid() / __is_setgid().

   I'll send a patch for this unless there are objections.
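
For anyone who wants to check whether issue a. affects their system, the following is a rough Python sketch that pulls the SELinux type out of `ps -Z` style lines and compares it with the expected domain. The helper names `domain_of` and `runs_as_expected` are invented for illustration, and the sample lines are the two crun entries from observation 5., shortened after the command name.

```python
def domain_of(ps_z_line):
    """Return the SELinux type from the first column of a `ps -Z` line.

    Returns None if the first column does not look like a context.
    """
    columns = ps_z_line.split(None, 1)
    if not columns:
        return None
    fields = columns[0].split(":")
    return fields[2] if len(fields) >= 3 else None


def runs_as_expected(ps_z_line, expected="container_runtime_t"):
    """True if the process on this line runs in the expected domain."""
    return domain_of(ps_z_line) == expected


# crun with and without the "unconfined" module loaded (lines shortened).
WITH_MODULE = "unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 sbrivio 861368 0.0  0.0 5668 1972 ? Ss 23:36   0:00 /usr/bin/crun"
WITHOUT_MODULE = "unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 sbrivio 857431 0.0  0.0 5668 1908 ? Ss 23:34   0:00 /usr/bin/crun"

for line in (WITH_MODULE, WITHOUT_MODULE):
    print(domain_of(line), runs_as_expected(line))
```

The same check could be run on lines for slirp4netns or pasta while a container is starting.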

Comment 11 Paul Holzinger 2025-01-06 11:18:49 UTC
@lsm5 @dwalsh Thoughts on a) the container_runtime_t transition issue?

b) of course would be nice to fix eventually so we can run pasta more locked down

c) I don't have enough context for the kernel parts. While calling them unconditionally might be simpler in your code, not doing so would save two syscalls. Given that they are only called once on pasta start-up, this is most likely not noticeable, so it should not matter much.

Comment 12 Daniel Walsh 2025-01-06 20:10:38 UTC
I would advise SELinux-policy to add:


 $ cat mypol.te 

policy_module(mypol, 1.0)

require {
	type passt_t;
	type pasta_t;
}

#============= passt_t ==============
allow passt_t self:cap_userns { setgid setuid };

#============= pasta_t ==============
domain_getattr_all_entry_files(pasta_t)
allow pasta_t self:cap_userns { setgid setuid };
unconfined_list_dirs(pasta_t)

Comment 13 Stefano Brivio 2025-01-07 12:34:44 UTC
(In reply to Daniel Walsh from comment #12)
> I would advise SELinux-policy to add:
> 
>  $ cat mypol.te 
> 
> policy_module(mypol, 1.0)
> 
> require {
> 	type passt_t;
> 	type pasta_t;
> }
> 
> #============= passt_t ==============
> allow passt_t self:cap_userns { setgid setuid };
> 
> #============= pasta_t ==============
> domain_getattr_all_entry_files(pasta_t)
> allow pasta_t self:cap_userns { setgid setuid };
> unconfined_list_dirs(pasta_t)

Hmm, why? Note that passt and pasta ship their own SELinux policies.

As to setgid and setuid, see my point c. in comment #10: I don't think they should be granted.

