Bug 1839065
Summary: | Privileged containers should be unconfined_t
---|---
Product: | Red Hat Enterprise Linux 8
Reporter: | Olimp Bockowski <obockows>
Component: | container-selinux
Assignee: | Jindrich Novy <jnovy>
Status: | CLOSED ERRATA
QA Contact: | atomic-bugs <atomic-bugs>
Severity: | medium
Priority: | low
Version: | 8.3
CC: | bbaude, bbreard, cglombek, dornelas, dwalsh, imcleod, jligon, jnovy, kanderso, kchamart, lsm5, mheon, nijoshi, nstielau, travier, tsweeney, walters, ypu
Target Milestone: | rc
Target Release: | 8.0
Hardware: | All
OS: | All
Fixed In Version: | container-selinux-2.135.0-1.el8
Last Closed: | 2020-07-21 15:32:03 UTC
Type: | Bug
Bug Blocks: | 1186913, 1793607
Description
Olimp Bockowski
2020-05-22 12:28:13 UTC
Created attachment 1691053: strace with timeout
The D-Bus daemon fails to forward a reply message from systemd to hostnamectl, which would explain why hostnamectl times out.

Setting as low priority as this doesn't appear to impede functionality of the cluster.

Is this really expected to work? Either way it seems like an SELinux issue, unrelated to RHCOS. I can reproduce it on RHEL 8:

    # podman run --rm -ti --privileged -v /:/host registry.access.redhat.com/ubi8
    [root@70b2ffa3f3a7 /]# chroot /host/
    sh-4.4# time hostnamectl
    Failed to query system properties: Connection timed out

    real 0m25.021s
    user 0m0.004s
    sys 0m0.006s

Yes, it's a base podman/container-selinux problem.

I basically don't understand the rationale behind spc_t. Anything that's spc_t can easily "escape" to unconfined_t by e.g. `systemd-run`, or destroy the system via rm -rf /host or whatever. We just keep hitting random "unconfined_t can do it but spc_t can't" problems that impede us managing and debugging the operating system via containers.

I tried this but something seems to be denying it:

```
$ podman run --privileged --security-opt=label=type:unconfined_t --rm -ti registry.access.redhat.com/ubi8/ubi:latest
{"msg":"exec container process `/bin/bash`: Permission denied","level":"error","time":"2020-05-29T17:03:20.000206344Z"}
```

Got a bit farther with this:

```
podman run --privileged --security-opt=label=user:unconfined_u,role:unconfined_r,type:unconfined_t --rm -ti registry.access.redhat.com/ubi8/ubi:latest
Error: failed to mount shm tmpfs "/var/lib/containers/storage/overlay-containers/b4fa6193d72291cf30de73d639c8ecaa6af5367ad3d359bb69e0aa72fc253dc3/userdata/shm": invalid argument
```

And the real problem there seems to be:

```
[251727.698188] SELinux: security_context_str_to_sid(unconfined_u,role:unconfined_r,type:unconfined_t:object_r:container_file_t:s0:c32,c492) failed for (dev tmpfs, type tmpfs) errno=-22
```

So I think the RFE here is: add an option to the container stack that lets us be unconfined_t - or stated more generally, have a container image act as closely as possible as if it's run from a root shell. (Whether it should be unconfined_t or init_t is a bit debatable; we might need both.)

Another case, for example, is https://github.com/openshift/cluster-network-operator/pull/477 where we were actively fighting spc_t because we want a container image to manage openvswitch on the host, and it turned into a mess of spc_t and openvswitch_t, whereas it's completely fine to manage openvswitch via the usual unconfined_t.

Fixed in https://github.com/containers/container-selinux/releases/tag/v2.135.0

This allows you to specify

    # podman run -ti --security-opt label=type:unconfined_t fedora cat /proc/self/attr/current
    Trying to pull registry.fedoraproject.org/fedora...
    Getting image source signatures
    Copying blob 1657ffead824 [--------------------------------------] 0.0b / 0.0b
    Copying config eb7134a03c done
    Writing manifest to image destination
    Storing signatures
    system_u:system_r:unconfined_t:s0:c290,c962

Or to define it in the kubernetes yaml.

Awesome, thank you! Can we get this backported to 8.3?

Assigning to Jindrich for packaging needs. Jindrich, can you also answer Colin's question in https://bugzilla.redhat.com/show_bug.cgi?id=1839065#c8 please?

Yes, we should be able to get this into 8.3. I have no problem with shipping it in 8.2.1 as well.

(In reply to Colin Walters from comment #8)
> Awesome, thank you! Can we get this backported to 8.3?

Yes, it's now in 8.3.0 as well. Can we get QA ack on this one please?

Dan, can you please answer the three questions in comment #13? It is now required for 8.2.1 exception+. Thanks.

I think we could probably live with having this just in 8.3.

Tested with podman-1.9.3-1.module+el8.2.1+6750+e53a300c.x86_64 and container-selinux-2.135.0-1.module+el8.2.1+6849+893e4f4a.noarch. With --security-opt label=type:unconfined_t it now works as expected, so setting this to verified.
Details:

    # podman run --rm -ti --privileged --security-opt label=type:unconfined_t -v /:/host registry.access.redhat.com/ubi8
    [root@83ef1574f3c2 /]# chroot /host
    sh-4.4# time hostnamectl
    Static hostname: ibm-x3250m6-05.lab.eng.pek2.redhat.com
    Icon name: computer-server
    Chassis: server
    Machine ID: 28183d0c95ce4e2da2c950a11992c7e1
    Boot ID: e2b076d01d2c49fb946142f3a9aab571
    Operating System: Red Hat Enterprise Linux 8.2 (Ootpa)
    CPE OS Name: cpe:/o:redhat:enterprise_linux:8.2:GA
    Kernel: Linux 4.18.0-193.1.2.el8_2.x86_64
    Architecture: x86-64

    real 0m0.270s
    user 0m0.005s
    sys 0m0.000s
    sh-4.4# exit
    exit
    [root@83ef1574f3c2 /]# exit
    exit

In the meantime, we're hitting issues like this:

    type=AVC msg=audit(1594130128.905:155): avc: denied { write } for pid=523559 comm="test2" name="yum.repos.d" dev="dm-0" ino=220215514 scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:system_conf_t:s0 tclass=dir permissive=0

Here, in order to "escape" spc_t, we're using systemd-run to run code as init_t, but that domain is itself denied some operations.

Ultimately we need a fully privileged domain capable of *all* systems management - that is unconfined_t and not e.g. init_t, right? Compare with how e.g. Ansible works by logging in over ssh: that code runs as unconfined_t, which means it behaves differently from systems management done via a systemd unit or a container.

If we're agreed to use unconfined_t for this, let's be sure that we have well-defined mechanisms to transition to that domain from the other cases (e.g. a systemd unit).

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:3053

*** Bug 1896369 has been marked as a duplicate of this bug. ***

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.
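
For anyone checking which domain a process actually ended up in (spc_t, init_t or unconfined_t in the comments above), the type is the third colon-separated field of the context that `cat /proc/self/attr/current` prints. Below is a minimal shell sketch for pulling that field out of a context string. The function name `selinux_type` is made up for illustration and is not part of podman or container-selinux, and the sketch assumes the usual `user:role:type:level` layout.

```shell
# Print the type field of an SELinux context given as "user:role:type[:level]".
# The level can itself contain a colon (e.g. "s0:c290,c962"), so the prefix is
# stripped from the left and everything after the type is cut off afterwards.
# NOTE: selinux_type is a hypothetical helper name, not an existing tool.
selinux_type() {
    rest=${1#*:}                  # drop "user:"
    rest=${rest#*:}               # drop "role:"
    printf '%s\n' "${rest%%:*}"   # keep everything up to the next colon
}

# Context reported inside the container in this bug:
selinux_type 'system_u:system_r:unconfined_t:s0:c290,c962'   # prints unconfined_t

# Source context from the AVC denial quoted in this bug:
selinux_type 'system_u:system_r:init_t:s0'                   # prints init_t
```

On a host with SELinux enabled, something like `selinux_type "$(tr -d '\0' < /proc/self/attr/current)"` should report the domain of the current shell. The `tr` call is there on the assumption that the kernel may terminate the string with a NUL byte.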