Bug 2230462
| Summary: | avc: denied { execmod } for pid=335139 comm="sh" path="/bin/sh" dev="dm-4" ino=138505864 scontext=system_u:system_r:spc_t:s0 tcontext=system_u:object_r:container_ro_file_t:s0 tclass=file permissive=0 | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Jeremy Poulin <jpoulin> | ||||
| Component: | container-selinux | Assignee: | Daniel Walsh <dwalsh> | ||||
| Status: | ASSIGNED --- | QA Contact: | atomic-bugs <atomic-bugs> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 9.2 | CC: | dwalsh, hmiyamot, jnovy, lmcfadde, lsm5, mboddu, mkumatag, mtarsel, sgokul, tsweeney | ||||
| Target Milestone: | rc | Flags: | tsweeney: needinfo? (dwalsh) | ||||
| Target Release: | --- | ||||||
| Hardware: | ppc64le | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | container-selinux-2.219.0-1.rhaos4.13.el9 | Doc Type: | If docs needed, set a value | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | Type: | Bug | |||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
|
||||||
Device labels are:
sh-5.1# ls -Z | grep dm-
system_u:object_r:fixed_disk_device_t:s0 dm-0
system_u:object_r:fixed_disk_device_t:s0 dm-1
system_u:object_r:fixed_disk_device_t:s0 dm-2
system_u:object_r:fixed_disk_device_t:s0 dm-3
system_u:object_r:fixed_disk_device_t:s0 dm-4
sh-5.1#

This issue seems to have been introduced between the RHCOS releases:
"io.openshift.build.versions": "machine-os=414.92.202307261347-0"
"io.openshift.build.versions": "machine-os=414.92.202307270631-0"
The specific updated packages in those builds were:
container-selinux 3:2.208.0-2.rhaos4.13.el9 → 3:2.215.0-2.rhaos4.13.el9
cri-o 1.27.1-3.rhaos4.14.gited2afb7.el9 → 1.27.1-4.rhaos4.14.gitab7845e.el9
So I suspect this issue is in: 3:2.215.0-2.rhaos4.13.el9

This is allowed in container-selinux 2.219:
$ audit2allow -i /tmp/t
#============= spc_t ==============
#!!!! This avc is allowed in the current policy
allow spc_t container_ro_file_t:file execmod;
apiv2 (mounts) $ rpm -q container-selinux
container-selinux-2.219.0-1.fc38.noarch

I came up with a minimal pod spec to reproduce this bug:
1. oc create namespace debug-selinux
2. oc adm policy add-scc-to-group privileged system:serviceaccounts:debug-selinux
3. oc adm policy add-scc-to-group anyuid system:serviceaccounts:debug-selinux
4. oc adm policy add-scc-to-group hostmount-anyuid system:serviceaccounts:debug-selinux
5. Apply the following YAML:
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-pod
  namespace: debug-selinux
spec:
  # Pin the pod to a specific node so that it is easy to debug.
  nodeName: rdr-cicd-mon01-414-6mk65-worker-c5vk4
  containers:
  - command:
    - /bin/sh
    - -c
    - 'while true ; do sleep 2; done '
    image: quay.io/openshift/community-e2e-images:e2e-7-registry-k8s-io-e2e-test-images-busybox-1-29-4-4zE9mRvED4RQoUxQ
    imagePullPolicy: IfNotPresent
    name: hostpath-pod
    securityContext:
      privileged: true
  restartPolicy: Always
Pod logs:
% oc logs hostpath-pod -n debug-selinux
/bin/sh: error while loading shared libraries: cannot restore segment prot after reloc: Permission denied
Audit logs:
$ ausearch -m AVC,USER_AVC,SELINUX_ERR,USER_SELINUX_ERR -ts recent -i
----
type=PROCTITLE msg=audit(08/10/23 10:36:01.111:2909) : proctitle=/bin/sh -c while true ; do sleep 2; done
type=SYSCALL msg=audit(08/10/23 10:36:01.111:2909) : arch=ppc64le syscall=mprotect success=no exit=EACCES(Permission denied) a0=0x13c060000 a1=0x130000 a2=PROT_READ|PROT_EXEC a3=0x113250 items=0 ppid=1125323 pid=1125335 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=sh exe=/bin/sh subj=system_u:system_r:spc_t:s0 key=(null)
type=AVC msg=audit(08/10/23 10:36:01.111:2909) : avc: denied { execmod } for pid=1125335 comm=sh path=/bin/sh dev="dm-4" ino=138505864 scontext=system_u:system_r:spc_t:s0 tcontext=system_u:object_r:container_ro_file_t:s0 tclass=file permissive=0
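
For anyone triaging similar records, here is a minimal sketch of how the AVC line above maps onto the `allow` rule that audit2allow printed earlier in this bug. It is not audit2allow's implementation; `parse_avc`, `selinux_type` and `allow_rule` are hypothetical helper names, and the parser only assumes the `key=value` layout visible in the record above.

```python
import re

# Matches the "avc: denied { perms } for key=value ..." part of an audit record.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}\s+for\s+(?P<rest>.*)"
)


def parse_avc(line):
    """Return the fields of one AVC denial as a dict, or None if not a denial."""
    m = AVC_RE.search(line)
    if m is None:
        return None
    fields = {"perms": m.group("perms").split()}
    # Values are either quoted ("dm-4") or bare tokens (file, 0, a context).
    for key, value in re.findall(r'(\w+)=("[^"]*"|\S+)', m.group("rest")):
        fields[key] = value.strip('"')
    return fields


def selinux_type(context):
    """Extract the type from a user:role:type:level security context."""
    return context.split(":")[2]


def allow_rule(fields):
    """Build the allow rule that would permit the denied access."""
    perms = fields["perms"]
    perm_text = perms[0] if len(perms) == 1 else "{ " + " ".join(perms) + " }"
    return "allow {} {}:{} {};".format(
        selinux_type(fields["scontext"]),
        selinux_type(fields["tcontext"]),
        fields["tclass"],
        perm_text,
    )


record = (
    'type=AVC msg=audit(08/10/23 10:36:01.111:2909) : avc: denied { execmod } '
    'for pid=1125335 comm=sh path=/bin/sh dev="dm-4" ino=138505864 '
    'scontext=system_u:system_r:spc_t:s0 '
    'tcontext=system_u:object_r:container_ro_file_t:s0 tclass=file permissive=0'
)
print(allow_rule(parse_avc(record)))
```

For this record the sketch yields the same rule text as the audit2allow output quoted above, which makes it easy to see that the denial is `spc_t` requesting `execmod` on a `container_ro_file_t` file.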
Observations:
- The same pod works fine in the x86 environment.
- The same pod works fine when using the quay.io/centos/centos:stream9 image.
Dan, can you please take a look at comment #5?

After discussion with Jeremy, I'm going to update container-selinux to 2.219.0, as Dan mentions.

OCP 4.13 and up is now updated to container-selinux-2.219.0.

Tested with the patched rpm on the affected system. The fix didn't work and we still see the issue:
----
type=PROCTITLE msg=audit(08/11/23 00:17:42.518:538) : proctitle=/bin/sh -c while true ; do sleep 2; done
type=SYSCALL msg=audit(08/11/23 00:17:42.518:538) : arch=ppc64le syscall=mprotect success=no exit=EACCES(Permission denied) a0=0x11ce10000 a1=0x130000 a2=PROT_READ|PROT_EXEC a3=0x113250 items=0 ppid=8942 pid=8954 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=sh exe=/bin/sh subj=system_u:system_r:spc_t:s0 key=(null)
type=AVC msg=audit(08/11/23 00:17:42.518:538) : avc: denied { execmod } for pid=8954 comm=sh path=/bin/sh dev="dm-4" ino=138505864 scontext=system_u:system_r:spc_t:s0 tcontext=system_u:object_r:container_ro_file_t:s0 tclass=file permissive=0
sh-5.1# date -u
Fri Aug 11 00:18:45 UTC 2023
sh-5.1# rpm -qa | grep container-selinux
container-selinux-2.219.0-1.rhaos4.13.el9.noarch
sh-5.1# rpm-ostree status
State: idle
Deployments:
* ostree-unverified-registry:quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a0f238c723c13d13f48231421e42b5063c92692e588a2f381572ffa33aca8d9c
Digest: sha256:a0f238c723c13d13f48231421e42b5063c92692e588a2f381572ffa33aca8d9c
Version: 414.92.202308080233-0 (2023-08-09T04:31:17Z)
LocalOverrides: container-selinux 3:2.215.0-2.rhaos4.13.el9 -> 3:2.219.0-1.rhaos4.13.el9
sh-5.1# rpm -qi container-selinux-2.219.0-1.rhaos4.13.el9.noarch
Name : container-selinux
Epoch : 3
Version : 2.219.0
Release : 1.rhaos4.13.el9
Architecture: noarch
Install Date: Fri Aug 11 00:08:38 2023
Group : Unspecified
Size : 68308
License : GPLv2
Signature : (none)
Source RPM : container-selinux-2.219.0-1.rhaos4.13.el9.src.rpm
Build Date : Thu Aug 10 16:41:23 2023
Build Host : x86-64-02.build.eng.rdu2.redhat.com
Packager : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
Vendor : Red Hat, Inc.
URL : https://github.com/containers/container-selinux
Summary : SELinux policies for container runtimes
Description :
SELinux policy modules for use with container runtimes.
sh-5.1#
I installed the new container-selinux rpm on all of the worker nodes:
sh-5.1# rpm -qa | grep container-selin
container-selinux-2.221.0-1.el9.noarch
sh-5.1# rpm-ostree status
State: idle
Deployments:
* ostree-unverified-registry:quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a0f238c723c13d13f48231421e42b5063c92692e588a2f381572ffa33aca8d9c
Digest: sha256:a0f238c723c13d13f48231421e42b5063c92692e588a2f381572ffa33aca8d9c
Version: 414.92.202308080233-0 (2023-08-09T04:31:13Z)
LocalOverrides: container-selinux 3:2.215.0-2.rhaos4.13.el9 -> 3:2.221.0-1.el9
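
When checking several worker nodes, a small script can flag which ones still carry an older policy package. The sketch below parses an `rpm -qa` style line like the ones shown in this bug and compares the upstream version against 2.221.0. That threshold is an assumption taken from the comments here (2.219.0 still failed on the affected ppc64le node, 2.221.0 passed). The helper names are hypothetical, and the comparison ignores epoch and release, so it is not a substitute for rpm's own version comparison.

```python
import re

# Assumption: 2.221.0 is the first version reported to pass in this bug.
KNOWN_GOOD = (2, 221, 0)


def package_version(nvr_line, name="container-selinux"):
    """Extract the upstream version as a tuple of ints from an rpm -qa line."""
    m = re.match(re.escape(name) + r"-(\d+(?:\.\d+)*)-", nvr_line.strip())
    if m is None:
        return None
    return tuple(int(part) for part in m.group(1).split("."))


def has_fix(nvr_line):
    """True if the line names container-selinux at KNOWN_GOOD or newer."""
    version = package_version(nvr_line)
    return version is not None and version >= KNOWN_GOOD


for line in (
    "container-selinux-2.219.0-1.rhaos4.13.el9.noarch",
    "container-selinux-2.221.0-1.el9.noarch",
):
    print(line, "->", "ok" if has_fix(line) else "too old")
```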
Hiro executed the single e2e test and did not observe any errors from the command line. I went through the journal logs and did not see any "avc: denied" or related SELinux errors after the single e2e test run.
This cluster is still available with the newer container-selinux rpm installed for investigation.
Created attachment 1983452
openshift-test run-test output for previously failing testcase
This ran *after* all worker nodes were upgraded with `container-selinux-2.221.0-1.el9.noarch.rpm`.
Please see comments 13 and 14 as verification for the newest patch. |
Description of problem:
Since the 27th of July (for OpenShift 4.14), versions of OpenShift running on an RHCOS based on RHEL 9.2 have been hitting this permission denied error when scheduling a privileged container.

Version-Release number of selected component (if applicable):

How reproducible:
1. Deploy a nightly of OpenShift 4.12, 4.13, or 4.14.
2. Run an openshift-test that schedules a privileged container. E.g.:
$ KUBE_TEST_REPO_LIST="" KUBE_TEST_REPO="quay.io/openshift/community-e2e-images" ./openshift-tests run-test '[sig-storage] In-tree Volumes [Driver: hostPath] [Testpattern: Inline-volume (default fs)] volumes should store data [Suite:openshift/conformance/parallel] [Suite:k8s]'
3. Monitor the journal logs on the worker for SELinux errors.

Actual results:
type=AVC msg=audit(1691587536.149:914): avc: denied { execmod } for pid=335139 comm="sh" path="/bin/sh" dev="dm-4" ino=138505864 scontext=system_u:system_r:spc_t:s0 tcontext=system_u:object_r:container_ro_file_t:s0 tclass=file permissive=0

Expected results:
No AVC errors.
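
Step 3 can be partly automated. As a rough sketch, the function below scans journal text for raw AVC denial records of the form shown under "Actual results" and converts the `audit(<epoch>.<millis>:<serial>)` stamp to UTC so it can be lined up with the test run. `find_denials` is a hypothetical helper, and it assumes the raw (uninterpreted) timestamp format, not the `ausearch -i` format.

```python
import re
from datetime import datetime, timezone

# Raw audit stamp: msg=audit(<epoch seconds>.<milliseconds>:<serial>)
AUDIT_STAMP = re.compile(r"msg=audit\((\d+)\.(\d+):(\d+)\)")


def find_denials(lines):
    """Yield (utc_time, serial, line) for each raw AVC denial record."""
    for line in lines:
        if "avc:" not in line or "denied" not in line:
            continue
        m = AUDIT_STAMP.search(line)
        if m is None:
            continue  # e.g. interpreted (ausearch -i) timestamps
        seconds, millis, serial = (int(g) for g in m.groups())
        when = datetime.fromtimestamp(seconds, tz=timezone.utc)
        yield when.replace(microsecond=millis * 1000), serial, line


journal = [
    "kubelet: some unrelated message",
    'type=AVC msg=audit(1691587536.149:914): avc: denied { execmod } for '
    'pid=335139 comm="sh" path="/bin/sh" dev="dm-4" ino=138505864 '
    'scontext=system_u:system_r:spc_t:s0 '
    'tcontext=system_u:object_r:container_ro_file_t:s0 tclass=file permissive=0',
]
for when, serial, _ in find_denials(journal):
    print(when.isoformat(), serial)  # 2023-08-09T13:25:36.149000+00:00 914
```

In practice the input lines could come from saved journal output. An empty result after a test run corresponds to the "No AVC errors" expectation above.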