Bug 2230462

Summary: avc: denied { execmod } for pid=335139 comm="sh" path="/bin/sh" dev="dm-4" ino=138505864 scontext=system_u:system_r:spc_t:s0 tcontext=system_u:object_r:container_ro_file_t:s0 tclass=file permissive=0
Product: Red Hat Enterprise Linux 9
Reporter: Jeremy Poulin <jpoulin>
Component: container-selinux
Assignee: Daniel Walsh <dwalsh>
Status: ASSIGNED ---
QA Contact: atomic-bugs <atomic-bugs>
Severity: high
Docs Contact:
Priority: unspecified
Version: 9.2
CC: dwalsh, hmiyamot, jnovy, lmcfadde, lsm5, mboddu, mkumatag, mtarsel, sgokul, tsweeney
Target Milestone: rc
Flags: tsweeney: needinfo? (dwalsh)
Target Release: ---
Hardware: ppc64le
OS: Linux
Whiteboard:
Fixed In Version: container-selinux-2.219.0-1.rhaos4.13.el9
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
  Description: openshift-test run-test output for previously failing testcase
  Flags: none

Description Jeremy Poulin 2023-08-09 13:57:42 UTC
Description of problem:
Since July 27 (first seen on OpenShift 4.14), OpenShift versions running on an RHCOS based on RHEL 9.2 have been hitting this permission-denied error when scheduling a privileged container.

Version-Release number of selected component (if applicable):


How reproducible:
1. Deploy a nightly of OpenShift 4.12, 4.13, or 4.14.
2. Run an openshift-test that schedules a privileged container. E.g.:
$ KUBE_TEST_REPO_LIST="" KUBE_TEST_REPO="quay.io/openshift/community-e2e-images" ./openshift-tests run-test '[sig-storage] In-tree Volumes [Driver: hostPath] [Testpattern: Inline-volume (default fs)] volumes should store data [Suite:openshift/conformance/parallel] [Suite:k8s]'
3. Monitor the journal logs on the worker for SELinux errors.
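
Step 3 can be narrowed to AVC denials only. The following is a minimal sketch, assuming audit records are forwarded to the journal on the worker; `avc_denials` is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: keep only SELinux AVC denial records read from stdin.
# The audit subsystem writes "avc:  denied" with a variable number of spaces,
# so the pattern allows one or more.
avc_denials() {
    grep -E 'avc: +denied'
}

# Usage on a worker node (assumption: audit records reach the journal):
#   journalctl --no-pager --since "10 minutes ago" | avc_denials
```

If the journal does not carry audit records on a given node, the same filter can be fed from the audit log instead.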


Actual results:
type=AVC msg=audit(1691587536.149:914): avc:  denied  { execmod } for  pid=335139 comm="sh" path="/bin/sh" dev="dm-4" ino=138505864 scontext=system_u:system_r:spc_t:s0 tcontext=system_u:object_r:container_ro_file_t:s0 tclass=file permissive=0


Expected results:
No avc errors.

Comment 1 Jeremy Poulin 2023-08-09 14:05:33 UTC
Device labels are:

sh-5.1# ls -Z | grep dm-
  system_u:object_r:fixed_disk_device_t:s0 dm-0
  system_u:object_r:fixed_disk_device_t:s0 dm-1
  system_u:object_r:fixed_disk_device_t:s0 dm-2
  system_u:object_r:fixed_disk_device_t:s0 dm-3
  system_u:object_r:fixed_disk_device_t:s0 dm-4
sh-5.1#

Comment 2 Jeremy Poulin 2023-08-09 17:45:15 UTC
This issue seems to have been introduced between the RHCOS releases:
"io.openshift.build.versions": "machine-os=414.92.202307261347-0"
"io.openshift.build.versions": "machine-os=414.92.202307270631-0"

Comment 3 Jeremy Poulin 2023-08-09 18:14:19 UTC
The specific packages updated between those builds were:
container-selinux	3:2.208.0-2.rhaos4.13.el9 → 3:2.215.0-2.rhaos4.13.el9
cri-o	1.27.1-3.rhaos4.14.gited2afb7.el9 → 1.27.1-4.rhaos4.14.gitab7845e.el9


So I suspect this issue was introduced in container-selinux 3:2.215.0-2.rhaos4.13.el9.

Comment 4 Daniel Walsh 2023-08-09 20:01:19 UTC
This is allowed in container-selinux 2.219

$ audit2allow  -i /tmp/t


#============= spc_t ==============

#!!!! This avc is allowed in the current policy
allow spc_t container_ro_file_t:file execmod;
apiv2 (mounts) $ rpm -q container-selinux
container-selinux-2.219.0-1.fc38.noarch
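
The rule printed by audit2allow above can also be read straight off the denial record: source type, target type, object class, and permission. The following is a minimal sketch, assuming a single-permission AVC record on stdin; `avc_to_allow` is a hypothetical helper, not part of any SELinux tool, and unlike audit2allow it does not consult the loaded policy.

```shell
# Hypothetical helper: turn one single-permission AVC denial record (stdin)
# into the corresponding allow rule. Records with several permissions inside
# the braces are not matched and produce no output.
avc_to_allow() {
    sed -nE 's/.*denied +\{ ([a-z_]+) \}.*scontext=[^:]+:[^:]+:([^: ]+):.*tcontext=[^:]+:[^:]+:([^: ]+):.*tclass=([a-z_0-9]+).*/allow \2 \3:\4 \1;/p'
}

# Sample record copied from this bug's description.
printf '%s\n' 'type=AVC msg=audit(1691587536.149:914): avc:  denied  { execmod } for  pid=335139 comm="sh" path="/bin/sh" dev="dm-4" ino=138505864 scontext=system_u:system_r:spc_t:s0 tcontext=system_u:object_r:container_ro_file_t:s0 tclass=file permissive=0' | avc_to_allow
```

For the record in this bug, the sketch should yield the same allow rule that audit2allow reports.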

Comment 5 mkumatag 2023-08-10 10:39:30 UTC
I came up with a minimal pod spec to reproduce this bug:

1. oc create namespace debug-selinux
2. oc adm policy add-scc-to-group privileged system:serviceaccounts:debug-selinux
3. oc adm policy add-scc-to-group anyuid system:serviceaccounts:debug-selinux
4. oc adm policy add-scc-to-group hostmount-anyuid system:serviceaccounts:debug-selinux
5. Apply the following YAML:

apiVersion: v1
kind: Pod
metadata:
  name: hostpath-pod
  namespace: debug-selinux
spec:
  # Pin the pod to a specific node to make debugging easier.
  nodeName: rdr-cicd-mon01-414-6mk65-worker-c5vk4
  containers:
  - command:
    - /bin/sh
    - -c
    - 'while true ; do sleep 2; done '
    image: quay.io/openshift/community-e2e-images:e2e-7-registry-k8s-io-e2e-test-images-busybox-1-29-4-4zE9mRvED4RQoUxQ
    imagePullPolicy: IfNotPresent
    name: hostpath-pod
    securityContext:
      privileged: true
  restartPolicy: Always


Pod logs:

% oc logs hostpath-pod -n debug-selinux
/bin/sh: error while loading shared libraries: cannot restore segment prot after reloc: Permission denied


Audit logs:

$ ausearch -m AVC,USER_AVC,SELINUX_ERR,USER_SELINUX_ERR -ts recent -i
----
type=PROCTITLE msg=audit(08/10/23 10:36:01.111:2909) : proctitle=/bin/sh -c while true ; do sleep 2; done
type=SYSCALL msg=audit(08/10/23 10:36:01.111:2909) : arch=ppc64le syscall=mprotect success=no exit=EACCES(Permission denied) a0=0x13c060000 a1=0x130000 a2=PROT_READ|PROT_EXEC a3=0x113250 items=0 ppid=1125323 pid=1125335 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=sh exe=/bin/sh subj=system_u:system_r:spc_t:s0 key=(null)
type=AVC msg=audit(08/10/23 10:36:01.111:2909) : avc:  denied  { execmod } for  pid=1125335 comm=sh path=/bin/sh dev="dm-4" ino=138505864 scontext=system_u:system_r:spc_t:s0 tcontext=system_u:object_r:container_ro_file_t:s0 tclass=file permissive=0


Observations:
- The same pod works fine in the x86 environment.
- The same pod works fine when using the quay.io/centos/centos:stream9 image.

Comment 6 Jindrich Novy 2023-08-10 16:15:21 UTC
Dan, can you please take a look at comment #5?

Comment 7 Jindrich Novy 2023-08-10 16:24:37 UTC
After discussion with Jeremy I'm going to update container-selinux to 2.219.0 as Dan mentions.

Comment 8 Jindrich Novy 2023-08-10 16:45:48 UTC
OCP 4.13 and up is now updated to container-selinux-2.219.0.

Comment 9 mkumatag 2023-08-11 00:29:40 UTC
Tested with the patched RPM on the affected system; the fix didn't work and I still see the issue:

----
type=PROCTITLE msg=audit(08/11/23 00:17:42.518:538) : proctitle=/bin/sh -c while true ; do sleep 2; done
type=SYSCALL msg=audit(08/11/23 00:17:42.518:538) : arch=ppc64le syscall=mprotect success=no exit=EACCES(Permission denied) a0=0x11ce10000 a1=0x130000 a2=PROT_READ|PROT_EXEC a3=0x113250 items=0 ppid=8942 pid=8954 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=sh exe=/bin/sh subj=system_u:system_r:spc_t:s0 key=(null)
type=AVC msg=audit(08/11/23 00:17:42.518:538) : avc:  denied  { execmod } for  pid=8954 comm=sh path=/bin/sh dev="dm-4" ino=138505864 scontext=system_u:system_r:spc_t:s0 tcontext=system_u:object_r:container_ro_file_t:s0 tclass=file permissive=0
sh-5.1# date -u
Fri Aug 11 00:18:45 UTC 2023


sh-5.1# rpm -qa | grep container-selinux
container-selinux-2.219.0-1.rhaos4.13.el9.noarch


sh-5.1# rpm-ostree status
State: idle
Deployments:
* ostree-unverified-registry:quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a0f238c723c13d13f48231421e42b5063c92692e588a2f381572ffa33aca8d9c
                   Digest: sha256:a0f238c723c13d13f48231421e42b5063c92692e588a2f381572ffa33aca8d9c
                  Version: 414.92.202308080233-0 (2023-08-09T04:31:17Z)
           LocalOverrides: container-selinux 3:2.215.0-2.rhaos4.13.el9 -> 3:2.219.0-1.rhaos4.13.el9

sh-5.1# rpm -qi container-selinux-2.219.0-1.rhaos4.13.el9.noarch
Name        : container-selinux
Epoch       : 3
Version     : 2.219.0
Release     : 1.rhaos4.13.el9
Architecture: noarch
Install Date: Fri Aug 11 00:08:38 2023
Group       : Unspecified
Size        : 68308
License     : GPLv2
Signature   : (none)
Source RPM  : container-selinux-2.219.0-1.rhaos4.13.el9.src.rpm
Build Date  : Thu Aug 10 16:41:23 2023
Build Host  : x86-64-02.build.eng.rdu2.redhat.com
Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
Vendor      : Red Hat, Inc.
URL         : https://github.com/containers/container-selinux
Summary     : SELinux policies for container runtimes
Description :
SELinux policy modules for use with container runtimes.
sh-5.1#

Comment 13 Mick Tarsel 2023-08-15 21:01:19 UTC
I installed the new container-selinux rpm on all of the worker nodes:

sh-5.1# rpm -qa | grep container-selin
container-selinux-2.221.0-1.el9.noarch

sh-5.1#  rpm-ostree status
State: idle
Deployments:
* ostree-unverified-registry:quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a0f238c723c13d13f48231421e42b5063c92692e588a2f381572ffa33aca8d9c
                   Digest: sha256:a0f238c723c13d13f48231421e42b5063c92692e588a2f381572ffa33aca8d9c
                  Version: 414.92.202308080233-0 (2023-08-09T04:31:13Z)
           LocalOverrides: container-selinux 3:2.215.0-2.rhaos4.13.el9 -> 3:2.221.0-1.el9

Hiro executed the single e2e test and did not observe any errors on the command line. I went through the journal logs and did not see any "avc: denied" or related SELinux errors after the single e2e test run.

This cluster is still available with the newer container-selinux rpm installed for investigation.
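
Comments 9 and 13 turn on which container-selinux build is installed on the node: 2.219.0 still produced the denial, while 2.221.0 did not. The following is a minimal sketch of a version gate, assuming a `sort` that supports `-V` (GNU coreutils); `version_ge` is a hypothetical helper, and the 2.221.0 threshold is simply the version tested in this comment, not a confirmed minimum.

```shell
# Hypothetical helper: succeed when version $1 is greater than or equal to $2.
# Relies on `sort -V` (version sort) placing the smaller version first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage on a node (assumption: 2.221.0 is the version to require):
#   installed=$(rpm -q --qf '%{VERSION}' container-selinux)
#   version_ge "$installed" 2.221.0 || echo "container-selinux $installed predates the tested fix"
```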

Comment 14 Hiro Miyamoto 2023-08-15 21:18:08 UTC
Created attachment 1983452 [details]
openshift-test run-test output for previously failing testcase

This ran *after* all worker nodes were upgraded with `container-selinux-2.221.0-1.el9.noarch.rpm`.

Comment 15 Jeremy Poulin 2023-08-16 14:24:51 UTC
Please see comments 13 and 14 for verification of the newest patch.