Bug 1973418

Summary: kubelet service fails to load EnvironmentFile due to SELinux denial (Re-opened)
Product: Red Hat Enterprise Linux 8
Component: container-selinux
Version: 8.4
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: unspecified
Target Milestone: beta
Target Release: ---
Status: CLOSED ERRATA
Reporter: Michael Nguyen <mnguyen>
Assignee: Jindrich Novy <jnovy>
QA Contact: Edward Shen <weshen>
Docs Contact:
CC: Alexandros.Phinikarides, dornelas, dwalsh, jnovy, lvrabec, miabbott, mmalik, ssekidde, tsweeney, ypu
Keywords: ZStream
Flags: pm-rhel: mirror+
Whiteboard:
Fixed In Version: container-selinux-2.165.1-2.el8
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Clones: 1999245, 2005018 (view as bug list)
Environment:
Last Closed: 2021-11-09 17:38:44 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1186913, 1969998, 1999245, 2005018

Description Michael Nguyen 2021-06-17 19:06:05 UTC
Description of problem:
This effectively re-opens BZ1960769 (already closed); its fix does not appear to have resolved the problem in RHCOS.


Version-Release number of selected component (if applicable):
container-selinux-2.162.0-1.module+el8.4.0+11311+9da8acfb.noarch

How reproducible:
Always

Steps to Reproduce:
1. In RHCOS, create an environment file in /etc/kubernetes
2. Reference that file via EnvironmentFile= in the kubelet.service unit
3. Run `systemctl daemon-reload`
4. Restart the kubelet
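
The file from step 1 is just a plain KEY=VALUE env file. A local sketch of what it contains and how it expands (using a temp dir in place of /etc/kubernetes, since steps 3-4 need a live systemd):

```shell
# Create the env file as in step 1, but in a temp dir for illustration.
dir=$(mktemp -d)
printf 'TEST=TEST\n' > "$dir/test-env"

# EnvironmentFile= consumes simple KEY=VALUE lines; sourcing the file
# here approximates the ${TEST} expansion systemd performs in ExecStart.
set -a
. "$dir/test-env"
set +a
echo "TEST=$TEST"

rm -rf "$dir"
```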

Actual results:
AVC denial when systemd reads the EnvironmentFile.

Expected results:
No denials; the service restarts without issue.

Additional info:
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-06-16-190035   True        False         3h46m   Cluster version is 4.8.0-0.nightly-2021-06-16-190035

$ oc get nodes
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-134-236.us-west-2.compute.internal   Ready    master   4h12m   v1.21.0-rc.0+120883f
ip-10-0-150-206.us-west-2.compute.internal   Ready    worker   4h7m    v1.21.0-rc.0+120883f
ip-10-0-164-27.us-west-2.compute.internal    Ready    master   4h12m   v1.21.0-rc.0+120883f
ip-10-0-183-87.us-west-2.compute.internal    Ready    worker   4h5m    v1.21.0-rc.0+120883f
ip-10-0-210-154.us-west-2.compute.internal   Ready    master   4h13m   v1.21.0-rc.0+120883f
ip-10-0-222-34.us-west-2.compute.internal    Ready    worker   4h6m    v1.21.0-rc.0+120883f

$ oc debug node/ip-10-0-183-87.us-west-2.compute.internal 
Starting pod/ip-10-0-183-87us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# cd /etc/kubernetes/
sh-4.4# ls
ca.crt	    cni		kubelet-ca.crt	 kubelet.conf  static-pod-resources
cloud.conf  kubeconfig	kubelet-plugins  manifests
sh-4.4# echo TEST=TEST > test-env
sh-4.4# ls -laZ
total 40
drwxr-xr-x.  6 root root system_u:object_r:kubernetes_file_t:s0  193 Jun 17 17:59 .
drwxr-xr-x. 96 root root system_u:object_r:etc_t:s0             8192 Jun 17 17:25 ..
-rw-r--r--.  1 root root system_u:object_r:kubernetes_file_t:s0 1123 Jun 17 17:25 ca.crt
-rw-r--r--.  1 root root system_u:object_r:kubernetes_file_t:s0    0 Jun 17 17:25 cloud.conf
drwxr-xr-x.  3 root root system_u:object_r:kubernetes_file_t:s0   19 Jun 17 13:48 cni
-rw-r--r--.  1 root root system_u:object_r:kubernetes_file_t:s0 6050 Jun 17 13:46 kubeconfig
-rw-r--r--.  1 root root system_u:object_r:kubernetes_file_t:s0 5875 Jun 17 17:25 kubelet-ca.crt
drwxr-xr-x.  3 root root system_u:object_r:kubernetes_file_t:s0   20 Jun 17 13:48 kubelet-plugins
-rw-r--r--.  1 root root system_u:object_r:kubernetes_file_t:s0 1076 Jun 17 17:25 kubelet.conf
drwxr-xr-x.  2 root root system_u:object_r:kubernetes_file_t:s0    6 Jun 17 13:49 manifests
drwxr-xr-x.  3 root root system_u:object_r:kubernetes_file_t:s0   24 Jun 17 13:48 static-pod-resources
-rw-r--r--.  1 root root system_u:object_r:kubernetes_file_t:s0   10 Jun 17 17:59 test-env
sh-4.4# vi /etc/systemd/system/kubelet.service
sh-4.4# audit2allow -a
sh-4.4# cat /etc/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet
Wants=rpc-statd.service network-online.target
Requires=crio.service kubelet-auto-node-size.service
After=network-online.target crio.service kubelet-auto-node-size.service
After=ostree-finalize-staged.service

[Service]
Type=notify
ExecStartPre=/bin/mkdir --parents /etc/kubernetes/manifests
ExecStartPre=/bin/rm -f /var/lib/kubelet/cpu_manager_state
EnvironmentFile=/etc/os-release
EnvironmentFile=-/etc/kubernetes/kubelet-workaround
EnvironmentFile=-/etc/kubernetes/kubelet-env
EnvironmentFile=/etc/kubernetes/test-env
EnvironmentFile=/etc/node-sizing.env

ExecStart=/usr/bin/hyperkube \
    kubelet \
      --config=/etc/kubernetes/kubelet.conf \
      --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
      --kubeconfig=/var/lib/kubelet/kubeconfig \
      --container-runtime=remote \
      --container-runtime-endpoint=/var/run/crio/crio.sock \
      --runtime-cgroups=/system.slice/crio.service \
      --node-labels=node-role.kubernetes.io/worker,node.openshift.io/os_id=${ID} \
      --node-ip=${KUBELET_NODE_IP} \
      --minimum-container-ttl-duration=6m0s \
      --volume-plugin-dir=/etc/kubernetes/kubelet-plugins/volume/exec \
      --cloud-provider=aws \
       \
      --pod-infra-container-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:fa0f2cad0e8d907a10bf91b2fe234659495a694235a9e2ef7015eb450ce9f1ba \
      --system-reserved=cpu=${SYSTEM_RESERVED_CPU},memory=${SYSTEM_RESERVED_MEMORY} \
      --v=${KUBELET_LOG_LEVEL}

Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
sh-4.4# systemctl daemon-reload
sh-4.4# systemctl status kubelet
● kubelet.service - Kubernetes Kubelet
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-mco-default-madv.conf, 20-logging.conf
   Active: active (running) since Thu 2021-06-17 17:25:38 UTC; 35min ago
 Main PID: 1400 (kubelet)
    Tasks: 16 (limit: 48468)
   Memory: 201.0M
      CPU: 2min 27.198s
   CGroup: /system.slice/kubelet.service
           └─1400 kubelet --config=/etc/kubernetes/kubelet.conf --bootstrap-kubeconfig=/etc/kubernetes/ku>

Jun 17 18:01:04 ip-10-0-183-87 hyperkube[1400]: I0617 18:01:04.109653    1400 scope.go:111] "RemoveContai>
Jun 17 18:01:04 ip-10-0-183-87 hyperkube[1400]: E0617 18:01:04.109979    1400 remote_runtime.go:334] "Con>
Jun 17 18:01:04 ip-10-0-183-87 hyperkube[1400]: I0617 18:01:04.110010    1400 pod_container_deletor.go:52>
Jun 17 18:01:04 ip-10-0-183-87 hyperkube[1400]: I0617 18:01:04.198000    1400 reconciler.go:196] "operati>
Jun 17 18:01:04 ip-10-0-183-87 hyperkube[1400]: I0617 18:01:04.203632    1400 operation_generator.go:829]>
Jun 17 18:01:04 ip-10-0-183-87 hyperkube[1400]: I0617 18:01:04.298840    1400 reconciler.go:319] "Volume >
Jun 17 18:01:05 ip-10-0-183-87 hyperkube[1400]: I0617 18:01:05.110755    1400 kubelet.go:1960] "SyncLoop >
Jun 17 18:01:05 ip-10-0-183-87 hyperkube[1400]: I0617 18:01:05.116113    1400 kubelet.go:1954] "SyncLoop >
Jun 17 18:01:05 ip-10-0-183-87 hyperkube[1400]: I0617 18:01:05.116180    1400 kubelet.go:2153] "Failed to>
Jun 17 18:01:05 ip-10-0-183-87 hyperkube[1400]: I0617 18:01:05.116181    1400 reflector.go:225] Stopping >
sh-4.4# systemctl restart kubelet

Removing debug pod ...


== Restarting the kubelet removes the debug pod, so I need to SSH back in ==

$ ./ssh.sh ip-10-0-183-87.us-west-2.compute.internal
Warning: Permanently added 'ip-10-0-183-87.us-west-2.compute.internal' (ECDSA) to the list of known hosts.

[root@ip-10-0-183-87 kubernetes]# systemctl status kubelet
● kubelet.service - Kubernetes Kubelet
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-mco-default-madv.conf, 20-logging.conf
   Active: inactive (dead) (Result: resources) since Thu 2021-06-17 18:01:34 UTC; 3ms ago
  Process: 1400 ExecStart=/usr/bin/hyperkube kubelet --config=/etc/kubernetes/kubelet.conf --bootstrap>
 Main PID: 1400 (code=exited, status=0/SUCCESS)
      CPU: 0

Jun 17 18:01:24 ip-10-0-183-87 systemd[1]: kubelet.service: Failed to load environment files: Permissi>
Jun 17 18:01:24 ip-10-0-183-87 systemd[1]: kubelet.service: Failed to run 'start-pre' task: Permission>
Jun 17 18:01:24 ip-10-0-183-87 systemd[1]: kubelet.service: Failed with result 'resources'.
Jun 17 18:01:24 ip-10-0-183-87 systemd[1]: Failed to start Kubernetes Kubelet.
Jun 17 18:01:34 ip-10-0-183-87 systemd[1]: kubelet.service: Service RestartSec=10s expired, scheduling>
Jun 17 18:01:34 ip-10-0-183-87 systemd[1]: kubelet.service: Scheduled restart job, restart counter is >
Jun 17 18:01:34 ip-10-0-183-87 systemd[1]: Stopped Kubernetes Kubelet.
Jun 17 18:01:34 ip-10-0-183-87 systemd[1]: kubelet.service: Consumed 0 CPU time

[root@ip-10-0-183-87 kubernetes]# audit2allow -a


#============= init_t ==============
allow init_t kubernetes_file_t:file read;
[root@ip-10-0-183-87 kubernetes]# grep avc /var/log/audit/audit.log | tail -1 - | audit2why
type=AVC msg=audit(1623953918.958:1790): avc:  denied  { read } for  pid=1 comm="systemd" name="test-env" dev="nvme0n1p4" ino=92295647 scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:kubernetes_file_t:s0 tclass=file permissive=0

	Was caused by:
		Missing type enforcement (TE) allow rule.

		You can use audit2allow to generate a loadable module to allow this access.
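
For reference, the denial above reduces to two fields: the source domain (who was denied) and the target type (what it tried to read). A quick extraction, with the record inlined verbatim from the log line above:

```shell
# The AVC record from the audit log, inlined so the extraction can be
# shown without a live node.
avc='type=AVC msg=audit(1623953918.958:1790): avc:  denied  { read } for  pid=1 comm="systemd" name="test-env" dev="nvme0n1p4" ino=92295647 scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:kubernetes_file_t:s0 tclass=file permissive=0'

# The third colon-separated field of each security context is the SELinux type.
src=$(printf '%s\n' "$avc" | sed -n 's/.* scontext=[^:]*:[^:]*:\([^:]*\):.*/\1/p')
tgt=$(printf '%s\n' "$avc" | sed -n 's/.* tcontext=[^:]*:[^:]*:\([^:]*\):.*/\1/p')

# systemd (PID 1, domain init_t) was denied read on kubernetes_file_t
echo "$src -> $tgt"
```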

[root@ip-10-0-183-87 kubernetes]# rpm -q container-selinux
container-selinux-2.162.0-1.module+el8.4.0+11311+9da8acfb.noarch
[root@ip-10-0-183-87 kubernetes]# rpm-ostree status
State: idle
Deployments:
● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a9af00365ac9b38d479a6e33bb54fbc3f1150d7903873b9a6f1230a2a7577622
              CustomOrigin: Managed by machine-config-operator
                   Version: 48.84.202106141119-0 (2021-06-14T11:22:36Z)

  ostree://457db8ff03dda5b3ce1a8e242fd91ddbe6a82f838d1b0047c3d4aeaf6c53f572
                   Version: 48.84.202106091622-0 (2021-06-09T16:25:42Z)

Comment 2 Micah Abbott 2021-06-17 19:37:14 UTC
Simple reproducer:

- create a simple file at /etc/kubernetes/test

```
# cat /etc/kubernetes/test
TEST=foobar
```

- create a simple systemd service with an EnvironmentFile

```
$ systemctl cat echo.service 
# /etc/systemd/system/echo.service
[Unit]
Description=An echo unit
[Service]
Type=oneshot
RemainAfterExit=yes
EnvironmentFile=/etc/kubernetes/kubelet-pause-image-override
ExecStart=/usr/bin/echo ${PAUSE}
[Install]
WantedBy=multi-user.target
```

- systemctl daemon-reload && systemctl start echo.service

```
$ sudo systemctl daemon-reload && sudo systemctl start echo.service
Job for echo.service failed because of unavailable resources or another system error.
See "systemctl status echo.service" and "journalctl -xe" for details.

$ systemctl status echo.service 
● echo.service - An echo unit
   Loaded: loaded (/etc/systemd/system/echo.service; enabled; vendor preset: enabled)
   Active: failed (Result: resources)

Jun 17 19:29:43 localhost systemd[1]: echo.service: Failed to load environment files: Permission denied
Jun 17 19:29:43 localhost systemd[1]: echo.service: Failed to run 'start' task: Permission denied
Jun 17 19:29:43 localhost systemd[1]: echo.service: Failed with result 'resources'.
Jun 17 19:29:43 localhost systemd[1]: Failed to start An echo unit.
```

```
$ sudo audit2allow -a


#============= init_t ==============
allow init_t kubernetes_file_t:file read;
```
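
As a stopgap until the fixed package lands, the audit2allow output above can be packaged into a local policy module. A sketch (the module name `local-kubelet-envfile` is arbitrary; this is a workaround, not a substitute for the container-selinux fix, and further denials such as `open` or `getattr` may surface once `read` is allowed):

```
# local-kubelet-envfile.te -- generated form of the audit2allow output above
module local-kubelet-envfile 1.0;

require {
	type init_t;
	type kubernetes_file_t;
	class file read;
}

# Allow systemd (init_t) to read files labeled kubernetes_file_t,
# e.g. EnvironmentFile= entries under /etc/kubernetes
allow init_t kubernetes_file_t:file read;
```

Build and install with `checkmodule -M -m -o local-kubelet-envfile.mod local-kubelet-envfile.te`, `semodule_package -o local-kubelet-envfile.pp -m local-kubelet-envfile.mod`, and `semodule -i local-kubelet-envfile.pp`, or generate and install in one step with `audit2allow -a -M local-kubelet-envfile && semodule -i local-kubelet-envfile.pp`.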

Comment 3 Micah Abbott 2021-06-17 19:40:32 UTC
(In reply to Micah Abbott from comment #2)
> Simple reproducer:
> ...
> EnvironmentFile=/etc/kubernetes/kubelet-pause-image-override

Argh, this line should read:

EnvironmentFile=/etc/kubernetes/test

Comment 4 Daniel Walsh 2021-08-24 16:16:33 UTC
This is an SELinux policy bug, not a container-selinux bug.

Comment 5 Derrick Ornelas 2021-08-24 17:08:38 UTC
(In reply to Daniel Walsh from comment #4)
> This is an SELinux policy bug not a container-selinux bug.

Does there need to be some change in selinux-policy in addition to what you added to container-selinux in https://github.com/containers/container-selinux/commit/da2828824807d859cee1ac96e1d39c1abd4397da ?

Comment 6 Daniel Walsh 2021-08-24 18:28:31 UTC
Fixed in container-selinux-2.165.0

Comment 12 Micah Abbott 2021-08-26 15:56:49 UTC
FWIW, I built a custom RHCOS 4.9 image using `container-selinux-2.165.1-2.module+el8.5.0+12381+e822eb26.noarch` and confirmed that the issue was fixed using the reproducer:

```
[core@cosa-devsh ~]$ rpm-ostree status
State: idle
Deployments:
* ostree://b92a5782851fe87a9e7b4b5647a8bbb571957599609b5a73aea6623a9dcf9576
                   Version: 49.84.202108261523-0 (2021-08-26T15:25:49Z)
[core@cosa-devsh ~]$ rpm -q container-selinux
container-selinux-2.165.1-2.module+el8.5.0+12381+e822eb26.noarch
[core@cosa-devsh ~]$ systemctl status echo.service
● echo.service - An echo unit
   Loaded: loaded (/etc/systemd/system/echo.service; enabled; vendor preset: enabled)
   Active: active (exited) since Thu 2021-08-26 15:55:05 UTC; 1min 4s ago
  Process: 1360 ExecStart=/usr/bin/echo ${PAUSE} (code=exited, status=0/SUCCESS)
 Main PID: 1360 (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 5610)
   Memory: 0B
   CGroup: /system.slice/echo.service

Aug 26 15:55:05 localhost systemd[1]: Starting An echo unit...
Aug 26 15:55:05 localhost echo[1360]: registry.fedoraproject.org/fedora:34
Aug 26 15:55:05 localhost systemd[1]: Started An echo unit.
[core@cosa-devsh ~]$ sudo ausearch -m avc
<no matches>
```
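
A scripted version of the "is my node fixed?" check, comparing the installed version against the fixed one with `sort -V` (a sketch; the installed version is hard-coded here, where on a live node it would come from rpm, and only the upstream version is compared, not the release field):

```shell
fixed="2.165.1"
have="2.162.0"   # on a node: have=$(rpm -q --qf '%{VERSION}' container-selinux)

# sort -V orders version strings numerically; if the installed version
# sorts first and differs from the fixed one, the package predates the fix.
lowest=$(printf '%s\n%s\n' "$fixed" "$have" | sort -V | head -n1)
if [ "$lowest" = "$have" ] && [ "$have" != "$fixed" ]; then
  echo "affected: container-selinux $have predates $fixed"
else
  echo "ok: container-selinux $have includes the fix"
fi
```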

Comment 28 errata-xmlrpc 2021-11-09 17:38:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: container-tools:rhel8 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:4154