Bug 1517212
Summary:          [DOCKER] `oc rsh` failed with "command terminated with exit code 129" when docker system container enabled
Product:          OpenShift Container Platform
Component:        Containers
Version:          3.7.0
Target Release:   3.10.0
Status:           CLOSED CURRENTRELEASE
Severity:         medium
Priority:         high
Hardware:         Unspecified
OS:               Unspecified
Reporter:         Johnny Liu <jialiu>
Assignee:         Giuseppe Scrivano <gscrivan>
QA Contact:       Johnny Liu <jialiu>
CC:               amurdaca, aos-bugs, ddarrah, dwalsh, gscrivan, jhonce, jialiu, jligon, jokerman, lfriedma, mmccomas, vlaad, xtian, xxia
Type:             Bug
Doc Type:         If docs needed, set a value
Doc Text:         An invalid SELinux context for the Docker engine prevented docker exec from working.
Clones:           1540288 (view as bug list)
Bug Depends On:   1540288
Last Closed:      2018-09-11 19:04:02 UTC
Description
Johnny Liu
2017-11-24 10:31:49 UTC
I see these errors from SELinux here:

Nov 29 21:46:47 rhel-atomic kernel: type=1400 audit(1511992007.268:191): avc: denied { read write } for pid=32074 comm="sh" path="/dev/pts/3" dev="devpts" ino=6 scontext=system_u:system_r:svirt_lxc_net_t:s0:c0,c1 tcontext=system_u:object_r:devpts_t:s0 tclass=chr_file
Nov 29 21:46:47 rhel-atomic kernel: type=1400 audit(1511992007.274:192): avc: denied { read write } for pid=32074 comm="sh" path="/dev/pts/3" dev="devpts" ino=6 scontext=system_u:system_r:svirt_lxc_net_t:s0:c0,c1 tcontext=system_u:object_r:devpts_t:s0 tclass=chr_file
Nov 29 21:46:47 rhel-atomic kernel: type=1400 audit(1511992007.278:193): avc: denied { read write } for pid=32074 comm="sh" path="/dev/pts/3" dev="devpts" ino=6 scontext=system_u:system_r:svirt_lxc_net_t:s0:c0,c1 tcontext=system_u:object_r:devpts_t:s0 tclass=chr_file
Nov 29 21:46:47 rhel-atomic kernel: type=1400 audit(1511992007.283:194): avc: denied { read write } for pid=32074 comm="sh" path="/dev/pts/3" dev="devpts" ino=6 scontext=system_u:system_r:svirt_lxc_net_t:s0:c0,c1 tcontext=system_u:object_r:devpts_t:s0 tclass=chr_file

Can you confirm you see the same issue? Dan, is this something we can fix in container-selinux?

Are these volume-mounted into the container?

An issue I've seen with SELinux is that the container-engine for RHEL is using "selinuxProcessLabel" instead of "selinuxLabel": https://github.com/projectatomic/atomic-system-containers/pull/82/commits/6cc6c0be1809d8bf4b52c494f31567a2f25383f8
Jhon, could you please verify this?

But this fix is not enough to get "docker exec -ti" to work; I still see:

Nov 30 13:23:55 rhel-atomic kernel: type=1400 audit(1512048235.441:925): avc: denied { read write } for pid=32852 comm="echo" path="/dev/pts/1" dev="devpts" ino=4 scontext=system_u:system_r:svirt_lxc_net_t:s0:c0,c1 tcontext=system_u:object_r:devpts_t:s0 tclass=chr_file

The Docker container seems to get the wrong process label:

"MountLabel": "system_u:object_r:svirt_sandbox_file_t:s0:c1,c0",
"ProcessLabel": "system_u:system_r:svirt_lxc_net_t:s0:c1,c0",

What is triggering the transition to svirt_lxc_net_t?

So you're seeing this on a docker exec -ti? I have never seen those AVCs.

Yes, these happen on "docker exec -ti" when running as a system container. I believe it is related to the files being labelled unconfined_u:object_r:container_share_t:s0 under /var/lib/containers/atomic. In fact, if I change the label:

chcon system_u:object_r:container_runtime_exec_t:s0 /var/lib/containers/atomic/container-engine.0/rootfs/usr/libexec/docker/docker-runc-current /var/lib/containers/atomic/container-engine.0/rootfs/usr/libexec/docker/*

then the issue doesn't happen. To keep the label when executing these files we will probably need to make some changes in Docker so that /proc/self/task/self/attr/exec is properly set.

So the problem seems to be that runc gets spc_t when executed from containerd. I tried to solve this in containerd, but it is quite an invasive change: we need to force each exec to keep the runtime_t label, and it requires changes in runc as well, since runc forks/execs itself. Otherwise we will need to define a way for system containers to relabel files so that not every file is `unconfined_u:object_r:container_share_t:s0`. This is probably a long-term solution, but it looks like a mess to manage: we would need to find a place to store this metadata, set the label on files when they are pulled to ostree, and also prevent the label from being changed later (e.g. by restorecon).
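To make the /proc attr/exec change mentioned above concrete, here is a minimal sketch, assuming Linux procfs and Go 1.16+, of how a process can pin its current label onto its next execve() so that runc would keep container_runtime_t instead of taking the transition triggered by the container_share_t label on the binary. The helper name is illustrative; this is not the actual Docker or containerd patch.

package main

import (
    "bytes"
    "fmt"
    "os"
    "runtime"
    "syscall"
)

// keepLabelOnExec copies this process's current SELinux label into the
// calling thread's attr/exec file, so the next execve() keeps that label
// (e.g. container_runtime_t) instead of taking the policy transition that
// the container_share_t label on the executed binary would otherwise trigger.
func keepLabelOnExec() error {
    // attr/exec is a per-thread attribute, so pin the goroutine to one thread.
    runtime.LockOSThread()

    // /proc/self/attr/current holds the label of the running process,
    // e.g. system_u:system_r:container_runtime_t:s0.
    cur, err := os.ReadFile("/proc/self/attr/current")
    if err != nil {
        return err
    }
    cur = bytes.TrimRight(cur, "\x00\n")

    execAttr := fmt.Sprintf("/proc/self/task/%d/attr/exec", syscall.Gettid())
    return os.WriteFile(execAttr, cur, 0)
}

func main() {
    if err := keepLabelOnExec(); err != nil {
        fmt.Fprintln(os.Stderr, "keepLabelOnExec:", err)
        os.Exit(1)
    }
    // ... exec runc here; the new process would keep the caller's label.
}

This is essentially what the selinux.Setexeccon() call in the containerd patch quoted below boils down to.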
No, we should be able to solve this using SELinux policy. What exactly is executing runc, systemd? And it is executing it with a label of container_runtime_t, correct? If I look at the processes running in the system container for docker, are they running as container_runtime_t?

docker and containerd are running as container_runtime_t. The problem is that once containerd executes runc (which is labelled container_share_t, like any other file under /var/lib/containers/atomic), runc gets spc_t, and then I believe once runc forks/execs itself, it gets svirt_lxc_net_t. We could probably fix it in containerd, but the same change needs to go into runc for when it executes itself. I've tried something like this in containerd, but it still doesn't solve the issue, as runc must be modified as well:

+ if selinux.SelinuxEnabled() {
+     label, err := selinux.Getcon()
+     if err != nil {
+         return err
+     }
+     if err := selinux.Setexeccon(label); err != nil {
+         return err
+     }
+ }

Dan, is it fine to move this bug to container-selinux?

If we label runc as container_runtime_exec_t, does that fix the problem? What is the path under /var/lib/containers/atomic?

runc is already labelled as container_runtime_exec_t, but that is not enough, since files under /var/lib/containers/atomic have the unconfined_u:object_r:container_share_t:s0 label. When containerd executes runc, the runc process is automatically labelled spc_t (container_runtime_exec_t + share_t => spc_t). runc then forks and re-execs itself, and the process becomes svirt_lxc_net_t (spc_t + container_share_t => container_share_t).

Sure; I am asking if you could label /var/lib/containers/atomic/.../usr/bin/runc as container_runtime_exec_t and see if this fixes the issue.

Yes, sorry for the misunderstanding. If I do:

# chcon system_u:object_r:container_runtime_exec_t:s0 /var/lib/containers/atomic/*/{rootfs/usr/libexec/docker/*,rootfs/usr/bin/docker*}
# systemctl start container-engine

then it works well for me.

I need to add a note to the previous comment. The sequence I tried is only for troubleshooting; since the files are stored in OSTree, we would need to label them before committing them to the repository. In general I think it would be good if the system container files could have the same labels as on the host, as the ostree deduplication with the host would then really work.

Well, if they had the same labels as the host, that would probably work, since they would be labeled as if they were in /usr, and that is how runc would be labeled. The problem is, if they are not on the host, then how should they be labeled? Labeling the default as container_share_t or container_var_lib_t is wrong, at least in this example, since we have a transition rule that says when container_runtime_t executes container_share_t it will execute it as spc_t.

Would it be possible to have the files in the system container labelled as if they were on the host? In other words, could SELinux be configured in a way that the prefix /var/lib/containers/atomic/*/rootfs/ is dropped? We could label the files when they are first imported into OSTree, which can easily be done with selabel_lookup_raw() and setfscreatecon_raw() from containers/image, but we would also need to maintain the correct labels if the user uses restorecon. Could that be easily achieved?

Proposed patch for fixing it in containers/image: https://github.com/containers/image/pull/389

Re-ran a system container install with container-engine enabled; it still failed.
# oc rsh docker-registry-1-8zcgt
command terminated with exit code 129
# openshift version
openshift v3.9.0-0.22.0
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.8
# rpm -q atomic
atomic-1.20.1-9.git436cf5d.el7.x86_64
# atomic images list
<--snip-->
> brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/container-engine v3.9 122bc4a0f7fd 2018-01-22 02:34 126.8 MB ostree
<--snip-->
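For reference, the relabel-on-import approach proposed above for containers/image (selabel_lookup_raw() plus setfscreatecon_raw()) could look roughly like the following minimal sketch, assuming cgo and the libselinux development headers are available. The helper name and target path are illustrative; this is not the code from the linked PR.

package main

/*
#cgo pkg-config: libselinux
#include <stdlib.h>
#include <selinux/selinux.h>
#include <selinux/label.h>
*/
import "C"

import (
    "fmt"
    "os"
    "unsafe"
)

// setCreateLabelFor asks the loaded file-contexts policy which label hostPath
// would carry on the host and sets it as the creation context for the next
// file this thread creates, so the imported copy keeps the "host" label
// (e.g. container_runtime_exec_t for runc) instead of container_share_t.
func setCreateLabelFor(hnd *C.struct_selabel_handle, hostPath string, mode os.FileMode) error {
    cPath := C.CString(hostPath)
    defer C.free(unsafe.Pointer(cPath))

    var con *C.char
    if ret, err := C.selabel_lookup_raw(hnd, &con, cPath, C.int(mode)); ret != 0 {
        return fmt.Errorf("selabel_lookup_raw(%s): %v", hostPath, err)
    }
    defer C.freecon(con)

    if ret, err := C.setfscreatecon_raw(con); ret != 0 {
        return fmt.Errorf("setfscreatecon_raw: %v", err)
    }
    return nil
}

func main() {
    // Open the file-contexts backend of the currently loaded policy.
    hnd, err := C.selabel_open(C.SELABEL_CTX_FILE, nil, 0)
    if hnd == nil {
        fmt.Fprintln(os.Stderr, "selabel_open:", err)
        os.Exit(1)
    }
    defer C.selabel_close(hnd)

    // Hypothetical example: label the imported runc as the host policy would
    // label /usr/bin/runc, then create the file under the storage root.
    if err := setCreateLabelFor(hnd, "/usr/bin/runc", 0755); err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    f, err := os.Create("/var/lib/containers/atomic/example/rootfs/usr/bin/runc") // illustrative path
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    f.Close()

    // Reset the creation context so later files get default labels again.
    C.setfscreatecon_raw(nil)
}

Keeping those labels intact if a user later runs restorecon is a separate file-contexts/policy question and is not covered by this sketch.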
I want to know which rpm/component the fix PR will land in, and what the fixed version will be.
The fix didn't hit Skopeo yet: https://github.com/projectatomic/skopeo/pull/470

Hi Giuseppe, I am confused by this fix. How could the skopeo fix resolve my initial report; where is skopeo used in an OCP install? If I want to verify this bug, I need to know which rpm/image includes the fix and what the fixed version should be.

Skopeo is used to pull system containers to the ostree storage. The problem was in the way the imported files for a system container were SELinux labelled. To be able to maintain the label required by the docker runtime, we need to be sure the files are labelled in the correct way. To solve this we need to relabel the files on import, which is done by Skopeo.

How do we track this correctly? Do we move it to Skopeo, or create a new BZ for Skopeo and make this one blocked on the new BZ?

(In reply to Giuseppe Scrivano from comment #23)
> Skopeo is used to pull system containers to the ostree storage.
>
> The problem was in the way the imported files for a system container were
> SELinux labelled. To be able to maintain the label required by the docker
> runtime we need to be sure the files are labelled in the correct way. To
> solve this we need to relabel the files on import, which is done by Skopeo.

As far as I know, the image pull in the whole installation is done by the `atomic pull` command; am I missing something? Or does the "atomic pull" command call Skopeo silently?

> How do we track this correctly? Do we move it to Skopeo or create a new BZ
> for Skopeo and make this blocked on the new BZ?

If this bug depends on Skopeo, I propose opening a new BZ for Skopeo to make sure the Skopeo fix is really released in the Extras channel, and making this bug blocked on the new Skopeo BZ.

Re-ran testing with the latest skopeo on RHEL 7.5; the issue still reproduces. It seems the fix has not landed in RHEL 7.5 yet. Once 1540288 is fixed, I will re-run.

# rpm -qa | grep -i skopeo
skopeo-containers-0.1.28-1.git0270e56.el7.x86_64
skopeo-0.1.28-1.git0270e56.el7.x86_64
# rpm -q atomic
atomic-1.22.1-1.gitd36c015.el7.x86_64
# openshift version
openshift v3.9.0-0.48.0
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.8

Hripps - does this mean this is no longer a blocker for RHEL 7.4 and OCP 3.9? We still need to know if this is a problem with 7.4 (Johnny is testing). But if it is a problem in 7.4, do we still need a 7.4.5.1 release to fix it for OCP 3.9? Will OCP 3.10 run on 7.4, or only 7.5+?

@Laurie - this is no longer a blocker for OCP 3.9. I can't speak to whether or not it is a blocker for 7.4. I am tagging Jeff Ligon, who may be able to comment on that.

This is working for 7.4.5, so it should NOT be a blocker for 7.4 or OCP 3.9. I don't believe we need a 7.4.5.1 due to this bug.

For RHEL 7.4 + OCP 3.9 testing, it is working well on both the latest RHEL 7.4 and an older RHEL 7.4; it seems skopeo-0.1.28-1.git0270e56 fixes this issue.
Older rhel74:

[root@qe-jialiu391rhl74bad-master-etcd-1 ~]# oc rsh docker-registry-1-j2x6s
sh-4.2$ exit
exit
[root@qe-jialiu391rhl74bad-master-etcd-1 ~]# rpm -qa selinux\* container\* atomic\* skopeo\* docker\* | sort
atomic-1.22.1-1.gitd36c015.el7.x86_64
atomic-openshift-docker-excluder-3.9.3-1.git.0.6a743cc.el7.noarch
atomic-openshift-excluder-3.9.3-1.git.0.6a743cc.el7.noarch
atomic-registries-1.22.1-1.gitd36c015.el7.x86_64
container-selinux-2.42-1.gitad8f0f7.el7.noarch
container-storage-setup-0.8.0-3.git1d27ecf.el7.noarch
docker-1.13.1-53.git774336d.el7.x86_64
docker-client-1.13.1-53.git774336d.el7.x86_64
docker-common-1.13.1-53.git774336d.el7.x86_64
docker-rhel-push-plugin-1.13.1-53.git774336d.el7.x86_64
selinux-policy-3.13.1-166.el7_4.7.noarch
selinux-policy-targeted-3.13.1-166.el7_4.7.noarch
skopeo-0.1.28-1.git0270e56.el7.x86_64
skopeo-containers-0.1.28-1.git0270e56.el7.x86_64

Latest rhel74:

[root@host-192-168-100-3 ~]# oc rsh docker-registry-1-qdld8
sh-4.2$ exit
exit
[root@host-192-168-100-3 ~]# rpm -qa selinux\* container\* atomic\* skopeo\* docker\* | sort
atomic-1.22.1-1.gitd36c015.el7.x86_64
atomic-openshift-docker-excluder-3.9.3-1.git.0.6a743cc.el7.noarch
atomic-openshift-excluder-3.9.3-1.git.0.6a743cc.el7.noarch
atomic-registries-1.22.1-1.gitd36c015.el7.x86_64
container-selinux-2.42-1.gitad8f0f7.el7.noarch
container-storage-setup-0.8.0-3.git1d27ecf.el7.noarch
docker-1.13.1-53.git774336d.el7.x86_64
docker-client-1.13.1-53.git774336d.el7.x86_64
docker-common-1.13.1-53.git774336d.el7.x86_64
docker-rhel-push-plugin-1.13.1-53.git774336d.el7.x86_64
selinux-policy-3.13.1-166.el7_4.9.noarch
selinux-policy-targeted-3.13.1-166.el7_4.9.noarch
skopeo-0.1.28-1.git0270e56.el7.x86_64
skopeo-containers-0.1.28-1.git0270e56.el7.x86_64

For RHEL 7.5 + OCP 3.9 testing, it is blocked by BZ#1550332. According to comment 26 in BZ#1550332, it seems a newer container-selinux (2.51) fixes the blocker bug, but for now I can only get container-selinux-2.50-1.el7 from RHEL-7.5-20180228.1. Once I get a newer RHEL puddle including the fix, I will re-test.

For RHEL 7.5 (RHEL-7.5-20180308.1) + OCP 3.9 testing, it is now working well.

[root@host-192-168-100-8 ~]# oc rsh docker-registry-1-jsg8v
sh-4.2$ exit
exit
[root@host-192-168-100-8 ~]# rpm -qa selinux\* container\* atomic\* skopeo\* docker\* | sort
atomic-1.22.1-1.gitd36c015.el7.x86_64
atomic-openshift-docker-excluder-3.9.4-1.git.0.35fdfc4.el7.noarch
atomic-openshift-excluder-3.9.4-1.git.0.35fdfc4.el7.noarch
atomic-registries-1.22.1-1.gitd36c015.el7.x86_64
container-selinux-2.51-1.el7.noarch
container-storage-setup-0.9.0-1.rhel75.gite0997c3.el7.noarch
docker-1.13.1-56.git774336d.el7.x86_64
docker-client-1.13.1-56.git774336d.el7.x86_64
docker-common-1.13.1-56.git774336d.el7.x86_64
docker-rhel-push-plugin-1.13.1-56.git774336d.el7.x86_64
selinux-policy-3.13.1-192.el7.noarch
selinux-policy-targeted-3.13.1-192.el7.noarch
skopeo-0.1.28-1.git0270e56.el7.x86_64
skopeo-containers-0.1.28-1.git0270e56.el7.x86_64
[root@host-192-168-100-8 ~]# openshift version
openshift v3.9.4
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.16
[root@host-192-168-100-8 ~]# rpm -q kernel
kernel-3.10.0-860.el7.x86_64