Not using pivot_root(2) leaves the host /proc around in the mount namespace so that it is possible to mount another /proc without any other submount, even if /proc in the container is not fully visible. This flaw allows an attacker to read and modify some parts of the Linux kernel memory.
Name: the Kubernetes Product Security Team
Upstream: Akihiro Suda
Can't access the link? Could you tell me what SELinux label is running when you break out? `id -Z`
Just curious about whether SELinux blocks the exploit, or at least confines it.
I requested that access be granted to the runc directory of the SRTVULNS repo, but it might take a while to be applied.
In the meantime, I see that I don't have any SELinux context applied:
$ cat /etc/redhat-release
Fedora release 29 (Twenty Nine)
$ runc --version
runc version 1.0.0-rc6+dev
$ docker create --name BZ1663068 fedora sh
$ docker export BZ1663068 > BZ1663068.tar
$ docker rm -f BZ1663068
$ mkdir rootfs
$ tar -xf BZ1663068.tar -C rootfs
$ runc spec --rootless
$ runc --root /tmp/runc run --no-pivot BZ1663068
sh-4.4# id -Z
id: --context (-Z) works only on an SELinux-enabled kernel
My kernel does have SELinux enabled; however, I'm not sure whether runc was compiled with SELinux support.
Sorry, `id -Z` might be lying to you.
Inside the container do:
Outside do getenforce, to make sure SELinux is enabled.
It seems the current context is 'unconfined_u:system_r:container_runtime_t:s0-s0:c0.c1023'
[root@runc /]# cat /proc/self/attr/current
[root@runc /]# exit
[jshepher@localhost BZ1663068]$ getenforce
CRI-O defaults to 'no_pivot' being 'false', so it is not affected by this issue. Users should avoid setting 'no_pivot' to 'true' to prevent this flaw when using a vulnerable version of runc with CRI-O.
Podman is not affected by this issue as it doesn't use 'runc run --no-pivot'.
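For anyone wanting to confirm the CRI-O setting on a host, here is a minimal sketch. It assumes the configuration lives at /etc/crio/crio.conf and that the option appears as a plain `no_pivot = true` line; the function name `crio_no_pivot_enabled` is just illustrative, and drop-in config files are not considered.

```shell
#!/bin/sh
# Return 0 if the given config file has an uncommented "no_pivot = true" line.
# Hypothetical helper, not part of CRI-O itself.
crio_no_pivot_enabled() {
    grep -Eq '^[[:space:]]*no_pivot[[:space:]]*=[[:space:]]*true' "$1"
}

# Assumed default location; pass another path as the first argument.
conf="${1:-/etc/crio/crio.conf}"

if [ -r "$conf" ] && crio_no_pivot_enabled "$conf"; then
    echo "no_pivot is enabled in $conf"
else
    echo "no_pivot is not enabled in $conf"
fi
```

A commented-out line such as `# no_pivot = true` is deliberately not matched, since it has no effect on the running configuration.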
Jason, if you run runc with a spec file that does not set a confined SELinux label, or any label at all, it runs as container_runtime_t.
If this was run by Podman, CRI-O, Buildah or Docker, by default it would run as container_t, and the escape would probably be prevented
or at least controlled, since container_t is only able to read /usr and some of /etc, and only able to write to container_file_t.
Of course, if you run on a machine with SELinux disabled or in permissive mode, or you run your container engine with SELinux disabled, then
the escalation would not be prevented by SELinux.
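A rough way to check the host-side precondition described above is to look at what `getenforce` reports. The helper name `selinux_can_confine` is made up for this sketch, and the fallback to "Disabled" when `getenforce` is missing is an assumption, not something the tool does itself.

```shell
#!/bin/sh
# Return 0 only when the reported mode is Enforcing.
# Permissive and Disabled both mean SELinux will not block the escape.
selinux_can_confine() {
    case "$1" in
        Enforcing) return 0 ;;
        *) return 1 ;;
    esac
}

# Treat a missing getenforce binary as "Disabled" (assumption for this sketch).
mode="$(getenforce 2>/dev/null || echo Disabled)"

if selinux_can_confine "$mode"; then
    echo "SELinux is enforcing; a confined label can limit the escape"
else
    echo "SELinux mode is $mode; it will not prevent the escalation"
fi
```

This only covers the host mode. The container engine can still be started with SELinux support turned off, which has to be checked separately.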
Is it possible to make use of the '--no-pivot' option of runc when launching it via docker in RHEL? Trying to figure out if 'docker' on RHEL7 would be affected or not if running without SELinux enforcing.
With SELinux in enforcing mode and a runc spec file that sets a confined SELinux label, the container would run with the container_t label and the escape would be prevented. Containers started by Podman, CRI-O, Buildah or Docker run as container_t by default, so the escape would be prevented, since container_t is only able to read /usr and some of /etc, and only able to write to files labelled container_file_t.
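To tell the two cases apart from inside a container, the type field of the context from /proc/self/attr/current can be compared against container_t. This is a small sketch; the function names `selinux_type` and `is_confined_container` are illustrative, and the sample label is the one reported earlier in this bug.

```shell
#!/bin/sh
# Print the SELinux type, i.e. the third colon-separated field of a context
# of the form user:role:type:level.
selinux_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Return 0 when the context carries the confined container_t type.
is_confined_container() {
    [ "$(selinux_type "$1")" = "container_t" ]
}

# Context reported from the runc-launched container earlier in this bug.
label='unconfined_u:system_r:container_runtime_t:s0-s0:c0.c1023'

if is_confined_container "$label"; then
    echo "confined: $(selinux_type "$label")"
else
    echo "not confined: $(selinux_type "$label")"
fi
```

For a live check, the label can be read with `tr -d '\0' < /proc/self/attr/current` and passed to `is_confined_container`.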
Upstream Git Pull Request: https://github.com/opencontainers/runc/pull/1962
Created runc tracking bugs for this issue:
Affects: fedora-all [bug 1665769]
Upstream pointed out that runc's `--no-pivot` flag is already known to be insecure, as it allows access to the host filesystem from the container. Thus, this particular bug does not warrant a CVE.