Bug 2345676
| Field | Value |
|---|---|
| Summary | A vulnerability in Podman and crun allows containers with SYS_PTRACE to hijack host file descriptors (e.g., seccomp.bpf via /proc/[pid]/fd) during podman top execution, enabling seccomp bypass and container escape in all environments. |
| Product | [Fedora] Fedora |
| Component | podman |
| Version | rawhide |
| Hardware | x86_64 |
| OS | Linux |
| Status | CLOSED UPSTREAM |
| Severity | medium |
| Priority | unspecified |
| Keywords | Reopened |
| Reporter | m202372036 |
| Assignee | Lokesh Mandvekar <lsm5> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | bbaude, debarshir, dwalsh, go-sig, gscrivan, jnovy, lsm5, m202372036, mboddu, mheon, nsella, patrick, pholzing, suraj.ghimire7 |
| Last Closed | 2025-02-18 17:01:16 UTC |
Description
m202372036
2025-02-14 05:25:02 UTC
how many times are you going to report the same "issue" through different channels?

As I've already explained in the past, this is not a security issue, there is no way to protect the host when using CAP_SYS_PTRACE. If you pass that capability, you know that the container payload is trusted, it is equivalent to running on the host.

You don't need such complicated attacks with SYS_PTRACE. You can simply attach a debugger to the exec'ed process as soon as it enters the namespace and run any command from there.

It is enough you install gdb in your container, then attach the crun process as soon as it enters the PID namespace, at this point there is no seccomp profile in place as well as many other security measures (selinux, apparmor, capabilities...):

(In reply to Giuseppe Scrivano from comment #1)
> how many times are you going to report the same "issue" through different channels?
>
> As I've already explained in the past, this is not a security issue, there is no way to protect the host when using CAP_SYS_PTRACE. If you pass that capability, you know that the container payload is trusted, it is equivalent to running on the host.
>
> You don't need such complicated attacks with SYS_PTRACE. You can simply attach a debugger to the exec'ed process as soon as it enters the namespace and run any command from there.
>
> It is enough you install gdb in your container, then attach the crun process as soon as it enters the PID namespace, at this point there is no seccomp profile in place as well as many other security measures (selinux, apparmor, capabilities...):

We've adhered to community protocols by reporting this twice via email, only to face repeated dismissal. Your persistent refusal to acknowledge the issue forces public discourse.

It is difficult to assume that all images used by users are so-called "trusted". In fact, even on personal computers, developers or testers need to download images from DockerHub.
You cannot shift the security responsibility to users to avoid this design security issue. If this function is not designed to be secure, once a user accidentally uses a malicious image, it will inevitably cause an escape problem.

Either engineer proper safeguards for CAP_SYS_PTRACE implementations or issue unambiguous warnings in documentation. Your current posture amounts to negligence: When (not if) users encounter malicious images through routine workflows, container escapes become inevitable. Security through willful ignorance isn't security at all.

Nobody is proposing to trust all the images you pull from a remote registry, I am just saying to not give capabilities to containers you don't trust.

You've proposed a complicated attack that works only when CAP_SYS_PTRACE is granted. You don't need such a complicated attack once you have CAP_SYS_PTRACE, you can attach to a process as soon as it enters the PID namespace. That is done by `podman top` as well as `podman exec` on every healthcheck.

It is a well known attack vector, there is nothing new. Just look for "cap_sys_ptrace container escape" on Google.

There is a reason why we drop some capabilities by default. If you add them back you are loosening the protection offered by the runtime. Don't grant capabilities if you don't know what you are doing. In this case, you have added a capability like CAP_SYS_PTRACE, that turns your container into a privileged container:

--privileged

Give extended privileges to this container. The default is false.

By default, Podman containers are unprivileged (=false) and cannot, for example, modify parts of the operating system. This is because by default a container is only allowed limited access to devices. A "privileged" container is given the same access to devices as the user launching the container, with the exception of virtual consoles (/dev/tty\d+) when running in systemd mode (--systemd=always).
A privileged container turns off the security features that isolate the container from the host. Dropped Capabilities, limited devices, read-only mount points, Apparmor/SELinux separation, and Seccomp filters are all disabled. Due to the disabled security features, the privileged field should almost never be set as containers can easily break out of confinement.

Documented upstream that there is a risk involved in using these capabilities: https://github.com/containers/podman/pull/25348

Thanks
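
For anyone who wants to confirm whether a given process actually holds CAP_SYS_PTRACE, the following is a minimal sketch, not something taken from this thread. It assumes a Linux /proc filesystem and relies on CAP_SYS_PTRACE being bit 19 in linux/capability.h; the helper names `has_capability` and `current_process_has_ptrace` are hypothetical and chosen only for illustration.

```python
CAP_SYS_PTRACE = 19  # bit index of CAP_SYS_PTRACE in linux/capability.h


def has_capability(status_text: str, cap_bit: int = CAP_SYS_PTRACE) -> bool:
    """Return True if the CapEff mask found in /proc/<pid>/status text has cap_bit set."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The kernel prints the effective capability set as a hex bitmask.
            mask = int(line.split(":", 1)[1].strip(), 16)
            return bool(mask & (1 << cap_bit))
    raise ValueError("no CapEff line found in status text")


def current_process_has_ptrace() -> bool:
    """Check the calling process (hypothetical helper; requires a Linux /proc)."""
    with open("/proc/self/status") as status_file:
        return has_capability(status_file.read())
```

Running `current_process_has_ptrace()` inside a container shows whether the capability was added back when the container was started; a result of True means the container should be treated as trusted, as discussed above.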