Bug 1941380

Summary: Podman - secondary groups not available in container when using userns=keep-id

Product: Red Hat Enterprise Linux 8
Component: podman
Version: 8.3
Status: CLOSED ERRATA
Severity: medium
Priority: unspecified
Fixed In Version: podman-3.2
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Reporter: Suhaas Bhat <subhat>
Assignee: Tom Sweeney <tsweeney>
QA Contact: Yuhui Jiang <yujiang>
Docs Contact: Gabriela Nečasová <gnecasov>
CC: bbaude, dornelas, dwalsh, gnecasov, jligon, jnovy, lsm5, mheon, pthomas, tsweeney, umohnani, ypu
Type: Bug
Last Closed: 2021-11-09 17:37:14 UTC
Bug Blocks: 1186913

Description Suhaas Bhat 2021-03-22 01:18:17 UTC
Description of problem:
We are trying to bind-mount a directory from the host operating system into the container. The user who starts the container has write access to that directory through secondary groups, so the user inside the container should have the same secondary groups.

https://github.com/containers/podman/issues/4185 and http://docs.podman.io/en/latest/markdown/podman-run.1.html explain that crun together with an annotation can be used to keep the secondary groups inside the container.

However, this does not work on RHEL 8.3 with podman 2.0.5 and crun 0.14.1. SELinux is disabled on the host system.



Version-Release number of selected component (if applicable):
RHEL 8.3 
podman 2.0.5

How reproducible:
Fully

Steps to Reproduce:

[xxxx@host ~]$ grep Groups /proc/self/status
Groups: 1001 1003 1004
[xxxx@host ~]$ id
uid=1001(xxxx) gid=1001(xxxx) groups=1001(xxxx),1003(group1),1004(group2)


[xxxx@host ~]$  podman --runtime /usr/bin/crun run -it  --annotation run.oci.keep_original_groups=1   --userns=keep-id rhscl/python-36-rhel7 /bin/bash
(app-root) grep Groups /proc/self/status
Groups: 1001 65534 65534
(app-root) id
uid=1001(default) gid=0(root) groups=0(root),1001,65534


Where are you experiencing the behavior? What environment?
Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.15.1
  cgroupVersion: v1
  conmon:
    package: conmon-2.0.20-2.module+el8.3.0+8221+97165c3f.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.20, commit: 77ce9fd1e61ea89bd6cdc621b07446dd9e80e5b6'
  cpus: 80
  distribution:
    distribution: '"rhel"'
    version: "8.3"
  eventLogger: file
  hostname: xxxx
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1001
      size: 1
    - container_id: 1
      host_id: 165536
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1001
      size: 1
    - container_id: 1
      host_id: 165536
      size: 65536
  kernel: 4.18.0-240.10.1.el8_3.x86_64
  linkmode: dynamic
  memFree: 502806974464
  memTotal: 540653506560
  ociRuntime:
    name: runc
    package: runc-1.0.0-68.rc92.module+el8.3.0+8221+97165c3f.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.2-dev'
  os: linux
  remoteSocket:
    path: /run/user/1001/podman/podman.sock
  rootless: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.4-2.module+el8.3.0+8221+97165c3f.x86_64
    version: |-
      slirp4netns version 1.1.4
      commit: b66ffa8e262507e37fca689822d23430f3357fe8
      libslirp: 4.3.1
      SLIRP_CONFIG_VERSION_MAX: 3
  swapFree: 4294963200
  swapTotal: 4294963200
  uptime: 313h 27m 23.61s (Approximately 13.04 days)
registries:
  search:
  - registry.access.redhat.com
  - registry.redhat.io
  - docker.io
store:
  configFile: /home/xxxx/.config/containers/storage.conf
  containerStore:
    number: 3
    paused: 0
    running: 0
    stopped: 3
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.1.2-3.module+el8.3.0+8221+97165c3f.x86_64
      Version: |-
        fuse-overlayfs: version 1.1.0
        FUSE library version 3.2.1
        using FUSE kernel interface version 7.26
  graphRoot: /home/xxxx/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 11
  runRoot: /run/user/1001
  volumePath: /home/xxxx/.local/share/containers/storage/volumes
version:
  APIVersion: 1
  Built: 1600877882
  BuiltTime: Wed Sep 23 18:18:02 2020
  GitCommit: ""
  GoVersion: go1.14.7
  OsArch: linux/amd64
  Version: 2.0.5

Actual results:
Secondary groups are not available in the container when using --userns=keep-id.

Expected results:
Secondary groups should be available in the container when using --userns=keep-id.

Additional info:

Comment 1 Matthew Heon 2021-03-22 13:37:42 UTC
I think this is not a bug in the sense that the container is behaving as expected. However, we should keep this open as it reveals an issue with our documentation that should be addressed.

In brief: the container process has retained the supplementary groups of the launching process (`podman run` as the user in question), but does not have the appropriate permissions to actually map them into the rootless user namespace. The rootless user namespace, in case you are unaware, is how a non-root user on the system gains access to more than one UID and GID for launching containers: the user is assigned ranges of UIDs and GIDs they can use in /etc/subuid and /etc/subgid, and these (plus the user's own UID and GID) are mapped into the rootless container. This mapping does not match the allocation on the host - by default, the UID/GID of the user launching the container is mapped to root in the container, and the subuid/subgid entries are added starting at UID/GID 1. The important bit here is that these are the only host UIDs and GIDs the container can actually see - we cannot add any others, for security reasons.
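
For illustration (placeholder user name, and the ID ranges taken from the podman info output in this report), the mapping can be inspected like this - note that only the user's own ID plus the subuid/subgid range appear:

$ grep xxxx /etc/subuid /etc/subgid
/etc/subuid:xxxx:165536:65536
/etc/subgid:xxxx:165536:65536
$ podman unshare cat /proc/self/uid_map
         0       1001          1
         1     165536      65536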

On to the groups that we leak: while the kernel does know that the process we've launched is in a user namespace, for the most part it is treated like any other process on the system. The list of groups the process belongs to is stored separately from the user namespace mappings - so a process can be part of a group but not see it (the user namespace maps any ID that is not mapped into the namespace to 65534, hence the output of your `id` command showing the process as part of 65534). If you were to get the container's PID on the host with `podman inspect` and then look at `/proc/$PID/status` on the host for that PID, I think you would see the list of groups you are expecting - the container still has access to them, we just can't see that from inside the container.
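
To spell that out, a sketch of the check from the host side (the container name is a placeholder; the Groups line assumes the host groups shown in this report):

$ CTR_PID=$(podman inspect --format '{{.State.Pid}}' <container-name>)
$ grep Groups /proc/$CTR_PID/status
Groups: 1001 1003 1004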

The unfortunate limitation here is that we cannot chown existing or new files to those groups. The kernel squashes them all down to a single GID that means "not mapped", and we can't chown to that GID because it can stand for multiple UIDs and GIDs - the kernel won't let us. So this flag grants the container the ability to act as the original groups of the launching user, but not the ability to create new files owned by those UIDs or GIDs. We recognize that this is very confusing and blocks some use cases, but security requires that things be done this way.
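
As a hypothetical session inside a container started with the annotation (exact error text may vary by kernel), the limitation looks like this:

(app-root) touch /tmp/testfile
(app-root) chown :1003 /tmp/testfile
chown: changing group of '/tmp/testfile': Invalid argument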

Comment 4 Daniel Walsh 2021-03-23 20:30:24 UTC
I have begun work on a PR to make this easier to do, or at least easier for users to discover: https://github.com/containers/podman/pull/9495. Sadly, between PTO and other work, I have not gotten back to it in a couple of weeks.
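
For reference, the idea (assuming the PR lands in its current shape) is to expose this as a --group-add keep-groups option, so the reproducer above would become something like (crun is still required):

$ podman --runtime /usr/bin/crun run -it --group-add keep-groups --userns=keep-id rhscl/python-36-rhel7 /bin/bash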

Comment 5 Jindrich Novy 2021-07-21 11:10:00 UTC
Dan, seeing https://github.com/containers/podman/pull/9495 merged, can we consider this ready for QE?

Comment 6 Daniel Walsh 2021-07-21 16:43:53 UTC
Yes

Comment 7 Jindrich Novy 2021-07-21 16:45:54 UTC
Can we get QA ack please?

Comment 14 errata-xmlrpc 2021-11-09 17:37:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: container-tools:rhel8 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:4154