Bug 2249514 - A FreeIPA user can't create --user scope units (or run rootless containers with Podman)
Summary: A FreeIPA user can't create --user scope units (or run rootless containers wi...
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: 39
Hardware: Unspecified
OS: Linux
unspecified
medium
Target Milestone: ---
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2023-11-13 15:09 UTC by Sam Morris
Modified: 2023-11-15 14:46 UTC (History)
19 users (show)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2023-11-15 14:46:54 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)
'busctl monitor org.freedesktop.systemd1' when called by systemd-run launched form an gdm session (64.04 KB, text/plain)
2023-11-15 13:06 UTC, Sam Morris
no flags Details
'busctl monitor org.freedesktop.systemd1' when called by systemd-run launched from an ssh session (53.74 KB, text/plain)
2023-11-15 13:09 UTC, Sam Morris
no flags Details

Description Sam Morris 2023-11-13 15:09:33 UTC
(This is probably not podman's fault but I'm not experienced enough to put it directly into the right component so bear with me).

On a newly installed Fedora 39 system, I log in with a FreeIPA user and I can't launch rootless containers:

  $ podman run --rm -it registry.access.redhat.com/ubi9/ubi
  Error: OCI runtime error: crun: error `creating` systemd unit `libpod-8eb0e2302fe6400e16115dbc32b2800e819d35c0bf35a1074e2015ee2bff4f2c.scope`: got `failed`

Corresponding log messages:

Nov 13 14:54:57 systemd[6789]: podman-102226.scope: Couldn't move process 102226 to requested cgroup '/user.slice/user-1673000001.slice/user/user.slice/podman-102226.scope' (directly or via the system bus): Input/output error
Nov 13 14:54:57 systemd[6789]: podman-102226.scope: Failed to add PIDs to scope's control group: Permission denied
Nov 13 14:54:57 systemd[6789]: podman-102226.scope: Failed with result 'resources'.
Nov 13 14:54:57 systemd[6789]: Failed to start podman-102226.scope.
Nov 13 14:54:57 podman[102226]: 2023-11-13 14:54:57.906778741 +0000 GMT m=+0.063767881 container create 1f50eb407b5453eb8f7f1ee0b7ca6b971c3446c6c2bc51680c9c829417976f03 (image=registry.access.redhat.com/ubi9/ubi:latest, name=ecstatic_shtern, com.redhat.component=ubi9-container, release=1361, architecture=x86_64, name=ubi9, maintainer=Red Hat, Inc., description=The Universal Base Image is designed and engineered to be the base layer for all of your containerized applications, middleware and utilities. This base image is freely redistributable, but Red Hat only supports Red Hat technologies through subscriptions for Red Hat products. This image is maintained by Red Hat and updated regularly., com.redhat.license_terms=https://www.redhat.com/en/about/red-hat-end-user-license-agreements#UBI, build-date=2023-11-01T01:38:36, vcs-type=git, io.k8s.display-name=Red Hat Universal Base Image 9, version=9.3, url=https://access.redhat.com/containers/#/registry.access.redhat.com/ubi9/images/9.3-1361, io.k8s.description=The Universal Base Image is designed and engineered to be the base layer for all of your containerized applications, middleware and utilities. This base image is freely redistributable, but Red Hat only supports Red Hat technologies through subscriptions for Red Hat products. This image is maintained by Red Hat and updated regularly., io.buildah.version=1.29.0, vcs-ref=eb726081eeafc660c182aae53074ec6631cb473e, io.openshift.expose-services=, vendor=Red Hat, Inc., summary=Provides the latest release of Red Hat Universal Base Image 9., io.openshift.tags=base rhel9, distribution-scope=public)
Nov 13 14:54:57 systemd[6789]: libpod-conmon-1f50eb407b5453eb8f7f1ee0b7ca6b971c3446c6c2bc51680c9c829417976f03.scope: Couldn't move process 102247 to requested cgroup '/user.slice/user-1673000001.slice/user/user.slice/libpod-conmon-1f50eb407b5453eb8f7f1ee0b7ca6b971c3446c6c2bc51680c9c829417976f03.scope' (directly or via the system bus): Input/output error
Nov 13 14:54:57 systemd[6789]: libpod-conmon-1f50eb407b5453eb8f7f1ee0b7ca6b971c3446c6c2bc51680c9c829417976f03.scope: Failed to add PIDs to scope's control group: Permission denied
Nov 13 14:54:57 systemd[6789]: libpod-conmon-1f50eb407b5453eb8f7f1ee0b7ca6b971c3446c6c2bc51680c9c829417976f03.scope: Failed with result 'resources'.
Nov 13 14:54:57 systemd[6789]: Failed to start libpod-conmon-1f50eb407b5453eb8f7f1ee0b7ca6b971c3446c6c2bc51680c9c829417976f03.scope.
Nov 13 14:54:57 systemd[6789]: libpod-1f50eb407b5453eb8f7f1ee0b7ca6b971c3446c6c2bc51680c9c829417976f03.scope: Couldn't move process 102250 to requested cgroup '/user.slice/user-1673000001.slice/user/user.slice/libpod-1f50eb407b5453eb8f7f1ee0b7ca6b971c3446c6c2bc51680c9c829417976f03.scope' (directly or via the system bus): Input/output error
Nov 13 14:54:57 systemd[6789]: libpod-1f50eb407b5453eb8f7f1ee0b7ca6b971c3446c6c2bc51680c9c829417976f03.scope: Failed to add PIDs to scope's control group: Permission denied
Nov 13 14:54:57 systemd[6789]: libpod-1f50eb407b5453eb8f7f1ee0b7ca6b971c3446c6c2bc51680c9c829417976f03.scope: Failed with result 'resources'.
Nov 13 14:54:57 systemd[6789]: Failed to start libpod-1f50eb407b5453eb8f7f1ee0b7ca6b971c3446c6c2bc51680c9c829417976f03.scope - libcrun container.
Nov 13 14:54:57 conmon[102248]: conmon 1f50eb407b5453eb8f7f <error>: Failed to receive console file descriptor Communication error on send
Nov 13 14:54:57 podman[102226]: 2023-11-13 14:54:57.890070831 +0000 GMT m=+0.047059963 image pull 5865cc7a5402a614e695617596d58ac8662fc15e32ede2127be993224c00650d registry.access.redhat.com/ubi9/ubi
Nov 13 14:54:58 podman[102226]: 2023-11-13 14:54:58.003573131 +0000 GMT m=+0.160562321 container remove 1f50eb407b5453eb8f7f1ee0b7ca6b971c3446c6c2bc51680c9c829417976f03 (image=registry.access.redhat.com/ubi9/ubi:latest, name=ecstatic_shtern, description=The Universal Base Image is designed and engineered to be the base layer for all of your containerized applications, middleware and utilities. This base image is freely redistributable, but Red Hat only supports Red Hat technologies through subscriptions for Red Hat products. This image is maintained by Red Hat and updated regularly., io.k8s.description=The Universal Base Image is designed and engineered to be the base layer for all of your containerized applications, middleware and utilities. This base image is freely redistributable, but Red Hat only supports Red Hat technologies through subscriptions for Red Hat products. This image is maintained by Red Hat and updated regularly., io.k8s.display-name=Red Hat Universal Base Image 9, io.openshift.expose-services=, vcs-type=git, architecture=x86_64, com.redhat.component=ubi9-container, com.redhat.license_terms=https://www.redhat.com/en/about/red-hat-end-user-license-agreements#UBI, url=https://access.redhat.com/containers/#/registry.access.redhat.com/ubi9/images/9.3-1361, summary=Provides the latest release of Red Hat Universal Base Image 9., vendor=Red Hat, Inc., distribution-scope=public, vcs-ref=eb726081eeafc660c182aae53074ec6631cb473e, release=1361, io.openshift.tags=base rhel9, name=ubi9, version=9.3, io.buildah.version=1.29.0, maintainer=Red Hat, Inc., build-date=2023-11-01T01:38:36)
Nov 13 14:54:58 systemd[6789]: podman-102252.scope: Couldn't move process 102252 to requested cgroup '/user.slice/user-1673000001.slice/user/user.slice/podman-102252.scope' (directly or via the system bus): Input/output error
Nov 13 14:54:58 systemd[6789]: podman-102252.scope: Failed to add PIDs to scope's control group: Permission denied
Nov 13 14:54:58 systemd[6789]: podman-102252.scope: Failed with result 'resources'.
Nov 13 14:54:58 systemd[6789]: Failed to start podman-102252.scope.

I attached strace to my user's systemd process and it seems to be unable to move the container's processes into the container's scope unit:

  6789  renameat(AT_FDCWD, "/run/user/1673000001/systemd/units/.#invocation:podman-102042.scopeb3c5b6f101f92a55", AT_FDCWD, "/run/user/1673000001/systemd/units/invocation:podman-102042.scope") = 0
  6789  openat(AT_FDCWD, "/sys/fs/cgroup/user.slice/user-1673000001.slice/user/user.slice/podman-102042.scope/cgroup.procs", O_WRONLY|O_NOCTTY|O_CLOEXEC) = 27
  6789  fcntl(27, F_GETFL)                = 0x8001 (flags O_WRONLY|O_LARGEFILE)
  6789  write(27, "102042\n", 7)          = -1 EACCES (Permission denied)
  6789  close(27)                         = 0

Which is also reproducible without needing podman at all:

  $ systemd-run  --scope --user echo hello 
  Job failed. See "journalctl -xe" for details.

Hence this is likely not actually podman's fault but I'm not sure which component is to blame. Maybe I am to blame and I've forgotten to add some configuration to allow non-local users to run containers!

Reproducible: Always

Steps to Reproduce:
1. Install Fedora 39
2. Log in with a user defined in FreeIPA
3. podman run --rm -it registry.access.redhat.com/ubi9/ubi
Actual Results:  
Error: OCI runtime error: crun: error `creating` systemd unit `libpod-7f7da66ab98c7f37436e35b382d23f1181095d448bbed02850f9f26da289ab35.scope`: got `failed`


Expected Results:  
Container should run

I configured authselect's sssd with-subid feature & podman is picking up the right subids via libsubid so this is _not_ related to subid configuration as far as I can tell:

  $  podman unshare cat /proc/self/uid_map 
         0 80000001            1
         1 2147483648      65536

A local user is able to launch containers just fine, and systemd-run also works:

$ systemd-run --scope --user echo hello
Running scope as unit: run-r1ef3641abdb74d75a1060ba881bc992d.scope
hello

Comment 1 Sam Morris 2023-11-13 15:10:47 UTC
This all worked fine on Fedora 38, it's only a fresh install of Fedora 39 that has this problem.

Comment 2 Daniel Walsh 2023-11-13 16:26:01 UTC
I would figure this is something about the systemd environment not being setup correctly, not Podman.

Comment 3 Daniel Walsh 2023-11-13 16:26:29 UTC
Giuseppe WDYT?

Comment 4 Sam Morris 2023-11-14 02:05:56 UTC
After more testing - systemd-run --scope (and also podman) work fine if I log in to GNOME. But when logging in via SSH I can't use systemd-run --scope/podman.

Comment 5 Giuseppe Scrivano 2023-11-14 11:16:49 UTC
It looks like an issue with your current cgroup, and the cgroup migration fails since you don't own the current cgroup and up to the common ancestor with the target cgroup

What is your current cgroup when you login from ssh?

Comment 6 Sam Morris 2023-11-14 14:08:36 UTC
(In reply to Giuseppe Scrivano from comment #5)

I think you're right!

> What is your current cgroup when you login from ssh?

With SSH:

  $ ps -ww -o cgroup $$
  CGROUP
  0::/user.slice/user-1673000001.slice/session-29.scope

  $ ls -l /sys/fs/cgroup/user.slice/user-1673000001.slice/cgroup.procs
  -rw-r--r--. 1 root root 0 Nov 10 08:50 /sys/fs/cgroup/user.slice/user-1673000001.slice/cgroup.procs

From GNOME:

  $ ps -ww -o cgroup $$
  CGROUP
  0::/user.slice/user-1673000001.slice/user/app.slice/app-org.gnome.Terminal.slice/tmux-spawn-e5d6255b-5171-4c57-9aa9-cf6a8be411c7.scope

  $ ls -l /sys/fs/cgroup/user.slice/user-1673000001.slice/user/cgroup.procs
  -rw-r--r--. 1 sam  sam  0 Nov 10 08:50 /sys/fs/cgroup/user.slice/user-1673000001.slice/user/cgroup.procs

Comment 7 Sam Morris 2023-11-14 17:20:07 UTC
(In reply to Giuseppe Scrivano from comment #5)
> What is your current cgroup when you login from ssh?

For completeness, here's the ownership of the lowest common ancestor cgroup on the Fedora 39 system as well:

  $ ls -ld /sys/fs/cgroup/user.slice/user-1673000001.slice /sys/fs/cgroup/user.slice/user-1673000001.slice/session-29.scope
  drwxr-xr-x. 6 root root 0 Nov 14 16:59 /sys/fs/cgroup/user.slice/user-1673000001.slice
  drwxr-xr-x. 2 root root 0 Nov 14 13:46 /sys/fs/cgroup/user.slice/user-1673000001.slice/session-29.scope

I've also tested this on a Fedora 38 system and although the current cgroup, its ownership and the ownership of the lowest common ancestor with the eventual target cgroup are the same as on Fedora 39, nonetheless systemd-run --user --scope works fine!

  $ ps -ww -o cgroup $$
  CGROUP
  0::/user.slice/user-1673000001.slice/session-24.scope

  $ ls -ld /sys/fs/cgroup/user.slice/user-1673000001.slice /sys/fs/cgroup/user.slice/user-1673000001.slice/session-24.scope
  drwxr-xr-x. 5 root root 0 Nov 14 17:03 /sys/fs/cgroup/user.slice/user-1673000001.slice
  drwxr-xr-x. 2 root root 0 Nov 14 17:03 /sys/fs/cgroup/user.slice/user-1673000001.slice/session-24.scope

  $ systemd-run --user --scope bash -c 'ps ww -o cgroup $$'
  Running scope as unit: run-rdebaa044aff54b0dab7d28f60ade1768.scope
  CGROUP
  0::/user.slice/user-1673000001.slice/user/app.slice/run-rdebaa044aff54b0dab7d28f60ade1768.scope

Comment 8 Giuseppe Scrivano 2023-11-14 20:10:54 UTC
what error do you get if you try running something like `systemd-run --user --scope true` on Fedora 39?

Comment 9 Sam Morris 2023-11-15 00:04:43 UTC
That gives me:

  $ systemd-run --user --scope true
  Job failed. See "journalctl -xe" for details.

And in the journal there is logged:

  Nov 15 00:01:20 systemd[6789]: run-r083ec00e284b4d07ba3bcc51f563c13b.scope: Couldn't move process 124266 to requested cgroup '/user.slice/user-1673000001.slice/user/app.slice/run-r083ec00e284b4d07ba3bcc51f563c13b.scope' (directly or via the system bus): Input/output error
  Nov 15 00:01:20 systemd[6789]: run-r083ec00e284b4d07ba3bcc51f563c13b.scope: Failed to add PIDs to scope's control group: Permission denied
  Nov 15 00:01:20 systemd[6789]: run-r083ec00e284b4d07ba3bcc51f563c13b.scope: Failed with result 'resources'.
  Nov 15 00:01:20 systemd[6789]: Failed to start run-r083ec00e284b4d07ba3bcc51f563c13b.scope - /usr/bin/ls -l /proc/self/ns/cgroup.

Comment 10 Giuseppe Scrivano 2023-11-15 08:12:51 UTC
Thanks for trying that out.  If that command doesn't work, then there is not much Podman can do.

Reassigning to systemd for further investigation.

Comment 11 Sam Morris 2023-11-15 10:19:56 UTC
I created a new IPA user for further troubleshooting purposes. When I SSH in as this user and run 'systemd-run --user --scope echo hello', it works!

After a bit more investigating I suspect that the cgroup hierarchy ownership is a red herring. This is because the new user's systemd user instance has the same error as my original user when trying to move the process into the scope unit's cgroup:

  129330 openat(AT_FDCWD, "/sys/fs/cgroup/user.slice/user-1673050501.slice/user/app.slice/run-r2d8560040e674659a11713d2cd466595.scope/cgroup.procs", O_WRONLY|O_NOCTTY|O_CLOEXEC) = 32
  129330 fcntl(32, F_GETFL)               = 0x8001 (flags O_WRONLY|O_LARGEFILE)
  129330 write(32, "132699\n", 7)         = -1 EACCES (Permission denied)
  129330 close(32)                        = 0

(Note the pid that systemd is trying to move into the scope unit is 132699)

... however this is not preventing systemd-run from working:

  testuser1@isis:~$ systemd-run --expand-environment=no --user --scope bash -c 'ps -ww -o pid,cgroup $$'
  Running scope as unit: run-r2d8560040e674659a11713d2cd466595.scope
      PID CGROUP
   132699 0::/user.slice/user-1673050501.slice/user/app.slice/run-r2d8560040e674659a11713d2cd466595.scope

(Note the same pid, 132699 is somehow present within the scope unit anyhow)

strace reveals the same behaviour on a Debian system too, where systemd-run also works.

I'm retitling this bug accordingly.

Comment 12 Sam Morris 2023-11-15 11:40:39 UTC
Increasing my affected user's systemd instance's log level debug shows the following when I use 'systemd-run --user --scope <command>'.

I'll put a * character in front of the interesting lines:

  Nov 15 10:46:13 systemd[6789]: Bus private-bus-connection: changing state UNSET → OPENING
  Nov 15 10:46:13 systemd[6789]: sd-bus: starting bus private-bus-connection on fds 16/16 (socket:[4338239], socket:[4338239])...
  Nov 15 10:46:13 systemd[6789]: Bus private-bus-connection: changing state OPENING → AUTHENTICATING
  Nov 15 10:46:13 systemd[6789]: Accepting direct incoming connection from 133475 (systemd-run) [n/a]
  Nov 15 10:46:13 systemd[6789]: Registering bus object implementation for path=/org/freedesktop/systemd1 iface=org.freedesktop.systemd1.Manager
  Nov 15 10:46:13 systemd[6789]: Registering bus object implementation for path=/org/freedesktop/systemd1/job iface=org.freedesktop.systemd1.Job
  Nov 15 10:46:13 systemd[6789]: Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Unit
  Nov 15 10:46:13 systemd[6789]: Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Automount
  Nov 15 10:46:13 systemd[6789]: Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Device
  Nov 15 10:46:13 systemd[6789]: Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Mount
  Nov 15 10:46:13 systemd[6789]: Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Path
  Nov 15 10:46:13 systemd[6789]: Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Scope
  Nov 15 10:46:13 systemd[6789]: Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Service
  Nov 15 10:46:13 systemd[6789]: Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Slice
  Nov 15 10:46:13 systemd[6789]: Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Socket
  Nov 15 10:46:13 systemd[6789]: Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Swap
  Nov 15 10:46:13 systemd[6789]: Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Target
  Nov 15 10:46:13 systemd[6789]: Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Timer
  Nov 15 10:46:13 systemd[6789]: Registering bus object implementation for path=/org/freedesktop/LogControl1 iface=org.freedesktop.LogControl1
  Nov 15 10:46:13 systemd[6789]: Accepted new private connection.
  Nov 15 10:46:13 systemd[6789]: Bus private-bus-connection: changing state AUTHENTICATING → RUNNING
  Nov 15 10:46:13 systemd[6789]: Got message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=StartTransientUnit  cookie=1 reply_cookie=0 signature=ssa(sv)a(sa(sv)) error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: SELinux access check scon=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tcon=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=system perm=start state=enforcing function=method_start_transient_unit path=n/a cmdline=n/a: Success
  Nov 15 10:46:13 systemd[6789]: run-r8f7d623232a3493eba8fbfa8dc35863c.scope: Failed to load configuration: No such file or directory
  Nov 15 10:46:13 systemd[6789]: SELinux access check scon=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tcon=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=system perm=start state=enforcing function=bus_unit_queue_job path=/run/user/1673000001/systemd/transient/run-r8f7d623232a3493eba8fbfa8dc35863c.scope cmdline=n/a: Success
  Nov 15 10:46:13 systemd[6789]: run-r8f7d623232a3493eba8fbfa8dc35863c.scope: Trying to enqueue job run-r8f7d623232a3493eba8fbfa8dc35863c.scope/start/fail
  Nov 15 10:46:13 systemd[6789]: systemd-oomd.service: Cannot add dependency job, ignoring: Unit systemd-oomd.service not found.
  Nov 15 10:46:13 systemd[6789]: run-r8f7d623232a3493eba8fbfa8dc35863c.scope: Installed new job run-r8f7d623232a3493eba8fbfa8dc35863c.scope/start as 5704
  Nov 15 10:46:13 systemd[6789]: run-r8f7d623232a3493eba8fbfa8dc35863c.scope: Enqueued job run-r8f7d623232a3493eba8fbfa8dc35863c.scope/start as 5704
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=UnitNew cookie=1 reply_cookie=0 signature=so error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=UnitNew cookie=1172 reply_cookie=0 signature=so error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobNew cookie=2 reply_cookie=0 signature=uos error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobNew cookie=1173 reply_cookie=0 signature=uos error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: Sent message type=method_return sender=org.freedesktop.systemd1 destination=n/a path=n/a interface=n/a member=n/a cookie=3 reply_cookie=1 signature=o error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1/unit/shutdown_2etarget interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=4 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/shutdown_2etarget interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1174 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1/unit/app_2eslice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=5 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/app_2eslice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1175 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
* Nov 15 10:46:13 systemd[6789]: Failed to read pids.max attribute of root cgroup, ignoring: No data available
* Nov 15 10:46:13 systemd[6789]: run-r8f7d623232a3493eba8fbfa8dc35863c.scope: Couldn't move process 133475 to directly requested cgroup '/user.slice/user-1673000001.slice/user/app.slice/run-r8f7d623232a3493eba8fbfa8dc35863c.scope': Permission denied
* Nov 15 10:46:13 systemd[6789]: run-r8f7d623232a3493eba8fbfa8dc35863c.scope: Couldn't move process 133475 to requested cgroup '/user.slice/user-1673000001.slice/user/app.slice/run-r8f7d623232a3493eba8fbfa8dc35863c.scope' (directly or via the system bus): Input/output error
  Nov 15 10:46:13 systemd[6789]: run-r8f7d623232a3493eba8fbfa8dc35863c.scope: Failed to add PIDs to scope's control group: Permission denied
  Nov 15 10:46:13 systemd[6789]: run-r8f7d623232a3493eba8fbfa8dc35863c.scope: Failed with result 'resources'.
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1/unit/run_2dr8f7d623232a3493eba8fbfa8dc35863c_2escope interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=6 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1/unit/run_2dr8f7d623232a3493eba8fbfa8dc35863c_2escope interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=7 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/run_2dr8f7d623232a3493eba8fbfa8dc35863c_2escope interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1176 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/run_2dr8f7d623232a3493eba8fbfa8dc35863c_2escope interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1177 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: run-r8f7d623232a3493eba8fbfa8dc35863c.scope changed dead -> failed
  Nov 15 10:46:13 systemd[6789]: varlink: Sending message: {"method":"io.systemd.oom.ReportManagedOOMCGroups","parameters":{"cgroups":[{"mode":"auto","path":"/user.slice/user-1673000001.slice/user/app.slice/run-r8f7d623232a3493eba8fbfa8dc35863c.scope","property":"ManagedOOMSwap"},{"mode":"auto","path":"/user.slice/user-1673000001.slice/user/app.slice/run-r8f7d623232a3493eba8fbfa8dc35863c.scope","property":"ManagedOOMMemoryPressure"}]},"oneway":true}
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=8 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1178 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: run-r8f7d623232a3493eba8fbfa8dc35863c.scope: Job 5704 run-r8f7d623232a3493eba8fbfa8dc35863c.scope/start finished, result=failed
  Nov 15 10:46:13 systemd[6789]: Failed to start run-r8f7d623232a3493eba8fbfa8dc35863c.scope - /usr/bin/bash -c ps -ww -o cgroup $$.
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1/unit/run_2dr8f7d623232a3493eba8fbfa8dc35863c_2escope interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=9 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1/unit/run_2dr8f7d623232a3493eba8fbfa8dc35863c_2escope interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=10 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/run_2dr8f7d623232a3493eba8fbfa8dc35863c_2escope interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1179 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/run_2dr8f7d623232a3493eba8fbfa8dc35863c_2escope interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1180 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobRemoved cookie=11 reply_cookie=0 signature=uoss error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobRemoved cookie=1181 reply_cookie=0 signature=uoss error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: run-r8f7d623232a3493eba8fbfa8dc35863c.scope: Unit entered failed state.
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1/unit/run_2dr8f7d623232a3493eba8fbfa8dc35863c_2escope interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=12 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: Bus private-bus-connection: changing state RUNNING → CLOSING
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/run_2dr8f7d623232a3493eba8fbfa8dc35863c_2escope interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1182 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/run_2dr8f7d623232a3493eba8fbfa8dc35863c_2escope interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1183 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
  Nov 15 10:46:13 systemd[6789]: run-r8f7d623232a3493eba8fbfa8dc35863c.scope: Failed to send unit change signal for run-r8f7d623232a3493eba8fbfa8dc35863c.scope: Connection reset by peer
  Nov 15 10:46:13 systemd[6789]: Bus private-bus-connection: changing state CLOSING → CLOSED
  Nov 15 10:46:13 systemd[6789]: Got disconnect on private connection.

There's no log message "Failed to read pids.max attribute of root cgroup, ignoring: No data available" for the unaffected user, nor are either of the "Couldn't move process" messages logged.

Comment 13 Sam Morris 2023-11-15 12:07:47 UTC
Digging into systemd's code here... it looks like the failure to move the process into the scope untit's cgroup is expected. Systemd then attempts to retry this but asking pid 1 to do it via D-Bus and it's this operation that fails for my affected user:

(gdb) where
#0  unit_attach_pid_to_cgroup_via_bus (u=u@entry=0x55808cfe9c70, pid=pid@entry=134554, 
    suffix_path=suffix_path@entry=0x0) at ../src/core/cgroup.c:2403
#1  0x00007f9591c7b21d in unit_attach_pids_to_cgroup (u=0x55808cfe9c70, pids=0x55808d2a3f38, suffix_path=0x0)
    at ../src/core/cgroup.c:2487
#2  0x00007f9591d24a27 in scope_enter_running (s=0x55808cfe9c70) at ../src/core/scope.c:447
#3  0x00007f9591ce217f in job_perform_on_unit (j=j@entry=0x7ffeb6cf8128) at ../src/core/job.c:855
#4  0x00007f9591ce434c in job_run_and_invalidate (j=<optimized out>) at ../src/core/job.c:922
#5  0x00007f9591d085cd in manager_dispatch_run_queue (source=<optimized out>, userdata=<optimized out>)
    at ../src/core/manager.c:2420
#6  manager_dispatch_run_queue (source=<optimized out>, userdata=0x55808d1610d0) at ../src/core/manager.c:2410
#7  0x00007f9591abdc5d in source_dispatch (s=s@entry=0x55808d161b50) at ../src/libsystemd/sd-event/sd-event.c:4227
#8  0x00007f9591abe29d in sd_event_dispatch (e=<optimized out>, e@entry=0x55808d161880)
    at ../src/libsystemd/sd-event/sd-event.c:4808
#9  0x00007f9591ac10f8 in sd_event_run (e=<optimized out>, timeout=18446744073709551615)
    at ../src/libsystemd/sd-event/sd-event.c:4869
#10 0x00007f9591d0b135 in manager_loop (m=m@entry=0x55808d1610d0) at ../src/core/manager.c:3258
#11 0x000055808c319650 in invoke_main_loop (ret_error_message=0x7ffeb6cf83c8, 
    ret_switch_root_init=<synthetic pointer>, ret_switch_root_dir=<synthetic pointer>, ret_fds=0x7ffeb6cf83b8, 
    ret_retval=<synthetic pointer>, saved_rlimit_memlock=0x7ffeb6cf83e0, saved_rlimit_nofile=0x7ffeb6cf83f0, 
    m=0x55808d1610d0) at ../src/core/main.c:2012
#12 main (argc=2, argv=<optimized out>) at ../src/core/main.c:3178

(gdb) l 2403
2398	        assert(u);
2399	
2400	        if (MANAGER_IS_SYSTEM(u->manager))
2401	                return -EINVAL;
2402	
2403	        if (!u->manager->system_bus)
2404	                return -EIO;
2405	
2406	        if (!u->cgroup_path)
2407	                return -EINVAL;

(gdb) p u->manager->system_bus 
$25 = (sd_bus *) 0x0

u->manager->system_bus is NULL therefore unit_attach_pid_to_cgroup_via_bus returns -5.

When I try the same as an unaffected user, u->manager->system_bus is set as expected.

Comment 14 Sam Morris 2023-11-15 13:05:15 UTC
(Noting the following for context)

> Systemd then attempts to retry this

Specifically, the affected user's systemd instance tries to retry the move operation, by asking the system systemd instance to do it. But because u->manager->system_bus is unexpectedly NULL, it gives up before it asks.

The super weird thing is still that if systemd-run is run from within a graphical session, u->manager->system_bus is non-NULL and everything works. So the problem probably isn't within the user's systemd instance itself.

I guess when the user systemd instance creates the scope unit, it's examining some properties of the client process that called it (systemd-run) and there's a difference between the the process when launched from the ssh session vs the gdm session that causes this problem.

By adding --machine=@.host to systemd-run I'm able to use 'busctl monitor --user' to capture the traffic to/from the user systemd instance. I'll upload both the output when systemd-run is launched from the ssh session and the gdm session in case anyone can see a significant difference.

What next? I think a systemd maintainer can probably suggest the next steps. I'll wait to see if anyone responds here, and file an issue upstream if not.

Comment 15 Sam Morris 2023-11-15 13:06:45 UTC
Created attachment 1999557 [details]
'busctl monitor org.freedesktop.systemd1' when called by systemd-run launched form an gdm session

Comment 16 Sam Morris 2023-11-15 13:09:16 UTC
Created attachment 1999558 [details]
'busctl monitor org.freedesktop.systemd1' when called by systemd-run launched from an ssh session

Comment 17 Sam Morris 2023-11-15 14:46:54 UTC
I rebooted and can no longer reproduce. :(

I'll re-open if this happens again.


Note You need to log in before you can comment on or make changes to this bug.