Bug 1413075 - Cannot create scope with user service manager
Summary: Cannot create scope with user service manager
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-01-13 14:53 UTC by Lukas Slebodnik
Modified: 2017-02-01 13:01 UTC (History)

Fixed In Version: systemd-232-11.fc26
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-02-01 08:30:16 UTC
Type: Bug


Description Lukas Slebodnik 2017-01-13 14:53:27 UTC
Description of problem:
I tried to create a scope in my user session so that I can kill it if a process is left over. (This is part of development on my custom project.)

There are some SELinux denials (BZ1413047), but those can be "solved" by booting in permissive mode. However, even in permissive mode systemd-run failed for me.

Version-Release number of selected component (if applicable):
systemd-232-6.fc26.x86_64

How reproducible:
Deterministic

Steps to Reproduce:
1. #boot machine in permissive mode to work around BZ1413047
2. #login as ordinary user
3. systemd-run --user --scope --unit=test-build -- bash

Actual results:
Job for test-build.scope failed.
See "systemctl status test-build.scope" and "journalctl -xe" for details.

Expected results:
bash is executed

Additional info:
sh$ systemctl status --user test-build.scope
● test-build.scope - /usr/bin/bash
   Loaded: loaded (/run/user/1000/systemd/transient/test-build.scope; transient; vendor preset: enabled)
Transient: yes
   Active: failed (Result: resources)

Jan 13 15:50:44 host.example.com systemd[1748]: test-build.scope: Failed to add PIDs to scope's control group: Permission denied
Jan 13 15:50:44 host.example.com systemd[1748]: Failed to start /usr/bin/bash.
Jan 13 15:50:44 host.example.com systemd[1748]: test-build.scope: Unit entered failed state.

Comment 1 Lukas Slebodnik 2017-01-13 17:43:30 UTC
I do not have a problem (even in enforcing mode) with the older kernel 4.8.15-300.fc25.x86_64.

The problem occurs only with 4.8.16-300.fc25.x86_64.
Yes, the kernel is from f25 and not from rawhide.

Comment 2 Lukas Slebodnik 2017-01-13 20:02:54 UTC
The same problem occurs with the rawhide kernel:

sh$ systemd-run --user --scope --unit=test-build -- bash
Job for test-build.scope failed.
See "systemctl status test-build.scope" and "journalctl -xe" for details.

sh$ uname -a
Linux vm-118.idm.lab.eng.brq.redhat.com 4.10.0-0.rc3.git1.1.fc26.x86_64 #1 SMP Tue Jan 10 15:32:37 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

sh$ rpm -q systemd
systemd-232-6.fc26.x86_64

Comment 3 Zbigniew Jędrzejewski-Szmek 2017-01-17 20:39:26 UTC
> Failed to add PIDs to scope's control group: Permission denied

Unfortunately I cannot reproduce this. Can you try the following:
- in one terminal window: sudo strace -f -p$(systemctl --value -p MainPID show user@1000) -s100
- in another, run the systemd-run command.

Comment 4 Lukas Slebodnik 2017-01-17 21:11:35 UTC
I can reproduce it on the latest rawhide with SELinux in permissive mode (since boot)
on two machines (laptop and VM).

As I mentioned in a previous comment, my current workaround is to use the older kernel from f25.

https://paste.fedoraproject.org/529144/87311148/

And the related part of the journal log:

Jan 17 22:08:21 vm-119.example.com polkitd[691]: Registered Authentication Agent for unix-process:1714:146520 (system bus name :1.20 [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.utf8)
Jan 17 22:08:21 vm-119.example.com systemd[1178]: test-build3.scope: Failed to add PIDs to scope's control group: Permission denied
Jan 17 22:08:21 vm-119.example.com systemd[1178]: Failed to start /usr/bin/bash.
Jan 17 22:08:21 vm-119.example.com systemd[1178]: test-build3.scope: Unit entered failed state.
Jan 17 22:08:21 vm-119.example.com polkitd[691]: Unregistered Authentication Agent for unix-process:1714:146520 (system bus name :1.20, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.utf8) (disconnected from bus)

And thank you very much for looking into this bug.

Comment 5 Zbigniew Jędrzejewski-Szmek 2017-01-17 22:49:11 UTC
1178  open("/sys/fs/cgroup/systemd/user.slice/user-20728.slice/user@20728.service/test-build3.scope/cgroup.procs", O_WRONLY|O_NOCTTY|O_CLOEXEC) = 25
1178  write(25, "1714\n", 5)            = -1 EACCES (Permission denied)

Looks like selinux issue again.

Just to verify: you are not running with unified cgroup hierarchy? Can you paste your /proc/cmdline?
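
When the strace log is long, the failing call can be picked out by filtering on the error suffix seen in the trace above. This is only a minimal sketch: the function name `failed_eacces` is made up for illustration, and it assumes strace's default output format, in which a failed call ends with `= -1 EACCES (Permission denied)`.

```sh
# Print only the syscalls that failed with EACCES.
# Reads a saved strace log from the given file(s), or from stdin if none.
# (failed_eacces is a hypothetical helper name, not part of strace.)
failed_eacces() {
    grep -E '= -1 EACCES \(Permission denied\)$' "$@"
}
```

For example, `failed_eacces strace.log` would list the failing write() shown above, assuming the trace was saved to a file named strace.log.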

Comment 6 Lukas Slebodnik 2017-01-18 00:20:02 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #5)
> 1178  open("/sys/fs/cgroup/systemd/user.slice/user-20728.slice/user@20728.service/test-build3.scope/cgroup.procs", O_WRONLY|O_NOCTTY|O_CLOEXEC) = 25
> 1178  write(25, "1714\n", 5)            = -1 EACCES (Permission denied)
> 
> Looks like selinux issue again.
> 
But it is in permissive mode.

> Just to verify: you are not running with unified cgroup hierarchy? Can you
> paste your /proc/cmdline?

[root@vm-119 ~]# namei -mo /sys/fs/cgroup/systemd/user.slice/user-20728.slice/user@20728.service/test-build3.scope/cgroup.procs
f: /sys/fs/cgroup/systemd/user.slice/user-20728.slice/user@20728.service/test-build3.scope/cgroup.procs
 dr-xr-xr-x root     root        /
 dr-xr-xr-x root     root        sys
 drwxr-xr-x root     root        fs
 drwxr-xr-x root     root        cgroup
 dr-xr-xr-x root     root        systemd
 drwxr-xr-x root     root        user.slice
 drwxr-xr-x root     root        user-20728.slice
 drwxr-xr-x lslebodn dev-lab     user@20728.service
                                 test-build3.scope - No such file or directory

[root@vm-119 ~]# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-4.10.0-0.rc3.git4.1.fc26.x86_64 root=/dev/mapper/vg_root-lv_root ro rd.lvm.lv=vg_root/lv_root rd.lvm.lv=vg_root/lv_swap rhgb quiet LANG=en_US.UTF-8 dhcpclass=lab-vms

Comment 7 Lukas Slebodnik 2017-01-18 11:43:07 UTC
And FYI,
audit log did not contain any new messages either.

Comment 8 Lukas Slebodnik 2017-01-18 14:35:42 UTC
(In reply to Lukas Slebodnik from comment #6)
> (In reply to Zbigniew Jędrzejewski-Szmek from comment #5)
> > 1178  open("/sys/fs/cgroup/systemd/user.slice/user-20728.slice/user@20728.service/test-build3.scope/cgroup.procs", O_WRONLY|O_NOCTTY|O_CLOEXEC) = 25
> > 1178  write(25, "1714\n", 5)            = -1 EACCES (Permission denied)
> > 
> > Looks like selinux issue again.
> > 
> But it is in permissive mode.
> 
> > Just to verify: you are not running with unified cgroup hierarchy? Can you
> > paste your /proc/cmdline?
> 
> [root@vm-119 ~]# namei -mo /sys/fs/cgroup/systemd/user.slice/user-20728.slice/user@20728.service/test-build3.scope/cgroup.procs
> f: /sys/fs/cgroup/systemd/user.slice/user-20728.slice/user@20728.service/test-build3.scope/cgroup.procs
>  dr-xr-xr-x root     root        /
>  dr-xr-xr-x root     root        sys
>  drwxr-xr-x root     root        fs
>  drwxr-xr-x root     root        cgroup
>  dr-xr-xr-x root     root        systemd
>  drwxr-xr-x root     root        user.slice
>  drwxr-xr-x root     root        user-20728.slice
>  drwxr-xr-x lslebodn dev-lab     user@20728.service
>                                  test-build3.scope - No such file or directory
> 

I broke the machine during some testing, so I created a new one.
And there are wrong permissions on user.service.

sh-4.4$ uname -a
Linux vm-140-014.idm.lab.eng.brq.redhat.com 4.10.0-0.rc4.git1.1.fc26.x86_64 #1 SMP Tue Jan 17 22:50:08 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

sh-4.4$ rpm -q systemd
systemd-232-7.fc26.x86_64

sh-4.4$ systemd-run --user --scope --unit=test -- bash
Job for test.scope failed.
See "systemctl status test.scope" and "journalctl -xe" for details.


sh-4.4$ namei -mo /sys/fs/cgroup/pids/user.slice/user-20728.slice/user@20728.service/test.scope
f: /sys/fs/cgroup/pids/user.slice/user-20728.slice/user@20728.service/test1.scope
 dr-xr-xr-x root root /
 dr-xr-xr-x root root sys
 drwxr-xr-x root root fs
 drwxr-xr-x root root cgroup
 dr-xr-xr-x root root pids
 drwxr-xr-x root root user.slice
 drwxr-xr-x root root user-20728.slice
 drwxr-xr-x root root user@20728.service
                      test.scope - No such file or directory

Comment 9 Zbigniew Jędrzejewski-Szmek 2017-01-18 19:37:46 UTC
(In reply to Lukas Slebodnik from comment #6)
> But it is in permissive mode.

Of course. Too many bugs open ;)

(In reply to Lukas Slebodnik from comment #8)
> I broke the machine due to some testing. Therefore I created a new one.
> And there are wrong permissions on user.service.

Right, but that's on the 'pids' controller. Permissions on the 'systemd' "controller"
are probably OK. I have the same permissions (root:root) on the
/sys/fs/cgroup/pids/user.slice/user-*.slice/user@*.service.

And in comment #5 it is failing on
/sys/fs/cgroup/systemd/user.slice/user-20728.slice/user@20728.service/test-build3.scope/cgroup.procs,
which is the 'systemd' controller.

https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt
says:
> Note: Due to some restrictions enforced by some cgroup subsystems, moving
> a process to another cgroup can fail.

Do you have any non-default cgroup configuration (e.g. controllers, or systemd cgroup-related settings)?
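
To compare ownership of the same cgroup across all mounted hierarchies (the 'pids' versus 'systemd' difference discussed above), something like the following could be used. It is a sketch under assumptions: `cgroup_owners` is a hypothetical helper name, it relies on GNU `stat -c`, and it takes the hierarchy root as a parameter so it can be pointed at /sys/fs/cgroup or at a test directory.

```sh
# Print "owner:group  path" for one cgroup path in every hierarchy under a root.
# $1: hierarchy root (normally /sys/fs/cgroup)
# $2: cgroup path relative to each hierarchy
# (cgroup_owners is a hypothetical helper; requires GNU stat.)
cgroup_owners() {
    for hier in "$1"/*/; do
        # Skip hierarchies in which this cgroup does not exist.
        if [ -d "$hier$2" ]; then
            printf '%s  %s\n' "$(stat -c '%U:%G' "$hier$2")" "$hier$2"
        fi
    done
}
```

Usage would look like `cgroup_owners /sys/fs/cgroup user.slice/user-1000.slice/user@1000.service`. Symlinked controller names (such as cpu pointing to cpu,cpuacct) would be listed twice.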

Comment 10 Lukas Slebodnik 2017-01-19 06:55:56 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #9)
> https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt
> says:
> > Note: Due to some restrictions enforced by some cgroup subsystems, moving
> > a process to another cgroup can fail.
> 
> Do you have any non-default cgroup configuration (e.g. controllers, or
> systemd cgroup-related settings)?

Everything cgroup-related is handled by systemd; I didn't configure anything special. Do you have any idea why it works with the older kernel, 4.8.15-300.fc25.x86_64?

Comment 11 Zbigniew Jędrzejewski-Szmek 2017-01-19 12:40:36 UTC
Yeah. In the meantime I spoke with Lennart, and he says there was a kernel change to the rules about who can write that attribute and when, which was then reverted later on. He didn't remember any details, so I'll have to track this down.

Comment 12 Lukas Slebodnik 2017-01-20 08:24:37 UTC
Zbigniew,
do you have any idea why you cannot reproduce the bug yourself?

Comment 13 Zbigniew Jędrzejewski-Szmek 2017-01-24 13:27:49 UTC
No. Please don't use needinfo for mundane questions.

Comment 14 Lukas Slebodnik 2017-01-24 13:43:12 UTC
I would not use needinfo if there were a willingness to debug/troubleshoot this bug.
Or did I miss some request for data?

Comment 15 Zbigniew Jędrzejewski-Szmek 2017-01-24 14:10:48 UTC
Sorry for the delay. All that I currently know is stated in comment #11.

Comment 16 Lukas Slebodnik 2017-01-24 14:36:58 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #15)
> Sorry for the delay. All that I currently know is stated in comment #11.

Sure, but that does not explain why it works for you.

Comment 17 Zbigniew Jędrzejewski-Szmek 2017-01-24 22:23:33 UTC
There's nothing in the list of patches 4.8.15..4.8.16 that could explain the difference. There's also no change in the package in Fedora, apart from the update of upstream version.

Another observation is that for any given boot, this either works, or doesn't work, repeatably. So it doesn't seem to be a race condition at the time of scope creation.

systemd-229-16.fc24.x86_64
4.8.15-200.fc24.x86_64 → OK

systemd-232-7.fc26.x86_64
kernel-core-4.10.0-0.rc3.git4.1.fc26.x86_64 → OK

systemd-232-7.fc26.x86_64
kernel-core-4.8.16-300.fc25.x86_64 → broken

systemd-232-7.fc26.x86_64
kernel-core-4.8.15-300.fc25.x86_64 → broken

systemd-232-10.fc26.x86_64
kernel-core-4.10.0-0.rc3.git4.1.fc26.x86_64 → OK, OK

systemd-232-10.fc26.x86_64
kernel-core-4.10.0-0.rc5.git0.1.fc26.x86_64 → broken, broken, broken

hm, good:
$ cat /proc/4380/cgroup|sort -g
1:name=systemd:/user.slice/user-1001.slice/user@1001.service/init.scope
2:devices:/user.slice
3:cpuset:/
4:blkio:/user.slice
5:net_cls,net_prio:/
6:hugetlb:/
7:pids:/user.slice/user-1001.slice/user@1001.service
8:memory:/user.slice
9:perf_event:/
10:cpu,cpuacct:/user.slice
11:freezer:/

bad:
$ cat /proc/1146/cgroup |sort -g
0::/user.slice/user-1000.slice/user@1000.service/init.scope
1:cpu,cpuacct:/user.slice/user-1000.slice
2:freezer:/
3:pids:/user.slice/user-1000.slice/user@1000.service
4:memory:/user.slice/user-1000.slice
5:perf_event:/
6:blkio:/user.slice/user-1000.slice
7:net_cls,net_prio:/
8:cpuset:/
9:devices:/user.slice/user-1000.slice
10:hugetlb:/

Oh, this depends on systemd version in the initramfs.
current git → OK
systemd-232-7, -10 → bad

Hence, it's not really kernel-version-dependent.
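
The "good" and "bad" listings above differ in their first line: the good one has a `name=systemd` entry with a non-zero hierarchy ID, while the bad one has a `0::` entry, which is the format the kernel uses for the cgroup v2 hierarchy. A rough way to classify such a listing is sketched below; `cgroup_mode` and the labels it prints are my own naming, not systemd's, and the check is a heuristic based only on those two shapes.

```sh
# Classify the contents of /proc/PID/cgroup read from stdin.
# Prints one of: hybrid, unified, legacy, unknown.
# (cgroup_mode and its labels are hypothetical, chosen for illustration.)
cgroup_mode() {
    awk -F: '
        $1 == "0" && $2 == "" { v2 = 1; next }   # "0::<path>" is the cgroup v2 entry
        $1 ~ /^[0-9]+$/       { v1 = 1 }         # any other numbered line is a v1 hierarchy
        END {
            if (v2 && v1)  print "hybrid"
            else if (v2)   print "unified"
            else if (v1)   print "legacy"
            else           print "unknown"
        }'
}
```

Running `cgroup_mode < /proc/self/cgroup` on the two systems above should print `legacy` for the good case and `hybrid` for the bad one.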

Comment 18 Zbigniew Jędrzejewski-Szmek 2017-01-28 02:16:47 UTC
I think it's an interaction with the unified cgroup hierarchy being used for the systemd tree, i.e. the so-called hybrid mode. With systemd-232 we defaulted to it, then it got reverted [https://github.com/systemd/systemd/commit/843d5baf6a]. I'll rebuild with that commit. Let's see if that fixes your issue.

Comment 19 Lukas Slebodnik 2017-01-30 12:20:02 UTC
Feel free to provide just a scratch build. And I assume I need to regenerate initramfs with dracut.

Comment 20 Zbigniew Jędrzejewski-Szmek 2017-01-30 14:32:22 UTC
Should be fixed in rawhide with systemd-232-11.fc26, please test.

Comment 21 Lukas Slebodnik 2017-01-30 16:49:06 UTC
sh# koji download-build --arch=x86_64 systemd-232-11.fc26
sh# dnf update *.rpm
sh# dracut --regenerate-all
sh# reboot

Then login as ordinary user:
[lslebodn@vm-118]$ systemd-run --user --scope --unit=test -- bash
Job for test.scope failed.
See "systemctl status test.scope" and "journalctl -xe" for details.
[lslebodn@vm-118]$ getenforce 
Permissive

Did I do something wrong?

Comment 22 Lukas Slebodnik 2017-01-30 16:54:12 UTC
And I can still see the problem with EACCES:

mkdir("/sys/fs/cgroup/pids/user.slice/user-20728.slice/user@20728.service/test3.scope", 0755) = -1 EACCES (Permission denied)

And strace output:
http://paste.fedoraproject.org/541176/14857952/

Comment 23 Zbigniew Jędrzejewski-Szmek 2017-02-01 06:27:51 UTC
I'm still convinced that this is related to cgroup hierarchy setup.
I cannot reproduce this with the legacy cgroup hierarchy, but can easily with the unified one.

What does 'mount|grep sys' say?

Comment 24 Lukas Slebodnik 2017-02-01 08:30:16 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #23)
> I'm still convinced that this is related to cgroup hierarchy setup.
> I cannot reproduce this with the legacy cgroup hierarchy, but can easily
> with the unified one.
>
I removed the kernel 4.10.0-0.rc6.git0.1.fc26.x86_64 and installed it one more time. It works in permissive mode after this ritual dance :-)

 
> What does 'mount|grep sys' say?

[root@vm-118 ~]# mount | grep sys
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,seclabel,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime,seclabel)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
configfs on /sys/kernel/config type configfs (rw,relatime)
selinuxfs on /sys/fs/selinux type selinuxfs (rw,relatime)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=39,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=13074)
debugfs on /sys/kernel/debug type debugfs (rw,relatime,seclabel)

Comment 25 Zbigniew Jędrzejewski-Szmek 2017-02-01 13:01:58 UTC
I'm glad the ritual sacrifice helped ;)

