Description of problem:
I am trying to create a scope in my user session so that I can kill it if it leaves processes behind (this is part of development on a custom project). There are some SELinux denials (BZ1413047), but those can be "solved" by booting in permissive mode. However, systemd-run fails for me even in permissive mode.

Version-Release number of selected component (if applicable):
systemd-232-6.fc26.x86_64

How reproducible:
Deterministic

Steps to Reproduce:
1. Boot the machine in permissive mode to work around BZ1413047
2. Log in as an ordinary user
3. systemd-run --user --scope --unit=test-build -- bash

Actual results:
Job for test-build.scope failed.
See "systemctl status test-build.scope" and "journalctl -xe" for details.

Expected results:
bash is executed

Additional info:
sh$ systemctl status --user test-build.scope
● test-build.scope - /usr/bin/bash
   Loaded: loaded (/run/user/1000/systemd/transient/test-build.scope; transient; vendor preset: enabled)
Transient: yes
   Active: failed (Result: resources)

Jan 13 15:50:44 host.example.com systemd[1748]: test-build.scope: Failed to add PIDs to scope's control group: Permission denied
Jan 13 15:50:44 host.example.com systemd[1748]: Failed to start /usr/bin/bash.
Jan 13 15:50:44 host.example.com systemd[1748]: test-build.scope: Unit entered failed state.
I do not have the problem (even in enforcing mode) with the older kernel 4.8.15-300.fc25.x86_64.
The problem occurs only with 4.8.16-300.fc25.x86_64.
Yes, the kernel is from f25 and not from rawhide.
And the same problem occurs with the rawhide kernel:

sh$ systemd-run --user --scope --unit=test-build -- bash
Job for test-build.scope failed.
See "systemctl status test-build.scope" and "journalctl -xe" for details.

sh$ uname -a
Linux vm-118.idm.lab.eng.brq.redhat.com 4.10.0-0.rc3.git1.1.fc26.x86_64 #1 SMP Tue Jan 10 15:32:37 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

sh$ rpm -q systemd
systemd-232-6.fc26.x86_64
> Failed to add PIDs to scope's control group: Permission denied

Unfortunately I cannot reproduce this. Can you try the following:
- in one terminal window, run:
  sudo strace -f -p$(systemctl --value -p MainPID show user@1000) -s100
- in another, run the systemd-run command.
I can reproduce it on the latest rawhide with SELinux in permissive mode (since boot) on two machines (a laptop and a VM). As I mentioned in a previous comment, my current workaround is to use the older kernel from f25.

https://paste.fedoraproject.org/529144/87311148/

And the related part of the journal log:

Jan 17 22:08:21 vm-119.example.com polkitd[691]: Registered Authentication Agent for unix-process:1714:146520 (system bus name :1.20 [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.utf8)
Jan 17 22:08:21 vm-119.example.com systemd[1178]: test-build3.scope: Failed to add PIDs to scope's control group: Permission denied
Jan 17 22:08:21 vm-119.example.com systemd[1178]: Failed to start /usr/bin/bash.
Jan 17 22:08:21 vm-119.example.com systemd[1178]: test-build3.scope: Unit entered failed state.
Jan 17 22:08:21 vm-119.example.com polkitd[691]: Unregistered Authentication Agent for unix-process:1714:146520 (system bus name :1.20, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.utf8) (disconnected from bus)

And thank you very much for looking into this bug.
1178 open("/sys/fs/cgroup/systemd/user.slice/user-20728.slice/user/test-build3.scope/cgroup.procs", O_WRONLY|O_NOCTTY|O_CLOEXEC) = 25
1178 write(25, "1714\n", 5) = -1 EACCES (Permission denied)

Looks like an SELinux issue again.

Just to verify: you are not running with the unified cgroup hierarchy? Can you paste your /proc/cmdline?
(In reply to Zbigniew Jędrzejewski-Szmek from comment #5)
> 1178 open("/sys/fs/cgroup/systemd/user.slice/user-20728.slice/user/test-build3.scope/cgroup.procs", O_WRONLY|O_NOCTTY|O_CLOEXEC) = 25
> 1178 write(25, "1714\n", 5) = -1 EACCES (Permission denied)
>
> Looks like selinux issue again.
>
But it is in permissive mode.

> Just to verify: you are not running with unified cgroup hierarchy? Can you
> paste your /proc/cmdline?

[root@vm-119 ~]# namei -mo /sys/fs/cgroup/systemd/user.slice/user-20728.slice/user/test-build3.scope/cgroup.procs
f: /sys/fs/cgroup/systemd/user.slice/user-20728.slice/user/test-build3.scope/cgroup.procs
 dr-xr-xr-x root     root    /
 dr-xr-xr-x root     root    sys
 drwxr-xr-x root     root    fs
 drwxr-xr-x root     root    cgroup
 dr-xr-xr-x root     root    systemd
 drwxr-xr-x root     root    user.slice
 drwxr-xr-x root     root    user-20728.slice
 drwxr-xr-x lslebodn dev-lab user
                             test-build3.scope - No such file or directory

[root@vm-119 ~]# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-4.10.0-0.rc3.git4.1.fc26.x86_64 root=/dev/mapper/vg_root-lv_root ro rd.lvm.lv=vg_root/lv_root rd.lvm.lv=vg_root/lv_swap rhgb quiet LANG=en_US.UTF-8 dhcpclass=lab-vms
And FYI, audit log did not contain any new messages either.
(In reply to Lukas Slebodnik from comment #6)
> (In reply to Zbigniew Jędrzejewski-Szmek from comment #5)
> > 1178 open("/sys/fs/cgroup/systemd/user.slice/user-20728.slice/user/test-build3.scope/cgroup.procs", O_WRONLY|O_NOCTTY|O_CLOEXEC) = 25
> > 1178 write(25, "1714\n", 5) = -1 EACCES (Permission denied)
> >
> > Looks like selinux issue again.
> >
> But it is in permissive mode.
>
> > Just to verify: you are not running with unified cgroup hierarchy? Can you
> > paste your /proc/cmdline?
>
> [root@vm-119 ~]# namei -mo /sys/fs/cgroup/systemd/user.slice/user-20728.slice/user/test-build3.scope/cgroup.procs
> f: /sys/fs/cgroup/systemd/user.slice/user-20728.slice/user/test-build3.scope/cgroup.procs
>  dr-xr-xr-x root     root    /
>  dr-xr-xr-x root     root    sys
>  drwxr-xr-x root     root    fs
>  drwxr-xr-x root     root    cgroup
>  dr-xr-xr-x root     root    systemd
>  drwxr-xr-x root     root    user.slice
>  drwxr-xr-x root     root    user-20728.slice
>  drwxr-xr-x lslebodn dev-lab user
>                              test-build3.scope - No such file or directory
>

I broke the machine during some testing, therefore I created a new one.
And there are wrong permissions on user.service.

sh-4.4$ uname -a
Linux vm-140-014.idm.lab.eng.brq.redhat.com 4.10.0-0.rc4.git1.1.fc26.x86_64 #1 SMP Tue Jan 17 22:50:08 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

sh-4.4$ rpm -q systemd
systemd-232-7.fc26.x86_64

sh-4.4$ systemd-run --user --scope --unit=test -- bash
Job for test.scope failed.
See "systemctl status test.scope" and "journalctl -xe" for details.

sh-4.4$ namei -mo /sys/fs/cgroup/pids/user.slice/user-20728.slice/user/test.scope
f: /sys/fs/cgroup/pids/user.slice/user-20728.slice/user/test1.scope
 dr-xr-xr-x root root /
 dr-xr-xr-x root root sys
 drwxr-xr-x root root fs
 drwxr-xr-x root root cgroup
 dr-xr-xr-x root root pids
 drwxr-xr-x root root user.slice
 drwxr-xr-x root root user-20728.slice
 drwxr-xr-x root root user
                      test.scope - No such file or directory
(In reply to Lukas Slebodnik from comment #6)
> But it is in permissive mode.
Of course. Too many bugs open ;)

(In reply to Lukas Slebodnik from comment #6)
> I broke the machine due to some testing. There fore I created new one.
> And there are wrong permissions on user.service.

Right, but that's on the 'pids' controller. Permissions on the 'systemd' "controller" are probably OK. I have the same permissions (root:root) on /sys/fs/cgroup/pids/user.slice/user-*.slice/user@*.service. And in comment #5 it is failing on /sys/fs/cgroup/systemd/user.slice/user-20728.slice/user/test-build3.scope/cgroup.procs, which is the 'systemd' controller.

https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt says:
> Note: Due to some restrictions enforced by some cgroup subsystems, moving
> a process to another cgroup can fail.

Do you have any non-default cgroup configuration (e.g. controllers, or systemd cgroup-related settings)?
(In reply to Zbigniew Jędrzejewski-Szmek from comment #9)
> https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt
> says:
> > Note: Due to some restrictions enforced by some cgroup subsystems, moving
> > a process to another cgroup can fail.
>
> Do you have any non-default cgroup configuration (e.g. controllers, or
> systemd cgroup-related settings)?

Everything cgroup-related is handled by systemd. I didn't configure anything special.

Do you have an idea why it works with the older kernel, 4.8.15-300.fc25.x86_64?
Yeah. In the meantime I spoke with Lennart, and he says that there was a kernel change to the rules about who can write that attribute and when, which was later reverted. He didn't remember any details, so I'll have to track this down.
Zbigniew, do you have an idea why you cannot reproduce the bug yourself?
No. Please don't use needinfo for mundane questions.
I would not use needinfo if there were a willingness to debug/troubleshoot this bug. Or did I miss some request for data?
Sorry for the delay. All that I currently know is stated in comment #11.
(In reply to Zbigniew Jędrzejewski-Szmek from comment #15)
> Sorry for the delay. All that I currently know is stated in comment #11.

Sure, but that does not explain why it works for you.
There's nothing in the list of patches 4.8.15..4.8.16 that could explain the difference. There's also no change in the package in Fedora, apart from the update of upstream version.

Another observation is that for any given boot, this either works, or doesn't work, repeatably. So it doesn't seem to be a race condition at the time of scope creation.

systemd-229-16.fc24.x86_64   4.8.15-200.fc24.x86_64                        → OK
systemd-232-7.fc26.x86_64    kernel-core-4.10.0-0.rc3.git4.1.fc26.x86_64   → OK
systemd-232-7.fc26.x86_64    kernel-core-4.8.16-300.fc25.x86_64            → broken
systemd-232-7.fc26.x86_64    kernel-core-4.8.15-300.fc25.x86_64            → broken
systemd-232-10.fc26.x86_64   kernel-core-4.10.0-0.rc3.git4.1.fc26.x86_64   → OK, OK
systemd-232-10.fc26.x86_64   kernel-core-4.10.0-0.rc5.git0.1.fc26.x86_64   → broken, broken, broken

Hm,

good:
$ cat /proc/4380/cgroup|sort -g
1:name=systemd:/user.slice/user-1001.slice/user/init.scope
2:devices:/user.slice
3:cpuset:/
4:blkio:/user.slice
5:net_cls,net_prio:/
6:hugetlb:/
7:pids:/user.slice/user-1001.slice/user
8:memory:/user.slice
9:perf_event:/
10:cpu,cpuacct:/user.slice
11:freezer:/

bad:
$ cat /proc/1146/cgroup |sort -g
0::/user.slice/user-1000.slice/user/init.scope
1:cpu,cpuacct:/user.slice/user-1000.slice
2:freezer:/
3:pids:/user.slice/user-1000.slice/user
4:memory:/user.slice/user-1000.slice
5:perf_event:/
6:blkio:/user.slice/user-1000.slice
7:net_cls,net_prio:/
8:cpuset:/
9:devices:/user.slice/user-1000.slice
10:hugetlb:/

Oh, this depends on the systemd version in the initramfs.
current git → OK
systemd-232-7, -10 → bad

Hence, it's not really kernel-version-dependent.
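For anyone comparing their own machine against the "good" and "bad" listings above, here is a minimal sketch of how the two cases could be told apart automatically. The function name classify_cgroup_layout and the labels it prints are made up for illustration; the only thing it relies on is the /proc/<pid>/cgroup line format (hierarchy-id:controllers:path), where a "0::" entry is the cgroup v2 one and "name=systemd" is the v1 named hierarchy.

```shell
# Hypothetical helper (not part of systemd): reads /proc/<pid>/cgroup-style
# text on stdin and prints "legacy", "hybrid", "unified" or "unknown".
classify_cgroup_layout() {
    awk -F: '
        # "0::/path" is the entry for the cgroup v2 (unified) hierarchy
        $1 == "0" && $2 == ""                          { v2 = 1 }
        # v1 named hierarchy that systemd uses for its own tree
        $2 == "name=systemd"                           { named = 1 }
        # any other non-empty controller field is a v1 controller
        $1 != "0" && $2 != "" && $2 != "name=systemd"  { v1ctl = 1 }
        END {
            if (v2 && v1ctl)  print "hybrid"
            else if (v2)      print "unified"
            else if (named)   print "legacy"
            else              print "unknown"
        }'
}
```

Usage would be something like "classify_cgroup_layout < /proc/self/cgroup". Fed the listings above, the "good" one comes out as legacy and the "bad" one as hybrid. Treat the result as a hint only: it assumes the kernel shows the "0::" line only when the v2 hierarchy is actually in use, which may not hold on every kernel.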
I think it's an interaction with the unified cgroup hierarchy being used for the systemd tree, i.e. the so-called hybrid mode. With systemd-232 we defaulted to it, then it got reverted [https://github.com/systemd/systemd/commit/843d5baf6a]. I'll rebuild with that commit. Let's see if that fixes your issue.
Feel free to provide just a scratch build. I assume I will need to regenerate the initramfs with dracut.
Should be fixed in rawhide with systemd-232-11.fc26, please test.
sh# koji download-build --arch=x86_64 systemd-232-11.fc26
sh# dnf update *.rpm
sh# dracut --regenerate-all
sh# reboot

Then log in as an ordinary user:

[lslebodn@vm-118]$ systemd-run --user --scope --unit=test -- bash
Job for test.scope failed.
See "systemctl status test.scope" and "journalctl -xe" for details.

[lslebodn@vm-118]$ getenforce
Permissive

Did I do something wrong?
And I can still see the problem with EACCES:

mkdir("/sys/fs/cgroup/pids/user.slice/user-20728.slice/user/test3.scope", 0755) = -1 EACCES (Permission denied)

And the strace output:
http://paste.fedoraproject.org/541176/14857952/
I'm still convinced that this is related to the cgroup hierarchy setup.
I cannot reproduce this with the legacy cgroup hierarchy, but can easily with the unified one.

What does 'mount|grep sys' say?
(In reply to Zbigniew Jędrzejewski-Szmek from comment #23)
> I'm still convinced that this is related to cgroup hierarchy setup.
> I cannot reproduce this with legacy cgroup hierarchy, and easily with the
> unified.
>
I removed the kernel 4.10.0-0.rc6.git0.1.fc26.x86_64 and installed it one more time. It works in permissive mode after this ritual dance :-)

> What does 'mount|grep sys' say?

[root@vm-118 ~]# mount | grep sys
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,seclabel,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime,seclabel)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
configfs on /sys/kernel/config type configfs (rw,relatime)
selinuxfs on /sys/fs/selinux type selinuxfs (rw,relatime)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=39,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=13074)
debugfs on /sys/kernel/debug type debugfs (rw,relatime,seclabel)
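The one line in the mount output above that matters for the legacy-vs-unified question is the /sys/fs/cgroup/systemd entry. As a rough sketch, it could be picked out like this; systemd_hierarchy_type is a hypothetical helper name, and it only assumes the usual "SOURCE on TARGET type FSTYPE (OPTIONS)" layout printed by mount.

```shell
# Hypothetical helper: reads `mount` output on stdin and prints the
# filesystem type mounted at /sys/fs/cgroup/systemd (prints nothing
# if that mount point is absent).
systemd_hierarchy_type() {
    awk '$2 == "on" && $3 == "/sys/fs/cgroup/systemd" && $4 == "type" { print $5 }'
}
```

Run as "mount | systemd_hierarchy_type". For the listing above it prints "cgroup", i.e. the v1 named hierarchy. My assumption is that a "cgroup2" result would point to the systemd tree being on the unified hierarchy, but I have not verified that for every systemd version.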
I'm glad the ritual sacrifice helped ;)