Bug 1568594

Summary: re-allow Delegate=yes in f28
Product: [Fedora] Fedora Reporter: Dusty Mabe <dustymabe>
Component: systemd Assignee: systemd-maint
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 28 CC: awilliam, brett.wagner, dustymabe, jcajka, jiyin, jlebon, joe, jpazdziora, lnykryn, maszulik, mboddu, miabbott, msekleta, ssahani, s, systemd-maint, zbyszek
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: AcceptedFreezeException
Fixed In Version: systemd-238-7.fc28.1 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-24 11:24:33 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Bug Depends On:    
Bug Blocks: 1469207    

Description Dusty Mabe 2018-04-17 21:11:06 UTC
Description of problem:

NOTE: I'm not an expert in systemd or runc/kube so this BZ is mostly going to be high level. I've already discussed with zbyszek and it was requested I open this bug to propose this change.



In systemd v238 `Delegate=yes` was disallowed (I think the upstream commit is [1]). This has caused issues in runc[2]/docker/kube [3][4] as those projects try to deal with the change. They have not yet figured everything out and we are bearing down on the F28 release, so rather than release an F28 Atomic Host that can't run kube/openshift, it would be nice if we could revert this change in F28. By the time F29 comes out everything should have settled.

See also: https://pagure.io/atomic-wg/issue/452


[1] https://github.com/systemd/systemd/commit/1d9cc8768f173b25757c01aa0d4c7be7cd7116bc
[2] https://github.com/opencontainers/runc/pull/1776
[3] https://github.com/kubernetes/kubernetes/pull/61926
[4] https://bugzilla.redhat.com/show_bug.cgi?id=1558425
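
To get a rough idea of which units on a host ask for delegation, something like the following could be used. This is only a sketch: the function name `list_delegating_units` and the default directory are illustrative assumptions, and it only looks at static unit files, so transient units created at runtime (for example by a container runtime) will not show up.

```shell
#!/bin/sh
# Hypothetical helper (not part of systemd): list files under a unit
# directory that contain a Delegate= line, prefixed by the file suffix
# (service, scope, slice, ...). Drop-in files show up with suffix "conf".
list_delegating_units() {
    dir=${1:-/etc/systemd/system}   # assumed default, adjust as needed
    grep -rlE '^[[:space:]]*Delegate=' "$dir" 2>/dev/null | sort |
    while IFS= read -r f; do
        printf '%s\t%s\n' "${f##*.}" "$f"
    done
}

# Example: list_delegating_units /usr/lib/systemd/system
```

The first column makes it easy to spot any unit type other than the ones that are expected to carry `Delegate=`.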


Version-Release number of selected component (if applicable):
systemd-238-7.fc28.x86_64

How reproducible:


Steps to Reproduce:
1. oc cluster up (see https://pagure.io/atomic-wg/issue/452)

Comment 1 Fedora Blocker Bugs Application 2018-04-17 21:15:25 UTC
Proposed as a Freeze Exception for 28-final by Fedora user dustymabe using the blocker tracking app because:

 This is the opposite of 1558425. Instead of rushing in changes for runc/kube/docker we are proposing we revert the functionality in systemd.

Comment 2 Adam Williamson 2018-04-17 21:47:31 UTC
+1 FE for this, I'm a lot happier about delaying the change that caused the mess than trying to fix the mess in a hurry.

Comment 3 Patrick Uiterwijk 2018-04-17 22:21:52 UTC
+1 FE. Let's give people time to make sure their stuff works with the new upstream value...

Comment 4 Mohan Boddu 2018-04-17 22:38:43 UTC
+1 FE, it's better to not fix things in a hurry and probably end up with more issues.

Comment 5 Adam Williamson 2018-04-17 22:52:34 UTC
That's +3, setting accepted.

Comment 6 Zbigniew Jędrzejewski-Szmek 2018-04-18 18:37:03 UTC
Only lightly tested: https://koji.fedoraproject.org/koji/taskinfo?taskID=26444542

Comment 7 Dusty Mabe 2018-04-18 18:41:15 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #6)
> Only lightly tested:
> https://koji.fedoraproject.org/koji/taskinfo?taskID=26444542

Thanks! Unfortunately looks like that build failed already.

Comment 8 Adam Williamson 2018-04-18 21:48:52 UTC
Is there some reason this appears to have been implemented as new code, rather than by simply reverting the commit that made this change in the first place? I think a simple reversion is what we were expecting when we accepted this as FE.

Comment 9 Zbigniew Jędrzejewski-Szmek 2018-04-18 22:11:07 UTC
Why did you expect a simple revert in particular? I never said anything about the details of the implementation... The patch that was added gives a warning which should help people notice that this is not supported.

(In case this wasn't clear, previous build failed because of changes in gpg headers, not related to changes in systemd.)

Anyway, a second build is at https://koji.fedoraproject.org/koji/taskinfo?taskID=26446409.

Comment 10 Dusty Mabe 2018-04-19 03:08:15 UTC
so I tried this out on an atomic host system by "override replacing" the existing systemd with the new systemd. I end up with a system that doesn't boot properly. Quite a few things don't work on the system but some immediate ones that jump out are:

```
# systemctl --failed
  UNIT                           LOAD   ACTIVE SUB    DESCRIPTION              
● systemd-logind.service         loaded failed failed Login Service            
● systemd-tmpfiles-setup.service loaded failed failed Create Volatile Files and
```


```
# journalctl -u systemd-logind.service | tee
-- Logs begin at Thu 2018-04-19 02:53:01 UTC, end at Thu 2018-04-19 02:55:09 UTC. --
Apr 19 02:53:03 vanilla-f28-atomic systemd[1]: Starting Login Service...
Apr 19 02:53:03 vanilla-f28-atomic systemd-logind[688]: Failed to connect to system bus: No such file or directory
Apr 19 02:53:03 vanilla-f28-atomic systemd-logind[688]: Failed to fully start up daemon: No such file or directory
Apr 19 02:53:03 vanilla-f28-atomic systemd[1]: systemd-logind.service: Main process exited, code=exited, status=1/FAILURE
Apr 19 02:53:03 vanilla-f28-atomic systemd[1]: systemd-logind.service: Failed with result 'exit-code'.
Apr 19 02:53:03 vanilla-f28-atomic systemd[1]: Failed to start Login Service.
Apr 19 02:53:03 vanilla-f28-atomic systemd[1]: systemd-logind.service: Service has no hold-off time, scheduling restart.
Apr 19 02:53:03 vanilla-f28-atomic systemd[1]: systemd-logind.service: Failed to schedule restart job: Unit dbus.socket not found.
Apr 19 02:53:03 vanilla-f28-atomic systemd[1]: systemd-logind.service: Failed with result 'exit-code'.
```

```
# systemctl status systemd-tmpfiles-setup.service | tee
● systemd-tmpfiles-setup.service - Create Volatile Files and Directories
   Loaded: loaded (/usr/lib/systemd/system/systemd-tmpfiles-setup.service; static; vendor preset: enabled)
   Active: failed (Result: exit-code) since Thu 2018-04-19 02:53:03 UTC; 5min ago
     Docs: man:tmpfiles.d(5)
           man:systemd-tmpfiles(8)
  Process: 677 ExecStart=/usr/bin/systemd-tmpfiles --create --remove --boot --exclude-prefix=/dev (code=exited, status=1/FAILURE)
 Main PID: 677 (code=exited, status=1/FAILURE)

Apr 19 02:53:03 vanilla-f28-atomic systemd[1]: Starting Create Volatile Files and Directories...
Apr 19 02:53:03 vanilla-f28-atomic systemd-tmpfiles[677]: "/home" already exists and is not a directory.
Apr 19 02:53:03 vanilla-f28-atomic systemd-tmpfiles[677]: "/srv" already exists and is not a directory.
Apr 19 02:53:03 vanilla-f28-atomic systemd-tmpfiles[677]: "/tmp" already exists and is not a directory.
Apr 19 02:53:03 vanilla-f28-atomic systemd-tmpfiles[677]: Unable to fix SELinux security context of /sysroot/tmp/.X11-unix: Read-only file system
Apr 19 02:53:03 vanilla-f28-atomic systemd-tmpfiles[677]: Unable to fix SELinux security context of /sysroot/tmp/.ICE-unix: Read-only file system
Apr 19 02:53:03 vanilla-f28-atomic systemd-tmpfiles[677]: Unable to fix SELinux security context of /sysroot/tmp/.font-unix: Read-only file system
Apr 19 02:53:03 vanilla-f28-atomic systemd[1]: systemd-tmpfiles-setup.service: Main process exited, code=exited, status=1/FAILURE
Apr 19 02:53:03 vanilla-f28-atomic systemd[1]: systemd-tmpfiles-setup.service: Failed with result 'exit-code'.
Apr 19 02:53:03 vanilla-f28-atomic systemd[1]: Failed to start Create Volatile Files and Directories.
```

somehow /tmp/ is mounted read-only.. will have to investigate more tomorrow

Comment 11 Zbigniew Jędrzejewski-Szmek 2018-04-19 08:51:30 UTC
OK, we have a problem. systemd-238-7 rebuilt in the current F28 root is not bootable on Atomic Host.

Steps:
- take https://download.fedoraproject.org/pub/fedora/linux/releases/test/28_Beta/AtomicHost/x86_64/images/Fedora-AtomicHost-28_Beta-1.3.x86_64.qcow2
- 'sudo rpm-ostree override replace' with systemd-238-7 from koji
- reboot
→ OK

- start over with Fedora-AtomicHost-28_Beta-1.3.x86_64.qcow2
- rebuild systemd-238-7 (i.e. the version that was in F28 before final freeze) in current F28 mock with a patch for the gpg headers miscompilation
- 'sudo rpm-ostree override replace'
→ does not boot

But the same rpms seem to boot fine on a traditional installation. Something strange is going on here.

Comment 12 Zbigniew Jędrzejewski-Szmek 2018-04-19 10:32:03 UTC
So the issue is that `rpm-ostree override replace` makes the machine unbootable. I don't think it's related to the systemd build at all. Above, when I installed systemd-238-7, rpm-ostree was smart enough to notice that it's the same version, and didn't do anything. Hence, the next boot was successful. To make the boot fail it's necessary to install a differing version:

[fedora@atomic-host tmp]$ sudo rpm-ostree override replace ./*rpm
Checking out tree 37ea5be... done
Inactive base replacements:
  systemd-238-7.fc28.x86_64
  systemd-udev-238-7.fc28.x86_64
  systemd-container-238-7.fc28.x86_64
  systemd-pam-238-7.fc28.x86_64
  systemd-libs-238-7.fc28.x86_64
Copying /etc changes: 20 modified, 0 removed, 51 added
Transaction complete; bootconfig swap: yes deployment count change: 0
Run "systemctl reboot" to start a reboot
→ no effect

[fedora@atomic-host tmp]$ sudo rpm-ostree override replace ./*rpm
...
Writing OSTree commit... done
Copying /etc changes: 20 modified, 0 removed, 53 added
Transaction complete; bootconfig swap: no deployment count change: 0
Upgraded:
  systemd 238-7.fc28 -> 238-7.fc28.2
  systemd-container 238-7.fc28 -> 238-7.fc28.2
  systemd-libs 238-7.fc28 -> 238-7.fc28.2
  systemd-pam 238-7.fc28 -> 238-7.fc28.2
  systemd-udev 238-7.fc28 -> 238-7.fc28.2
Run "systemctl reboot" to start a reboot
→ busted

"busted" means that all units are gone from /usr/lib/systemd apart from those that are part of systemd itself. There's no ostree-prepare-root.service, no ostree-remountfs.service, and we end up with /sysroot that is readonly, hence /tmp is read-only, and then things go south. E.g. we get the errors from systemd-tmpfiles-setup.service that were shown above.
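
A quick check for this state could look like the sketch below. The helper name `missing_units` is made up for illustration; the unit names and the `/usr/lib/systemd/system` path are the ones mentioned in this bug.

```shell
#!/bin/sh
# Hypothetical helper: print every named unit that does not exist
# in the given unit directory. Prints nothing if all are present.
missing_units() {
    dir=$1
    shift
    for unit in "$@"; do
        [ -e "$dir/$unit" ] || printf '%s\n' "$unit"
    done
}

# Example, run on the deployment after "override replace":
#   missing_units /usr/lib/systemd/system \
#       ostree-prepare-root.service ostree-remountfs.service
```

Any output on a freshly rebooted Atomic Host would point at the broken deployment rather than at the systemd build.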

Comment 13 Dusty Mabe 2018-04-19 12:28:16 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #12)
> So the issue is that rpm-ostree override replace makes the machine
> unbootable. I don't think it's related to the systemd build at all.

so what I read from this is that the specific test we are doing (override replace) is triggering a bug in rpm-ostree itself. What we really need to do to verify this fix is to rebuild an ostree from scratch and test that.

I'll build an ostree from scratch (including the build you provided us) and verify it works.

Thanks Zbigniew

Comment 14 Dusty Mabe 2018-04-19 13:52:32 UTC
ok I built an ostree from scratch and the test looks good. I was able to `oc cluster up` with openshift origin 3.9 and everything worked great. The steps I took were:

- boot F28 AH
- download oc tool from https://github.com/openshift/origin/releases/tag/v3.9.0
- run oc cluster up with 3.9 (verify things are broken)
- # sudo ostree remote add dusty https://dustymabe.fedorapeople.org/repo/ --no-gpg-verify
- # sudo rpm-ostree rebase dusty:
- reboot
- `oc cluster up` (verify it works this time)

Can other people verify this works for them?
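
For anyone who wants to repeat the rebase part of those steps, here is a minimal sketch. The `run` wrapper, the `DRY_RUN` variable, and the function name `rebase_to_test_tree` are illustrative assumptions; the commands themselves are the ones listed above. By default it only prints what it would do.

```shell
#!/bin/sh
# DRY_RUN=1 (the default in this sketch) prints commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Add the test remote, rebase onto it, and reboot into the new tree.
rebase_to_test_tree() {
    remote=$1
    url=$2
    run sudo ostree remote add "$remote" "$url" --no-gpg-verify
    run sudo rpm-ostree rebase "$remote:"
    run sudo systemctl reboot
}

# Example:
#   rebase_to_test_tree dusty https://dustymabe.fedorapeople.org/repo/
```

After the reboot, `oc cluster up` still has to be run by hand to confirm the result.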

Comment 15 Fedora Update System 2018-04-19 14:16:02 UTC
systemd-238-7.fc28.1 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-4154c1ee31

Comment 16 Micah Abbott 2018-04-19 14:17:41 UTC
I can confirm that I was able to use 'oc cluster up' to get an OpenShift Origin v3.9 cluster running using Dusty's custom tree.

Seems to indicate that the new 'systemd' package has fixed the original issue.

Comment 17 Jonathan Lebon 2018-04-19 14:38:35 UTC
It's too late for this, but in the face of `override replace` being broken, you should still be able to drop down to `ostree admin unlock --hotfix` and use `rpm -Uvh` instead if it's just for testing. Between that and composing a new tree though, clearly the latter is better.

Comment 18 Fedora Update System 2018-04-20 01:49:31 UTC
systemd-238-7.fc28.1 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-4154c1ee31

Comment 19 Zbigniew Jędrzejewski-Szmek 2018-04-20 09:40:26 UTC
*** Bug 1569906 has been marked as a duplicate of this bug. ***

Comment 20 Fedora Update System 2018-04-24 11:24:33 UTC
systemd-238-7.fc28.1 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.

Comment 21 Maciej Szulik 2018-06-04 17:09:09 UTC
It looks like the problem is back. I'm running latest F28 systemd (238-8.git0e0aa59.fc28) and latest kubernetes (https://github.com/kubernetes/kubernetes/commit/1635393bd1cb21e82fbc899ce7fd6a997194abe2) and the problem persists.

Comment 22 Dusty Mabe 2018-06-04 17:49:11 UTC
(In reply to Maciej Szulik from comment #21)
> It looks like the problem is back. I'm running latest F28 systemd
> (238-8.git0e0aa59.fc28) and latest kubernetes
> (https://github.com/kubernetes/kubernetes/commit/
> 1635393bd1cb21e82fbc899ce7fd6a997194abe2) and the problem persists.

I'm currently running an openshift 3.9.0 cluster on latest Fedora Atomic Host on systemd-238-8.git0e0aa59.fc28.x86_64. Seems to be working fine. I wonder if upstream kube has made some changes and now it assumes your systemd doesn't allow Delegate=yes.

What version of kube are you running? Do you have the same problem with openshift 3.9 (see simple steps to reproduce in bug description above)?

Comment 23 Zbigniew Jędrzejewski-Szmek 2018-06-04 20:50:39 UTC
@Maciej: systemd in both rawhide and F28 should allow (and ignore, apart from a warning) Delegate=yes. Aside from the warning, this is the same as before. So when you say "the problem persists", you have to be more precise than that.

Comment 24 Maciej Szulik 2018-06-05 07:21:55 UTC
Yeah, sorry for the trouble. It looks like the problem is with docker and not systemd. See https://github.com/kubernetes/kubernetes/issues/61474 and https://bugzilla.redhat.com/show_bug.cgi?id=1584909 for details of the current problem. Sorry for the noise.

Comment 25 Brett Wagner 2019-06-10 15:18:22 UTC
Just as a datapoint from outside the Red Hat universe proper: I am seeing this on both docker and cri-o nodes when using the systemd cgroup driver against containeros 2079.5.1 w/ systemd v241.

On the docker nodes we're currently on Docker 17.03 with whatever version of runc it bundles.

On cri-o nodes we're running 1.14.2 with runc 1.0.0-rc8.

Both sets of nodes are attached to the same cluster which is running Kubernetes 1.14.2.

All the nodes initially seem to come up fine and run fine for a few days and then everything gets stuck in container creating with "Delegation not available for unit type".
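
To see how often a node hits this, the message can be counted in the journal. This is a rough sketch: the function name `count_delegation_errors` is an assumption, and it simply matches the error string quoted above wherever it appears in the input.

```shell
#!/bin/sh
# Hypothetical helper: count lines on stdin that contain the error text.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function usable under "set -e".
count_delegation_errors() {
    grep -cF 'Delegation not available for unit type' || true
}

# Example, for the current boot:
#   journalctl -b --no-pager | count_delegation_errors
```

A count that grows over a few days would match the "fine at first, then stuck" pattern described here.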