openQA F29 -> Rawhide (F30) upgrade tests are often failing at present. The upgrade process hangs at the end, showing the message "Upgrade complete! Cleaning up and rebooting...", e.g.: https://openqa.fedoraproject.org/tests/350591#step/_graphical_wait_login/24 I have reproduced this manually in a VM with an upgrade from Fedora 29 x86_64 Server network install to current Rawhide. If you force a reboot at this point, the upgraded system appears to boot fine. If you then look at the logs from the upgrade boot, the end looks like this: Feb 08 12:13:11 localhost.localdomain dnf[721]: Complete! Feb 08 12:13:11 localhost.localdomain python3[721]: Upgrade complete! Cleaning up and rebooting... Feb 08 12:13:11 localhost.localdomain dnf[721]: Cleaning up downloaded data... Feb 08 12:13:11 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=dnf-system-upgrade comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Feb 08 12:13:11 localhost.localdomain audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=dnf-system-upgrade comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Feb 08 12:13:11 localhost.localdomain systemd[1]: dnf-system-upgrade.service: Succeeded. Feb 08 12:13:11 localhost.localdomain systemd[1]: Started System Upgrade using DNF. Feb 08 12:13:11 localhost.localdomain systemd[1]: Reached target Offline System Update. Feb 08 12:13:11 localhost.localdomain systemd[1]: Condition check resulted in Remove the Offline System Updates symlink being skipped. Feb 08 12:13:11 localhost.localdomain systemd[1]: Startup finished in 1.731s (kernel) + 1.676s (initrd) + 5min 57.756s (userspace) = 6min 1.164s. Feb 08 12:13:36 localhost.localdomain systemd-logind[750]: Failed to get load state of reboot.target: Connection timed out Feb 08 12:13:36 localhost.localdomain dbus-daemon[748]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms) Not sure what the cause is. This appears to affect upgrades of Workstation, KDE and Server, but not upgrade of a minimal install.
I have exactly the same issue and behaviour. Fresh Fedora-Server-netinst-x86_64-29-1.2 install. Then I have done all steps in https://fedoraproject.org/wiki/QA:Testcase_upgrade_dnf_current_server Feb 09 11:10:46 localhost.localdomain dnf[767]: systemd-rpm-macros-240-6.gitf02b547.fc30.noarch Feb 09 11:10:46 localhost.localdomain dnf[767]: whois-nls-5.4.1-1.fc30.noarch Feb 09 11:10:46 localhost.localdomain dnf[767]: Fertig. Feb 09 11:10:46 localhost.localdomain python3[767]: Upgrade complete! Cleaning up and rebooting... Feb 09 11:10:46 localhost.localdomain dnf[767]: Cleaning up downloaded data... Feb 09 11:10:47 localhost.localdomain systemd[1]: dnf-system-upgrade.service: Succeeded. Feb 09 11:10:47 localhost.localdomain systemd[1]: Started System Upgrade using DNF. Feb 09 11:10:47 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=dnf-system-upgrade comm="systemd" exe="/usr/lib/systemd/system$ Feb 09 11:10:47 localhost.localdomain audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=dnf-system-upgrade comm="systemd" exe="/usr/lib/systemd/systemd$ Feb 09 11:10:47 localhost.localdomain systemd[1]: Reached target Offline System Update. Feb 09 11:10:47 localhost.localdomain systemd[1]: Condition check resulted in Remove the Offline System Updates symlink being skipped. Feb 09 11:10:47 localhost.localdomain systemd[1]: Startup finished in 2.088s (kernel) + 2.047s (initrd) + 6min 57.670s (userspace) = 7min 1.807s. Feb 09 11:11:11 localhost.localdomain systemd-logind[773]: Failed to get load state of reboot.target: Connection timed out Feb 09 11:11:11 localhost.localdomain dbus-daemon[771]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
I'm gonna propose this as a Final blocker under Beta criterion "For each one of the release-blocking package sets, it must be possible to successfully complete a direct upgrade from a fully updated, clean default installation of each of the last two stable Fedora releases with that package set installed" - https://fedoraproject.org/wiki/Fedora_30_Beta_Release_Criteria#Upgrade_requirements . Reasoning: for Beta it's probably OK to tell people "oh if it gets stuck there just hit the reset button", but for Final that seems a bad story to be trying to sell.
Created attachment 1528791 [details] screenshot from installer with no text in "ad" I'm testing this now. BTW, is something off with the "ads" during installation? I see no text.
We have three mechanisms which should trigger reboot: 1. the upgrade service should do it after the installation is finished 2. after the update is done, system-update-cleanup.service will remove the symlink and reboot the machine 3. dnf-system-upgrade.service has an internal dnf-system-upgrade-cleanup.service which does an unconditional reboot Re 1: it seems dnf-system-upgrade fails to reboot, but still exits with success error code. So we have two bugs here: one is that the reboot request fails (no idea why yet), and the other is that the return code is wrong. The error is actually from systemd-logind, so it's possible that d-s-u makes the request correctly, and systemd-logind just fails to execute it. In that case, the return code from d-s-u might be OK. I see "dbus-daemon: ... Failed to activate service 'org.freedesktop.systemd1", which looks like dbus-daemon is trying to do activation on its own which would be very wrong. So the issue might be that something is starting dbus-daemon in a wrong mode. Maybe it's something to do with the handover from dbus-daemon to dbus-broker? Re 2: system-update-cleanup.service has ConditionPathExists=|/system-update, because it was mostly designed to prevent a reboot-into-system-update loop when the upgrade service is broken. We *could* remove this condition, but then system-update-cleanup.service could step on the toes of the upgrade service by scheduling a reboot. The upgrade service could for example schedule a shutdown not a reboot, or a reboot with some special parameters, so scheduling a reboot unconditionally from system-update-cleanup.service seems to limit some possible uses of system-update.target quite a bit. So I don't think we want to do this. Re 3: d-s-u-cleanup.service is only triggered if d-s-u.service fails, so it's not relevant here.
'systemctl reboot' also waits a long time, but then reboots. 'busctl call org.freedesktop.login1 /org/freedesktop/login1 org.freedesktop.login1.Manager Reboot "b" false' times out. 'systemctl restart systemd-logind' also times out, and there's a job active: systemd-logind.service/start. But 'systemctl status systemd-logind' shows Stutus: "Processing requests". This means that systemd received READY=1 from logind, but for some reason did not mark it as started. So right now this looks as a bug systemd itself.
Reassigning to systemd based on comment#5. Also the original report and comment#1 indicate that the problem is probably elsewhere than in DNF: Feb 08 12:13:36 localhost.localdomain systemd-logind[750]: Failed to get load state of reboot.target: Connection timed out Feb 08 12:13:36 localhost.localdomain dbus-daemon[748]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Discussed during the 2019-02-11 blocker review meeting: [1] The decision to classify this bug as an AcceptedBlocker was made: "accepted as violation of "For each one of the release-blocking package sets, it must be possible to successfully complete a direct upgrade from a fully updated, clean default installation of each of the last two stable Fedora releases with that package set installed", reasoning per comment #2" [1] https://meetbot-raw.fedoraproject.org/fedora-blocker-review/2019-02-11/f30-blocker-review.2019-02-11-17.13.log.txt
*** Bug 1680042 has been marked as a duplicate of this bug. ***
Last night upgraded system from f29 -> f30 and got bitten by this. Now system doesn't boot, gives only: [FAILED] Failed to start Switch Root.
That's almost certainly a different bug; no-one else has reported an actual system issue caused by this bug, it's purely a failure to reboot automatically.
At every restart it ends up into grub> and that's it.
that may possibly be https://bugzilla.redhat.com/show_bug.cgi?id=1661955 , although I think that was supposed to be fixed...
Yeah, Adam, I did get beyond that issue and I think it is fixed...unless Juha's found a different code path that leads to the same symptom. Juha, I thought that bug got fixed. But, that doesn't help you much. I'm now stuck at bug 1686059, which is closed, but I haven't had a chance to test the solution myself. But, that also doesn't help you. FYI, I now have two test VMs that I set up like my two main computers, just so that I can test upgrades *every week* because I have been burned so many times, recently, by upgrade problems. I've been using VM testing much longer for system-upgrades, though, and I highly recommend that. Again, that doesn't help you, but maybe next time?
(In reply to Paul DeStefano from comment #13) > Juha, I thought that bug got fixed. But, that doesn't help you much. > > I'm now stuck at bug 1686059, which is closed, but I haven't had a chance to > test the solution myself. But, that also doesn't help you. My symptoms are the same as people report there, grub.cfg has no Fedora/Linux entries, that's why it pops the shell open. Something went wrong in dnf system-upgrade final stage.
Juha, can you please check the versions of the relevant packages on each end of the upgrade, and comment on the bug if your upgrade used the versions which were supposed to be fixed but still failed? Thanks!
(In reply to Adam Williamson from comment #15) > Juha, can you please check the versions of the relevant packages I don't know which are the pkgs. > on each end of the upgrade, Each end? This is my workstation that I use daily basis. I needed to upgrade to test bug #1634658, that 'other end' is gone. > and comment on the bug if your upgrade used the versions > which were supposed to be fixed but still failed? Thanks! There were other issues too, kdm failed to start since sddm.kcm was not installed and systemd targets / runlevels were screwed somehow. Not sure how it's supposed to be but I got screen up for now.
*** Bug 1692617 has been marked as a duplicate of this bug. ***
*** Bug 1685455 has been marked as a duplicate of this bug. ***
It seems something goes wrong in the triangle dbus-daemon — dbus-broker — systemd. I'm upgrading those packages to F30 individually. c='sudo dnf upgrade --releasever=30 --setopt=module_platform_id=platform:f30 -y' First "$c systemd": no problem. Second "$c dbus-daemon": hangs! Just "$c dbus-daemon": also hangs. "$c fedora-repos": OK. "$c dbus-daemon": fails. "$c fedora-repos": OK "$c dbus-broker": fails My hypothesis at this point is that when dbus-broker changes dbus.{socket,service} to point to its units, systemd checks if dbus.{socket,service} are running, realizes that they are not, and closes the dbus connection.
Yep: # kill -SIGUSR1 1 # journalctl -f Apr 09 16:03:32 fedora systemd[1]: Received SIGUSR1 from PID 1138 (bash). Apr 09 16:03:32 fedora systemd[1]: Starting D-Bus service... Apr 09 16:03:32 fedora systemd[1]: Starting D-Bus System Message Bus... Apr 09 16:03:32 fedora dbus-broker-launch[2873]: Looking up NSS user entry for 'gdm'... Apr 09 16:03:42 fedora dbus-broker-launch[2873]: NSS returned no entry for 'gdm' Apr 09 16:03:42 fedora systemd[1]: Started D-Bus System Message Bus. Apr 09 16:03:42 fedora audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=dbus-broker comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Apr 09 16:03:42 fedora dbus-daemon[789]: [system] Successfully activated service 'org.freedesktop.systemd1' So systemd things dbus was not running, and then we have both dbus-daemon and dbus-broker active, and surprisingly, things seem to work again. I think we need to handle this switcheroo with dbus.{socket,service} better somehow in systemd.
dbus-broker-19-2.fc30 has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-be8e006363
dbus-broker-19-2.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-be8e006363
dbus-broker-19-2.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.