Description of problem:
I have installed a minimal F21 installation from Server netinst, then upgraded it using fedup:
# fedup --network 22 --instrepo http://dl.fedoraproject.org/pub/alt/stage/22_Beta_RC1/Server/x86_64/os/
At the very end of the upgrade, the system fails to reboot. It just waits. I have waited more than 10 minutes and it was still hanging. Hitting Ctrl+Alt+Delete reboots the system.
The only error message I see is:
> Failed unmounting /sysroot/proc.
I don't know if it's relevant.
Version-Release number of selected component (if applicable):
How reproducible:
100% (3 of 3 attempts)
Steps to Reproduce:
1. install F21 minimal from Server netinst
2. run fedup with Beta RC1 Server x86_64 instrepo
3. see that system does not reboot after upgrade
I suspected this to be a duplicate of bug 1205344, but plymouth seems to be installed on F21 minimal system. So this is probably a different bug.
Created attachment 1012241 [details]
screenshot of fedup hanging and not rebooting
Created attachment 1012242 [details]
Created attachment 1012243 [details]
Created attachment 1012244 [details]
journal from the upgrade run
Created attachment 1012245 [details]
rpm -qa on F21 before upgrade
Created attachment 1012246 [details]
rpm -qa on F22 after upgrade
This seems to violate our criteria:
"For each one of the release-blocking package sets, it must be possible to successfully complete an upgrade from a fully updated installation of the previous stable Fedora release with that package set installed. The release-blocking package sets are the minimal set, and the sets for each one of the release-blocking desktops. The upgraded system must meet all release criteria. "
If the system doesn't reboot automatically, it makes upgrading servers and other remote machines very problematic. And that is exactly the common use case for minimal installs. OTOH, this might be too strict for Beta.
I have reproduced the same problem with F21 Server install, so this does not affect just minimal install.
Personally I think there may be a case for Final blocker here, but it doesn't seem serious enough for Beta. I really wouldn't expect people to be doing mass unattended upgrades to Beta. The point of requiring upgrades to basically work at Beta is so the mechanism can be tested in a range of cases; it does work well enough for that.
Tried fedup twice (workstation and minimal) on UEFI machine, 21->22:
Workstation fedup process went as expected.
Minimal fedup process updated the system, but before the final reboot, it hung on "Failed unmounting /sysroot/proc" as kparal described.
I also encountered this while testing bug 1208214 comment 22
(In reply to awilliam from comment #9)
> Personally I think there may be a case for Final blocker here, but it
> doesn't seem serious enough for Beta. I really wouldn't expect people to be
> doing mass unattended upgrades to Beta. The point of requiring upgrades to
> basically work at Beta is so the mechanism can be tested in a range of
> cases; it does work well enough for that.
I agree. I'm dropping Beta proposal and moving it to Final.
I have tried this with encrypted Workstation install and fedup rebooted just fine. This seems to affect only Server or minimal installations.
I have just reproduced this with an unencrypted Workstation upgrade. So it seems this is probably some kind of race condition (and maybe it's more frequent for Server/minimal, or we just had more bad luck with it).
I hit this last night with a fedup of F21 with Plasma5 from dvratil-copr to F22 Plasma. If it is a hardware race condition, I have an Intel(R) Core(TM) i7-3610QM CPU @ 2.30GHz, 24GB RAM, and an encrypted 128GB SATAIII SSD and 1TB spinner. Since it was my laptop, the only log I could grab was a picture of the screen with my phone.
Seen a couple of times on Mustang in the Linaro server lab.
I just saw this during my update to Fedora 22 Beta RC1.
Ran into this again Saturday night. The system in question was a Fedora 20 KDE install, fedup'd two weeks prior to F21 non-product KDE. Once fedup to F22 Plasma was finished, it stalled at "Failed to unmount /sysroot/proc" and "Failed to unmount /dev/hugepages".
It isn't just minimal or server installs.
Discussed at today's blocker review meeting.
This bug was accepted as Final Blocker - it's clear we don't have all the information here, but an upgrade that fairly commonly fails to reboot is considered a conditional violation of the 'upgrade requirements' criterion, due to the remote system scenario cited in comment #7.
I just tested this with fedup-0.9.2-1.fc21 when upgrading F21 Server. The system did not reboot after the upgrade finished (it failed to unmount /sysroot/proc).
I received a very similar issue. I did not actually get an error, but the system did hang during the unmounting process. ctrl-alt-del let it continue with a couple messages I missed. Screenshot (from my phone) of lead in to hang.
Screenshot: http://imgur.com/6wX1Jwd (not seeing attachment button)
I had the very same experience, only from a Workstation 21 install. This was a real world machine with many packages installed. I used fedup-0.9.2-1.fc21 as per the Test Day:2015-04-21 FedUp wiki page. I ran into this bug at the end of the upgrade, with a message saying it could not unmount /sysroot/proc, but after ctrl-alt-del the system rebooted fine and I'm now on Workstation 22 Beta.
While following 'Test Day:2015-04-21 FedUp', I encountered the same behavior as Matthew's machine (comment #21). While not being too different from a standard Workstation, my Fedora is a 'Non-Product' one.
Same issue for me as occurred on Matthew's and Giulio's machine (comments #21 and #22), "solved" with a ctrl+alt+del.
Like Giulio, Fedora on my computer is a 'Non-Product' one (lxde+openbox+...).
If needed, I can provide the logs (the fedup log for sure, because I saved it, and the upgrade log), but I think they would not be much more useful than the others attached above.
I encountered some warnings about packages without updates before rebooting. I'll attach them.
Created attachment 1017543 [details]
log I saved by passing --debuglog to fedup.
Discussed at the 2015-04-28 blocker review meeting.
Will, has there been any news on this bug?
Discussed at the 2015-05-04 blocker review meeting.
We are still awaiting an update to this blocker with Final Freeze being 8 days away.
Reproduced this by starting with a clean Fedora 21 netinstall using standard partitions + XFS, nothing fancy.
Created attachment 1022395 [details]
journal systemd debug
Reproduced in a (virt-manager) VM, set systemd.log_level=debug and captured the console output to this file.
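For anyone repeating this capture, it can help to confirm that the debug option actually reached the kernel command line before starting the upgrade. A minimal sketch, assuming a POSIX shell; the function name `has_systemd_debug` is hypothetical and only the `systemd.log_level=debug` option mentioned above is checked:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel command line contains
# the systemd.log_level=debug option as a separate word.
has_systemd_debug() {
    case " $1 " in
        *" systemd.log_level=debug "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a running system this could be called as `has_systemd_debug "$(cat /proc/cmdline)" && echo "debug logging enabled"`.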
This could be a systemd bug since it's mainly responsible for reboot/shutdown, adding Zbigniew. Maybe a suggestion for getting more information than the serial console capture has?
Okay, so here's what's going on, according to that log.
We'll start where reboot gets initiated:
Executing: /usr/bin/systemctl --no-block isolate reboot.target
Calling manager for StartUnit on reboot.target, isolate
[...lots of shutdown stuff is scheduled, including these stop jobs...]
Installed new job systemd-journald-audit.socket/stop as 391
Installed new job systemd-journald.service/stop as 394
[...and later that job kills off systemd-journal...]
Child 767 (systemd-journal) died (code=exited, status=0/SUCCESS)
Child 767 belongs to systemd-journald.service
systemd-journald.service: main process exited, code=exited, status=0/SUCCESS
systemd-journald.service changed stop-sigterm -> dead
Job systemd-journald.service/stop finished, result=done
[ OK ] Stopped Journal Service.
Releasing all resources for systemd-journald.service
systemd-journald-dev-log.socket changed running -> listening
systemd-journald-audit.socket changed running -> listening
systemd-journald.socket changed running -> listening
[...huh, two other sockets? a moment later, we see:]
Cannot find unit for notify message of PID 767.
Got auxiliary fds with notification message, closing all.
Closing left-over fd 21
Cannot find unit for notify message of PID 767.
Received SIGCHLD from PID 767 (n/a).
[...I thought PID 767 was already dead? but whatever...]
Accepted new private connection.
Incoming traffic on systemd-journald-audit.socket
Suppressing connection request on systemd-journald-audit.socket since unit stop is scheduled.
[...that socket gets ignored because we're shutting it down, but...]
Incoming traffic on systemd-journald.socket
Trying to enqueue job systemd-journald.service/start/replace
Job shutdown.target/start finished, result=canceled
Installed new job shutdown.target/stop as 398
Installed new job systemd-journald.service/start as 395
Job reboot.target/start finished, result=canceled
And there you have it: reboot cancelled.
So the problem is that journald starts back up, and journald starts because systemd-journald.socket is still active.
So the question is: why isn't systemd-journald.socket being stopped?
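The sequence above can also be picked out of a captured console log mechanically. A minimal sketch, assuming the log was saved as plain text; the function name `find_reboot_cancel` is hypothetical, and the patterns are just the literal messages quoted in this comment, so other systemd versions may word them differently:

```shell
#!/bin/sh
# Hypothetical helper: print (with line numbers) the log lines showing
# journald being socket-activated again and the shutdown/reboot jobs
# being cancelled as a result.
find_reboot_cancel() {
    log_file=$1
    grep -n -E \
        -e 'Incoming traffic on systemd-journald\.socket' \
        -e 'Trying to enqueue job systemd-journald\.service/start' \
        -e 'Job (shutdown|reboot)\.target/start finished, result=canceled' \
        "$log_file"
}
```

Running `find_reboot_cancel console.log` (file name assumed) on a log from an affected upgrade should list the incoming-traffic line first, followed by the cancelled jobs; no output suggests a different failure.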
APM Mustang (aarch64). Installed Fedora 21 Server and then used fedup to upgrade to F22 TC2. Had to reset machine by hand. After reset landed in properly installed F22.
Created attachment 1026552 [details]
fedup-dracut patch to call systemctl reboot
Please try this patch. I think it should do the trick.
I don't think that's going to work; we're already isolating reboot.target (see the top of the log in comment #30).
Oh, and here's why those two sockets aren't being shut down when we isolate reboot:
Which makes the new question: how does normal system shutdown avoid this?
Please look at the description of the patch (it is not shown if you just click the "Created attachment ..." link; you need the "raw" version: https://bugzilla.redhat.com/attachment.cgi?id=1026552). With the proposed change, a different job mode is used and the transition is irreversible.
Heh! Yep, I understand now - I just came to the same conclusion after debugging a normal reboot sequence, which does:
Trying to enqueue job reboot.target/start/replace-irreversibly
rather than:
Trying to enqueue job reboot.target/start/isolate
which appears in the logs for the upgrade.
So yeah, the root cause here is that fedup-dracut is manually isolating reboot.target when it should just 'systemctl reboot ...' and let systemd worry about the rest.
Which means your patch is totally correct! I've applied it upstream:
There should be a fixed fedup-dracut build shortly. (Note that the F22 boot images will need to be rebuilt to pick up this fix.)
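For readers comparing the two approaches, the difference can be sketched as below. This is an illustration only, not the fedup-dracut patch itself (attachment 1026552 has the real change); the function names and the `DRY_RUN` switch are hypothetical, and the job modes in the comments are the ones shown in the quoted log lines.

```shell
#!/bin/sh
# Set DRY_RUN=1 to print a command instead of executing it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# What the upgrade environment did: enqueue reboot.target with job mode
# "isolate". In the log above this job ended with result=canceled once
# journald was socket-activated again.
reboot_via_isolate() {
    run systemctl --no-block isolate reboot.target
}

# What a normal reboot does: enqueue reboot.target with job mode
# "replace-irreversibly", so later conflicting jobs cannot replace it.
reboot_via_systemctl() {
    run systemctl reboot
}
```

With `DRY_RUN=1` set, calling either function only prints the command, which makes it safe to try on a machine that should not restart.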
fedup-dracut-0.9.2-1.fc22 now building:
fedup-dracut-0.9.2-1.fc22 has been submitted as an update for Fedora 22.
* should fix your issue,
* was pushed to the Fedora 22 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing fedup-dracut-0.9.2-1.fc22'
as soon as you are able to.
Please go to the following url:
then log in and leave karma (feedback).
This seems to be fixed in RC1, system reboots properly after fedup.
fedup-dracut-0.9.2-1.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.
*** Bug 1224985 has been marked as a duplicate of this bug. ***
Confirming the bug on fedup-0.9.2-1.fc21 (F21->F22 upgrade)
Germano, your screenshot does not contain the "Failed unmounting /sysroot/proc." message, so it's likely to be a different bug. Please reopen your original bug and attach /var/log/upgrade/upgrade.log there. Thanks.
I'm not sure that this is the same bug, but here is the story. I upgraded a server (using nonproduct, because it was already carefully configured) remotely from F20 to F22. (I had done this before on my laptop, but not remotely.) I did the download part on June 7. At 6:30 AM on June 8 I rebooted the server remotely. Based on previous experience, I expected it to go through "System Upgrade" and take about 2 hours and then reboot by itself. At 11 AM it was still inaccessible so I went in to my office where the server is. The monitor (which had been turned off) showed nothing but a couple of lines. The keyboard and mouse did nothing. The ethernet light was flashing, but I could not even ping the server. And the fan was running at full blast. I figured, "Well, I'd better let it do its thing," so I let it run until 12:30. Finally I gave up and did a hard reboot (pressing the power button). It rebooted correctly into F22 and all was well.
I could find nothing that made any sense to me in the logs, but I saved them in their most recent state: fedup.log, dnf.log, and messages. I gave the times above so someone can track them in the messages file.
It may be that the instructions should say "Don't do this remotely." But my other server doesn't even have a monitor, although I could connect one.
No, there's usually no reason not to do it remotely. It's expected to work. It's hard to tell what happened in your case without logs, though.
Created attachment 1044520 [details]
fedup.log in response to comment 45
Created attachment 1044521 [details]
dnf.log in response to comment 45
Created attachment 1044522 [details]
selection from /var/log/messages in response to comment 45