Description of problem:
I discovered that certain plymouth themes don't play well with dnf-plugin-system-upgrade. On my personal computer, I use the 'spinner' theme. After rebooting into the upgrade environment, I was greeted by a completely black screen. I waited a few minutes, still black. No key press worked (Esc or arrows did nothing, did not switch me to text mode), so I was almost convinced that my computer froze during bootup. I have no HDD LED and a silent SSD, so there was no way to tell whether something was happening in the background. I was this close (<-->) to hard-rebooting my computer, when I realized I could try to switch to TTY2. It worked, I logged in as root, and using journalctl -f I found out that 100 packages had already been upgraded. If I had hard-rebooted my computer, I would have had a complete mess in the packaging system and I would have probably needed to reinstall the system from scratch. Fortunately, I didn't, and my system was safe.
I have tested all other plymouth themes in Fedora repositories (plymouth-theme-*). I have found no other theme with such serious issues as the 'spinner' theme.
One further theme is broken - 'script'. It prints upgrade progress, but instead of redrawing the initial line as all other themes do, it prints each update on a new line. After the first 39 upgraded packages you no longer see any further progress (the output does not scroll down). Fortunately you can at least press Esc or an arrow key to switch into text mode and see the progress there. Still, some people might get confused by this halt of progress indication and might be scared to press any key.
All other themes seem to behave correctly and indicate progress correctly:
So to recap, there's a slight problem with 'script' theme and a serious problem with 'spinner' theme (black screen which does not react to key presses, looks dead).
I don't know whether this is something that dnf-plugin-system-upgrade should handle (for example use a hardcoded theme which is known to work well, if this is a possible solution), or whether this needs to be dealt with in particular themes. In that case I can file this report against appropriate packages.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. sudo dnf install plymouth-theme-spinner
2. sudo plymouth-set-default-theme spinner
3. sudo dracut -f
4. # reboot to make sure you see spinner
5. sudo dnf system-upgrade download --releasever=23
6. # snapshot your VM here to easily try other themes
7. sudo dnf system-upgrade reboot
Actual results:
upgrade progress not shown and machine seemingly dead if you get unlucky and have "unsupported" theme installed
Expected results:
see upgrade progress
Created attachment 1079090
upgrade screenshot with 'spinner' theme
Just black screen, no key works (but switching to TTY2 does).
Created attachment 1079091
upgrade screenshot with 'script' theme
The progress is no longer visible after the first 39 packages are upgraded. Esc or arrows fortunately work and you can see progress properly in text mode.
I doubt there is anything that the upgrade plugin should be doing differently here. After all it just pushed messages to be displayed...
I'm proposing this for final blocker discussion. It's a conditional violation of the upgrade criterion. If people see just a black screen and the PC seems stuck, they are not likely to wait 1-2 hours but will power cycle the computer instead, losing data. I almost destroyed my own system this way. This affects only a fraction of the Fedora user base, though - those who use the spinner plymouth theme (or the script theme, which is less affected). It's hard to judge how many users have it. In my opinion it's the best looking theme of all offered by Fedora, so if people have a custom theme, they are quite likely to have this one, but that's no real data.
This doesn't need to be completely *fixed*, IMO. It would be safe enough if dnf-plugin-system-upgrade contained a list of known-to-be-broken themes and warned the user or downright refused to continue if the current plymouth theme was one of those broken ones (you can trivially learn the current theme by running "plymouth-set-default-theme" without arguments).
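The check proposed above could look roughly like the following. This is only a sketch, not anything dnf-plugin-system-upgrade actually ships: the function names and the KNOWN_BROKEN_THEMES variable are made up for illustration, and the list contains just the two themes reported as problematic in this bug.

```shell
#!/bin/sh
# Themes reported as problematic in this bug report. Seeding the list this way
# is an assumption; nothing like it exists in the plugin today.
KNOWN_BROKEN_THEMES="spinner script"

# Return 0 if the given theme name is on the known-broken list, 1 otherwise.
is_known_broken_theme() {
    for broken in $KNOWN_BROKEN_THEMES; do
        [ "$1" = "$broken" ] && return 0
    done
    return 1
}

# Hypothetical pre-flight check. It takes the theme name as an argument so it
# can be exercised on a machine without plymouth installed.
check_theme() {
    if is_known_broken_theme "$1"; then
        echo "WARNING: plymouth theme '$1' may show a black screen or stalled progress during the upgrade."
        return 1
    fi
    echo "plymouth theme '$1' is not on the known-broken list."
    return 0
}
```

On a real system it would be fed the current theme, e.g. check_theme "$(plymouth-set-default-theme)", before running "dnf system-upgrade reboot". Whether to only warn or to refuse to continue is a policy decision this sketch leaves open.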
(In reply to Zbigniew Jędrzejewski-Szmek from comment #3)
> I doubt there is anything that the upgrade plugin should be doing
> differently here. After all it just pushed messages to be displayed...
Yes, most probably. But in case this is not fixed in the themes/plymouth in time, I think there are possible workarounds. One of them is in comment 0, the second one is above.
Created attachment 1079095 [details]
journal for upgrade with spinner
This is full journal for upgrade that happened with 'spinner' theme (it had a black screen for the whole time). There are some interesting parts:
Oct 01 14:48:33 localhost.localdomain systemd: plymouth-start.service: main process exited, code=dumped, status=11/SEGV
Oct 01 14:48:33 localhost.localdomain systemd: Unit plymouth-start.service entered failed state.
Oct 01 14:48:33 localhost.localdomain systemd: plymouth-start.service failed.
Oct 01 14:48:34 localhost.localdomain systemd-coredump: Process 289 (plymouthd) of user 0 dumped core.
Stack trace of thread 289:
#0 0x00007f29a57beb11 ply_progress_animation_load (libply-splash-graphics.so.2)
#1 0x00007f29a59c78e2 show_splash_screen (two-step.so)
#2 0x00007f29a71ec4d0 ply_boot_splash_show (libply-splash-core.so.2)
#3 0x0000000000409146 on_change_mode (plymouthd)
#4 0x000000000040631c ply_boot_connection_on_request (plymouthd)
#5 0x00007f29a73f9796 ply_event_loop_process_pending_events (libply.so.2)
#6 0x00007f29a73f9fb0 ply_event_loop_run (libply.so.2)
#7 0x0000000000404ba4 main (plymouthd)
#8 0x00007f29a672c700 __libc_start_main (libc.so.6)
#9 0x0000000000405a19 _start (plymouthd)
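For anyone who ends up on TTY2 facing the same black screen, a small filter can pull the relevant lines out of the journal. This is a minimal sketch that assumes the journal text looks like the excerpt above; the function name is made up, and the patterns only match the two message shapes quoted here.

```shell
#!/bin/sh
# Read journal text on stdin and print only the lines that indicate plymouthd
# crashed: the failed plymouth-start.service main process and the coredump
# notice. Exits non-zero (grep's behaviour) when nothing matches.
plymouth_crash_lines() {
    grep -E 'plymouth-start\.service: main process exited|\(plymouthd\) of user [0-9]+ dumped core'
}
```

Typical use would be something like journalctl -b | plymouth_crash_lines from a root shell on TTY2. No output means no crash of this particular shape was logged, which does not by itself prove plymouth is healthy.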
so previously i believe fedup forced a specific theme, designed for the upgrade i think. maybe we should do that again (i mean we should probably fix the themes too of course)
The upgrade process no longer builds a new initramfs, because messing with the user's boot images is *very* delicate work and it's much more reliable if we don't.
So since we're not forcing a specific theme anymore.. is there any way to make plymouth switch themes after the switch-root? Or could it change themes when we do "change-mode --updates"?
does the new system use its own grub entry ? if so you could probably just override the splash with plymouth.override-splash right?
if not, we'll have to figure something out, though, what we figure out will have to go in the unupdated OS I guess.
Nope, no custom grub entry.
so this is tricky because even if we push plymouth updates to f21 and f22, there's no guarantee the system initrd that gets picked will have the updated plymouth. will have to think about this more...
Can we force initrd regeneration in the plymouth post-install script? Alternatively, a new version of dnf-plugin-system-upgrade can contain it and it can also require the updated plymouth version. Not sure how clean this is.
well, we've historically avoided rebuilding the initrd from plymouth installs, and relied instead on it getting rebuilt when a kernel is installed. Of course it will only rebuild it for the running kernel, not any other kernels installed, so rebuilding the initrd isn't a foolproof solution.
And adding a requirement on the new plymouth won't help if the new plymouth isn't in the initrd of course.
Maybe it's the best we can do though
Discussed at 2015-10-05 blocker review meeting: https://meetbot-raw.fedoraproject.org/fedora-blocker-review/2015-10-05/f23-blocker-review.2015-10-05-16.00.html . We agreed that as this doesn't actually break upgrades and only affects non-standard configurations it's not really severe enough to qualify as a blocker, but it would be very good if it could be fixed, and it's accepted as a freeze exception issue (if changes are needed to the frozen package set to fix it).
So, here's the good news:
I tested Ray's experimental plymouth build http://koji.fedoraproject.org/koji/taskinfo?taskID=11585152 and everything works great (tested charge, spinner and script themes). The problem is fixed. Great job, Ray.
Here's the bad news:
It only works great if you rebuild the initrd beforehand, i.e. run dracut -f or (I assume) install a new kernel. Otherwise the problems persist (however, with the spinner theme, the computer rebooted immediately after the plymouth crash and didn't continue upgrading with a black screen, which is a somewhat better result, but I'm not sure whether this behavior is random). We would need to trigger an initrd rebuild after the plymouth update or do some other magic.
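One rough way to tell whether the initrd still predates the plymouth update is to compare timestamps. The sketch below is a heuristic under stated assumptions, not a tested tool: the function names are made up, the image path follows the default Fedora naming (/boot/initramfs-<kernel release>.img), it assumes a single installed plymouth package, and it only looks at the running kernel's image.

```shell
#!/bin/sh
# Pure comparison of two epoch timestamps: return 0 when the initramfs was
# built before plymouth was last installed or updated.
initramfs_is_stale() {
    [ "$1" -lt "$2" ]
}

# Hypothetical wrapper for a real system. Not invoked here, since it needs
# rpm, GNU stat and an initramfs for the running kernel.
check_running_initramfs() {
    img="/boot/initramfs-$(uname -r).img"   # assumed default Fedora path
    img_mtime=$(stat -c %Y "$img") || return 2
    # Assumes exactly one plymouth package is installed (one line of output).
    pkg_time=$(rpm -q --qf '%{INSTALLTIME}\n' plymouth) || return 2
    if initramfs_is_stale "$img_mtime" "$pkg_time"; then
        echo "$img predates the installed plymouth; consider running 'sudo dracut -f' before the upgrade reboot."
        return 1
    fi
    echo "$img was built after plymouth was installed."
}
```

A newer timestamp only shows the image was rebuilt after the package was installed; it says nothing about other kernels' images, which matches the limitation already mentioned in this thread.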
Also, it would be nice to get this update also for F21 (I haven't tested, but assume it's affected the same way).
Will, any chance dnf can rebuild the initrd ?
We've explicitly avoided rebuilding the initrd on plymouth upgrades in the past, since rebuilding the initrd can have other side effects. Even if we started doing it now in rawhide, doing it now in f21 and f22 seems a little discourteous to users who aren't upgrading.
As a side note, I looked into F22 and F21 and it seems there have been no plymouth updates in their whole life cycles (since GA). So even if we rebuild initrd on plymouth update, I don't think it's a problem, because it almost never happens anyway. The side effects might be a bigger concern, but I'm not familiar with the details.
it just seems like a big change to be throwing in to an old release. I'll do it if we can't do it from the upgrade process though.
Adding needinfo to Will to make sure he notices this.
Ray, can you please push a new update into Bodhi right away? Even without rebuilding initrd that patch is still very valuable and definitely better than the current state. Thanks!
(In reply to Ray Strode [halfline] from comment #15)
> Will, any chance dnf can rebuild the initrd ?
Pretty much anything is technically feasible. Whether it's a good idea or not is a totally different question.
> We've explicitly avoided rebuilding the initrd on plymouth upgrades in the
> past, since rebuilding the initrd can have other side effects.
dnf-plugin-system-upgrade explicitly avoids rebuilding initramfs or otherwise touching the boot configuration for similar reasons - it was a *serious* problem with the old fedup upgrade scheme.
So... is this really necessary, and is this really the best/only place to do it?
> Even if we
> started doing it now in rawhide, doing it now in f21 and f22 seems a little
> discourteous to users who aren't upgrading.
I tend to agree - it makes sense to make this happen as part of the upgrade process, but I don't want to add special-case code for every one-off problem that needs fixing in each release.
So. Maybe we need to work out some general facility for running pre-upgrade scripts (and/or package installation/upgrade). Like maybe we have upgrade metapackages in each release which have the necessary Requires and %pre/%post scripts.
For example: the F21 repo could contain fedora-upgrade-22 and fedora-upgrade-23 packages which we could (somehow) attempt to install before the upgrade attempt.
There's a problem here, though - it's very tricky for a plugin to get DNF set up with repos for both the current release *and* the target release; it would require us to add an extra step before the upgrade reboot.
Basically what I'm saying here is: yes, in theory, this concept has some merit. But implementing it correctly requires significant design and engineering work, so it's probably not gonna happen for F23.
The problem doesn't seem serious enough to justify having to maintain a hacky workaround for the next couple of years. It's a corner-case and the workaround is "just walk away for a while, it's gonna be fine."
Let's just document the potential problem for now, and discuss implementing the general solution on the mailing lists and/or github.
I think it's better to simply document this in https://fedoraproject.org/wiki/DNF_system_upgrade. Just put the command to run in the list of commands to run.
plymouth-0.8.9-10.2013.08.14.fc22 has been submitted as an update to Fedora 22. https://bodhi.fedoraproject.org/updates/FEDORA-2015-d771412e5b
plymouth-0.8.9-10.2013.08.14.fc22 has been pushed to the Fedora 22 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
$ su -c 'dnf --enablerepo=updates-testing update plymouth'
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2015-d771412e5b
Can you apply the same fix to f23 and master branches too? Thanks!
plymouth-0.8.9-8.2013.08.14.fc21 has been submitted as an update to Fedora 21. https://bodhi.fedoraproject.org/updates/FEDORA-2015-4a5b0b71c7
plymouth-0.8.9-11.2013.08.14.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2015-08ea674d2c
plymouth-0.8.9-11.2013.08.14.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
$ su -c 'dnf --enablerepo=updates-testing update plymouth'
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2015-08ea674d2c
plymouth-0.8.9-10.2013.08.14.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.
plymouth-0.8.9-11.2013.08.14.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.
On fc22, upgraded to plymouth-0.8.9-10.2013.08.14.fc22, ran
sudo plymouth-set-default-theme charge
sudo dracut -f
and then the update from fc22 to fc23 ran with a black screen.
But ... this laptop never seems to use plymouth anyway.
It does claim to "Starting Show Plymouth Scree", "Reached ..initialization" and stays there until the XMD login appears, so it may be a different problem.