When trying to do a kickstart install on real hardware (especially a partial kickstart where some spokes are not configured), the installer UI does not get drawn. Instead, it boots to a black screen showing only the cursor. In a fully automatic install, this is merely annoying, as the installation still runs to completion and reboots. In a partially configured install, it means you cannot see the spokes you need to configure before starting the installation. The UI is there, just not drawn, so you can still click on things and navigate if you know the layout from memory (which is probably not something we expect people to do).

Reproducible: Always

Steps to Reproduce:
1. Download the F39 "Everything boot" ISO from nightly.fedoraproject.org
2. Boot it on a real system, adding 'inst.dhcp inst.ks="https://ngompa.fedorapeople.org/binoc-test-fc39.ks"' to the installer boot arguments

Actual Results:
Anaconda loads but nothing is drawn. If you know the layout of the installer from memory, you can navigate, since all the buttons and widgets are there; they just aren't drawn.

Expected Results:
Anaconda loads and the UI is drawn properly. You can navigate the UI without having to work from memory.

This is only reproducible on real hardware. When using KVM or VMware, it works perfectly fine. I have been able to trigger this with every compose going back to when we branched from Rawhide. I suspect this has been around for a while... :(
This is also reproducible with the Server netinstall ISO (since it's the same thing just with server branding instead...).
Proposed as a Blocker for 39-final by Fedora user ngompa using the blocker tracking app because: This violates the criterion "The installer must be able to complete an installation using all supported interfaces." as the user cannot reasonably complete installation using the supported kickstart+graphical interface.
I can confirm this problem even in a libvirt VM (both BIOS and UEFI). It happened to me in 6/6 attempts. I can also reproduce it using https://fedorapeople.org/groups/qa/kickstarts/example-minimal.ks , so it's not specific to Neal's kickstart. I tested with Fedora-Everything-netinst-x86_64-39-20231002.n.0.iso. I'm attaching logs below.
Created attachment 1991608 [details] anaconda.log
Created attachment 1991609 [details] dbus.log
Created attachment 1991610 [details] hawkey.log
Created attachment 1991611 [details] journal.txt
Created attachment 1991612 [details] packaging.log
Created attachment 1991613 [details] program.log
Created attachment 1991614 [details] storage.log
Created attachment 1991615 [details] syslog
Created attachment 1991616 [details] X.log
Discussed during the 2023-10-02 blocker review meeting: [1]

The decision to classify this bug as an AcceptedBlocker (Final) was made: "This is accepted as a violation of the following criterion: 'The installer must be able to complete an installation using all supported interfaces.' as the user cannot reasonably complete installation using the supported kickstart+graphical interface."

[1] https://meetbot-raw.fedoraproject.org/fedora-blocker-review/2023-10-02/f39-blocker-review.2023-10-02-16.01.log.txt
So you're telling me this actually got *worse* over the course of F39? Because at the beginning, I'm pretty sure it worked in my VM environments, just not in real hardware. I'm glad it's reproducible everywhere now, though. It means we can add a test to keep this from happening again without telling people to buy a rando mini PC to run the tests. :)
it may depend on the hw the VM is emulating.
Some notes from reviewing the logs.

X.log has the following lines:
> (EE) Failed to load module "fbdev" (module does not exist, 0)
> (EE) Failed to load module "vesa" (module does not exist, 0)
> (EE) modeset(0): glamor initialization failed
This is the same on rawhide which successfully shows something.

journal has:
> display: failed to call GetCurrentState from mutter over DBUS
Ditto, same on rawhide which shows ok, also on RHEL 9.
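The elimination step above (an error that also appears in a working run cannot explain the failure) can be sketched as a set difference over the "(EE)" lines of two logs. This is only an illustrative sketch: the function names are made up here, and it assumes error lines carry the "(EE)" marker as in the excerpt above.

```python
import re

# Matches X.org error lines such as: (EE) Failed to load module "fbdev" ...
# A leading timestamp, if present, is ignored because we search, not match.
ERROR_LINE = re.compile(r"\(EE\)\s*(.+)")


def error_messages(log_text):
    """Return the set of distinct (EE) messages found in an X.org log."""
    messages = set()
    for line in log_text.splitlines():
        match = ERROR_LINE.search(line)
        if match:
            messages.add(match.group(1).strip())
    return messages


def errors_only_in_failing(failing_log, working_log):
    """Errors seen in the failing run but not in the working run.

    Errors common to both runs are dropped, since a message that also
    shows up when the UI displays fine is not a useful lead.
    """
    return error_messages(failing_log) - error_messages(working_log)
```

An empty result for the F39 and Rawhide X.log pair would match the conclusion that none of these messages matter.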
I don't think any of those messages matter.
The deciding factor seems to be the presence of a "keyboard" directive in the kickstart. With it, I get either the black screen or a gnome-kiosk crash. I tried some combinations of --vckeymap and --xlayouts, as well as multiple languages; the specific values do not seem to matter.
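For triage, a quick way to tell whether a given kickstart is likely to hit this is to check for a top-level "keyboard" command. The sketch below is a deliberately simplified line scanner, not a real kickstart parser: the function name is hypothetical, it does not follow %include files, and it assumes any other line starting with "%" opens a section that runs until %end.

```python
def has_keyboard_directive(kickstart_text):
    """Return True if a 'keyboard' command appears in the command section.

    Simplified on purpose: comments and blank lines are skipped, and
    anything inside a %section ... %end block (e.g. %packages, %post)
    is ignored so that script contents are not mistaken for commands.
    """
    in_section = False
    for raw_line in kickstart_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        first_token = line.split()[0]
        if first_token.startswith("%"):
            if first_token == "%end":
                in_section = False
            elif first_token not in ("%include", "%ksappend"):
                in_section = True
            continue
        if not in_section and first_token == "keyboard":
            return True
    return False
```

Per the observation above, the options passed to the directive are not inspected, because they did not appear to change the outcome.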
This was happening at least as far back as Fedora-39-20230910.n.0 - that's the oldest run the openQA video is still present for (videos from earlier runs have been garbage collected unfortunately).
Based on some debugging vslavik did, we think this is actually a systemd-localed issue. The bug can be avoided by disabling every point on this codepath where anaconda makes a dbus call to ask localed to load an X layout - lines 150, 164 and 166 of pyanaconda/modules/localization/runtime.py . That code has not changed in anaconda recently. Separately, I tested random images I had lying around and found that this broke between Fedora-Everything-netinst-x86_64-Rawhide-20230713.n.1.iso and Fedora-Everything-netinst-x86_64-39-20230828.n.0.iso . That's the window during which systemd 254 landed. So I kinda suspect something in systemd 254 broke this.
hum. well. I tried Rawhide images with both systemd 253.12 and 253.1 and...they still do this. So...maybe it's not systemd? But if not, honestly, I'm not sure what *else* it might be. Hum.
Also still happens on a Rawhide image with anaconda-39.23-3 (same version that was in Fedora-Everything-netinst-x86_64-Rawhide-20230713.n.1.iso ). So I'm a bit stuck now. Have to do some thinking about what else could possibly be causing this. dbus?
So I at least managed to narrow the delta a bit, after realizing that Silverblue installers are affected too. I happen to have a couple of those in a narrower range:

Fedora-Silverblue-ostree-x86_64-Rawhide-20230728.n.0.iso - GOOD
Fedora-Silverblue-ostree-x86_64-39-20230815.n.0.iso - BAD

so now we're down to "something that changed between July 28 and August 15".
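Narrowing the window this way is effectively a binary search over composes ordered by date. A minimal sketch, assuming the oldest image is known good, the newest is known bad, and the bug never disappears again once introduced; the function name and the `is_bad` callback are illustrative, and the results table below only records the four images mentioned in this bug.

```python
def narrow_window(builds, is_bad):
    """Return (last_good, first_bad) from builds ordered oldest to newest.

    Assumes builds[0] is good and builds[-1] is bad. Each is_bad() call
    stands for one manual test: boot the image and look at the screen.
    """
    low, high = 0, len(builds) - 1
    while high - low > 1:
        middle = (low + high) // 2
        if is_bad(builds[middle]):
            high = middle
        else:
            low = middle
    return builds[low], builds[high]


# Compose dates of the images tested in this bug, with observed results.
observed = {
    "20230713": False,  # Everything netinst, Rawhide: good
    "20230728": False,  # Silverblue, Rawhide: good
    "20230815": True,   # Silverblue, F39: bad
    "20230828": True,   # Everything netinst, F39: bad
}
window = narrow_window(sorted(observed), lambda build: observed[build])
```

With more images available in between, the same loop would keep halving the window, one test boot per step.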
Oooh! My latest wild shot in the dark seems to be a hit. I built a current Rawhide image with old gnome-kiosk and mutter - gnome-kiosk-44.0-2.fc39.x86_64 and mutter-44.2-2.fc39.x86_64 . That image does not have the bug. So this appears to be a bug there, probably mutter (I had to downgrade both because gnome-kiosk is built against mutter and the soname changed).
I guess I should clarify (since vslavik said he can't reproduce the bug on Rawhide) that for me the bug reproduces every single try (in a VM) with both current Rawhide and F39 images. So the fact that it works OK with a Rawhide image with downgraded mutter/gnome-kiosk clearly implicates one of those packages.
Brief summary for Workstation folks: kickstart installs using a kickstart with a 'keyboard' directive often do not display the anaconda UI, they just show a blank screen (though if the kickstart is fully complete, the install will run to completion). For me this is 100% reproducible in a VM booting with inst.ks=https://fedorapeople.org/groups/qa/kickstarts/base-net.ks . We can prevent the bug happening by disabling all anaconda's calls to systemd-localed (via dbus) to set X keyboard layout based on the kickstart contents (this is obviously not a fix, but a significant diagnostic fact). These calls all happen (I believe, and per the logs) before the X server is started. The bug does not happen with mutter and gnome-kiosk downgraded as per comment #24.
(In reply to Adam Williamson from comment #24)
> Oooh! My latest wild shot in the dark seems to be a hit.
>
> I built a current Rawhide image with old gnome-kiosk and mutter -
> gnome-kiosk-44.0-2.fc39.x86_64 and mutter-44.2-2.fc39.x86_64 . That image
> does not have the bug. So this appears to be a bug there, probably mutter (I
> had to downgrade both because gnome-kiosk is built against mutter and the
> soname changed).

This seems to be in exact agreement with what we were seeing in our kickstart tests: https://github.com/rhinstaller/kickstart-tests/issues/997#issuecomment-1676845348

Sorry for being so late here. I should have followed up on the issue back then, but we had other pressing priorities at the time. I also guess we saw the issue as quite a rare flake, because in most cases the installation simply carried on behind the black screen, which kickstart tests can't detect.
OK, so I bisected this. This is the tightest I can bisect it - builds of any of these commits cause gnome-kiosk to crash on startup:

The first bad commit could be any of:
0f88f0931c11431354556b1ffaae082048e98777
3e95609073b3a455693e19e58b365688b7f877ba
a27b9d9707b0c5ccfd6aec3e5f335937c1796429
02a436d607481492a37ad15fcc401abf6385eeff
761a254e6f8b8643ce6530e85daf041f25edc683
15b25568b29ec0e082f6a18fef550078102aaca1
We cannot bisect more!

All those commits were part of https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/2445 .
One more bit of data: if I hack up anaconda to wipe /etc/X11/xorg.conf.d/00-keyboard.conf - that's the file localed writes when you ask it to set an X keyboard layout - before it starts X, the bug also goes away. So it seems like the definition of the bug is more or less: from mutter 15b25568b29ec0e082f6a18fef550078102aaca1 onwards, gnome-kiosk on X.org as launched by anaconda displays a blank screen if an /etc/X11/xorg.conf.d/00-keyboard.conf with contents like this is present:

# Written by systemd-localed(8), read by systemd-localed and Xorg. It's
# probably wise not to edit this file manually. Use localectl(1) to
# instruct systemd-localed to update it.
Section "InputClass"
        Identifier "system-keyboard"
        MatchIsKeyboard "on"
        Option "XkbLayout" "us"
        Option "XkbModel" "pc105"
EndSection
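To check an affected system for this trigger condition, the file can be inspected and, purely as a diagnostic, removed before X starts. The sketch below is an assumption-laden illustration, not the anaconda hack itself: both function names are hypothetical, and the parser only understands simple `Option "name" "value"` lines like the ones shown above.

```python
import re
from pathlib import Path

# Path of the file systemd-localed writes, as named in this bug.
KEYBOARD_CONF = "/etc/X11/xorg.conf.d/00-keyboard.conf"

# Matches lines of the form: Option "XkbLayout" "us"
OPTION_LINE = re.compile(r'^\s*Option\s+"([^"]+)"\s+"([^"]*)"')


def keyboard_options(conf_text):
    """Return a dict of Option name -> value found in the config text."""
    options = {}
    for line in conf_text.splitlines():
        match = OPTION_LINE.match(line)
        if match:
            options[match.group(1)] = match.group(2)
    return options


def remove_keyboard_conf(path=KEYBOARD_CONF):
    """Delete the keyboard config if present; return True if it existed.

    Diagnostic only: this discards the configured X layout, so it is a
    way to confirm the trigger, not a fix.
    """
    conf = Path(path)
    existed = conf.exists()
    conf.unlink(missing_ok=True)
    return existed
```

For the file contents quoted above, `keyboard_options` would report the XkbLayout and XkbModel options, which is enough to confirm that localed wrote a layout before X was started.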
Bit more progress - https://gitlab.gnome.org/GNOME/mutter/-/issues/3089#note_1868806
> since vslavik said he can't reproduce the bug on Rawhide

Sorry, maybe that was a bit misleading. These were graphics-related log messages, which I eliminated as a successful start had the same. I could reproduce the bug on Rawhide too.
sadly the posted patch does not appear to fix the bug.
FEDORA-2023-16d9c333e4 has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2023-16d9c333e4
Just tried this new mutter update mutter-45.0-10.fc39.x86_64.rpm. I use cqrprop https://github.com/ok2cqr/cqrprop/releases which displays a window on the desktop with solar data in it. With this updated mutter the window does not appear on the desktop although the window outline is shown on my bottom panel's workspace view. Downgrading to mutter-45.0-9.fc39.x86_64.rpm makes cqrprop work normally again. I don't know what this backport actually changed.
Should have said, using GNOME Wayland desktop with all available F39 rpm updates from updates-testing.
Thanks for the feedback. When it's done, can you try https://koji.fedoraproject.org/koji/taskinfo?taskID=107839077 ? That's a build that reduces the change in -10 to (hopefully) the smallest needed to fix the blocker bug we're trying to fix. It'd be good to know if it avoids the problem you saw.
Tried the mutter-45.0-11 build, unfortunately I still see the same problem. It could be something about the way cqrprop is coded, but it has never done this before now with these last two mutter packages.
as the proposed fix breaks at least two other things, one of which is obviously release-blocking (you can't see anaconda on the Workstation live image), it's no good. setting back to ASSIGNED.
FEDORA-2023-16d9c333e4 has been pushed to the Fedora 39 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-16d9c333e4` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-16d9c333e4 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
Brian: new build to try - can you try https://koji.fedoraproject.org/koji/taskinfo?taskID=107860827 ? it adds another change that might solve the problem.
OK, so I have installed the second build of the mutter-45.0-11 package and I now see the solar data window on the desktop with the correct contents. Here's hoping this also fixes the anaconda and black screen problems.
I too have installed mutter-45.0-11 from Koji and found that Chrome, VSCode, and Discord all render properly with it under Wayland. Is it useful information that 45.0-10 was able to render Chrome and VSCode under an Xorg session but not under a Wayland one?
It helps us confirm the issue, yes. Thanks.
Could we switch Anaconda to start as an XWayland app instead of an X11 app? I would think that would work around this issue.
that's way too much change. we actually have a working fix upstream now, anyway.
So where's the update for the fix?
I was waiting for jadahl to turn it into a proper MR that would get some review. It's not particularly urgent as we still have ARM blockers. But if there's no movement soon I'll backport it. The working fix is https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3329#note_1874837 , an additional change on top of the changes in that MR. Scratch build at https://koji.fedoraproject.org/koji/taskinfo?taskID=107860827 (I think that's the right one).
The link goes to mutter-45.0-11.fc39, which I am happy to report seems to work as intended (I am writing this from a Chrome window in a Wayland session). I'll be happy to provide karma for the update and/or retest when it hits Bodhi.
The new build is now in the update, please re-test and re-karma.
Would it be possible to include the update in a nightly build of the Fedora 39 ISO image (Server?)? Then I could update my kickstart environment and test whether the installation screen still goes black.
Not before it's pushed stable, but I've already verified that part of the fix several times.
Fix confirmed in RC-1.2.
FEDORA-2023-16d9c333e4 has been pushed to the Fedora 39 stable repository. If problem still persists, please make note of it in this bug report.
Any chance we can have a new netinstall ISO published with this fix?
It's already in Final RC-1.2, and nightlies from today onwards - https://openqa.fedoraproject.org/nightlies.html
Thank you! I can confirm that I see the graphical anaconda install screen when using Fedora-Everything-netinst-x86_64-39-20231030.n.0.iso