Description of problem:
Boot an ISO, get to the grub menu, see the "booting command list" message, but no other messages after that: no kernel boot scroll (with quiet removed), no errors. The VM reboots back to the grub menu.

Version-Release number of selected component (if applicable):
HOST = Fedora Server 43 (Intel NUC)
libvirt-daemon-driver-qemu-11.6.0-1.fc43.x86_64
qemu-kvm-10.1.0-7.fc43.x86_64
edk2-ovmf-20250812-16.fc43.noarch
kernel 6.17.0-63.fc43.x86_64

How reproducible:
100% on an Intel NUC; 0% on a Lenovo Thinkpad with the same kernel and the same userspace components (Fedora Server on the NUC, Fedora Workstation on the Thinkpad).
These same images boot OK on Fedora 42 Server with the latest updates.

Steps to Reproduce:
1. host selinux is permissive
2. Boot any of the following ISOs:
Fedora-Server-dvd-x86_64-43-20250930.n.0.iso
Fedora-Server-dvd-x86_64-43-20251003.n.0.iso
Fedora-Workstation-Live-42-1.1.x86_64.iso
Fedora-Workstation-Live-43-20250930.n.0.x86_64.iso

Actual results:
Fails to boot the kernel, reboots to the grub menu.

Expected results:
Boot (at least to a dracut shell).

Additional info:
Using grub debug=all, the kernel and initramfs are both read off the ISO without error. The failure happens during the boot command, but I have no idea whether we're still in grub, handed back to shim, or something else. See attachment grubdebugall.txt.
The crash/reboot happens at 56:04 after the hour in all the logs.
Created attachment 2108527 [details] virsh dumpxml
Created attachment 2108528 [details] systemd journal
Created attachment 2108529 [details] grub debug=all
To capture this: edit the menu entry, add debug=all under the initrd line, then press Ctrl-X.
Created attachment 2108530 [details] /var/log/libvirt/qemu/UEFI.log
Proposed as a Blocker for 43-final by Fedora user chrismurphy using the blocker tracking app because: The release must be able to host virtual guest instances of the same release. https://fedoraproject.org/wiki/Fedora_43_Beta_Release_Criteria#Self_hosting_virtualization
NOTE: so far it appears to be host specific.
Created attachment 2108531 [details] cpuinfo
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes rdrand lahf_lm 3dnowprefetch epb ibrs ibpb stibp tpr_shadow flexpriority ept vpid tsc_adjust smep erms dtherm ida arat vnmi md_clear
vmx flags : vnmi preemption_timer invvpid ept_x_only flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest
bugs : cpu_meltdown spectre_v1 spectre_v2 mds msbds_only spectre_v2_user
BIOS VM is booting OK.
The problem does not happen after downgrading to edk2-ovmf-20250523-18.fc43.noarch.rpm. With edk2-ovmf-20250812-16.fc43.noarch.rpm installed again, the problem returns. In addition, the following message appears after the kernel and initramfs load, during the grub boot command, but only with the newer edk2-ovmf present:
PageFaultExitBoot: Applying global page table fixup (shim is older than v16).
Even if it's an apparent regression in edk2, that doesn't really tell us if the problem is actually in shim or grub.
Can you test https://copr.fedorainfracloud.org/coprs/kraxel/edk2.testbuilds/build/9647432/ with the intel nuc? I assume the nuc is somewhat older?
I could not reproduce this on my machine, Advanced Micro Devices, Inc. [AMD] Strix/Strix Halo Root Complex (rev 02). I tried with KDE Live and it booted normally in a libvirt KVM VM.
edk2-ovmf-20250812-18.copr9647432.noarch.rpm works OK.
This NUC's original firmware is dated 2015, and its current firmware is dated 2022. I think I bought it in 2016.
openQA runs almost all its tests in UEFI qemu VMs, so this is definitely specific to Chris' setup somehow.
AGREED Delayed Decision (punt) Discussed at the 2025-10-06 (blocker / freeze exception) review meeting: We agreed to punt this so we can ask Gerd exactly what combination of properties is necessary to trigger this bug, and make an assessment of how common that's likely to be. https://meetbot-raw.fedoraproject.org//blocker-review_matrix_fedoraproject-org/2025-10-06/f43-blocker-review.2025-10-06-16.00.txt
Looking at the change in the scratch build, it seems the change was a patch "OvmfPkg/PlatformDxe: add check for 1g page support". AFAICS the practical effect of this change is that we now skip doing the "global RW+NX fixup" (whatever that is) if a new check called "PageFaultHave1GPages" fails. I think that, aside from that check, Fedora systems would always want to do the "global fixup" because they have an 'old' shim so far as edk2 is concerned (edk2 defines 'old' as < 16, and the newest shim in any Fedora is 15). (It might also be skipped if confidential computing is in play, but that's pretty rare). So...I guess this might affect any case where that new "PageFaultHave1GPages" check would fail, is that correct, Gerd? Do we have any idea how common that might be? It would be useful information to decide whether this is a blocker, or an FE, or nothing...
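To make the reading above concrete, here is a minimal sketch of the decision as described in this comment. It is an illustration only, not code from edk2: the `BootEnvironment` type, its field names, and `applies_global_fixup` are hypothetical, and the conditions are taken from the patch description and this thread rather than from the source.

```python
from dataclasses import dataclass


@dataclass
class BootEnvironment:
    """Hypothetical inputs; the names are illustrative, not edk2 identifiers."""
    shim_major_version: int        # newest shim in any Fedora is 15 per this comment
    cpu_has_1g_pages: bool         # outcome of the new 1G page support check
    confidential_computing: bool   # fixup may also be skipped in this case


def applies_global_fixup(env: BootEnvironment) -> bool:
    """Return True if the 'global RW+NX fixup' would be applied (as understood here)."""
    if env.confidential_computing:
        return False
    if env.shim_major_version >= 16:
        # edk2 treats shim < 16 as 'old'; a new enough shim needs no workaround.
        return False
    # With the patch, the fixup is skipped when the 1G page check fails.
    return env.cpu_has_1g_pages


# A Fedora guest (shim 15) on a CPU without 1G page support: fixup is skipped.
print(applies_global_fixup(BootEnvironment(15, False, False)))
# The same guest on a CPU with 1G page support: fixup is applied.
print(applies_global_fixup(BootEnvironment(15, True, False)))
```

If that reading is right, the affected population is old-shim guests on CPUs without 1G page support, which is the question for Gerd.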
(In reply to Adam Williamson from comment #14)
> Looking at the change in the scratch build, it seems the change was a patch
> "OvmfPkg/PlatformDxe: add check for 1g page support". AFAICS the practical
> effect of this change is that we now skip doing the "global RW+NX fixup"
> (whatever that is) if a new check called "PageFaultHave1GPages" fails.

Yes. The bug was that the code used gigabyte pages without checking that the CPU actually supports them. Gigabyte pages have been a thing for at least a decade, so you need relatively old hardware to run into this.

> I
> think that, aside from that check, Fedora systems would always want to do
> the "global fixup" because they have an 'old' shim so far as edk2 is
> concerned (edk2 defines 'old' as < 16, and the newest shim in any Fedora is
> 15). (It might also be skipped if confidential computing is in play, but
> that's pretty rare).

Well. There have been a bunch of NX-related bugs in the boot chain: in shim, in grub, in the kernel efi stub. Depending on which combination of bugs you have, you can see a bunch of different effects.

shim and kernel are fine as far as Fedora is concerned (i.e. all non-EOL versions have new enough builds). grub is the last buggy component. Fedora 43 + Rawhide have fixed builds; Fedora 41 + 42 still had broken builds last time I checked.

The "global RW+NX fixup" is a workaround needed by some combinations of broken grub + kernel. Fedora guests do NOT need this. It is needed to boot some older RHEL versions and, IIRC, Debian 12 too.

The check for the shim version is there because there is no easy way to figure out whether that workaround is needed. It is easy to figure out whether shim 16+ is present, though, because it installs a new EFI protocol you can check for. So I'm using that as an indicator that a recent and (hopefully) NX-bug-free Linux distribution is being booted, and turn off the workaround then, to avoid carrying it forward forever.
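For anyone wanting to check whether a host is in the affected group, the following is a rough sketch that looks for the `pdpe1gb` flag, which is the name Linux uses in /proc/cpuinfo for 1 GiB page support. The function names are made up for illustration. Note the assumption: this inspects the host CPU, while what the guest sees also depends on the CPU model configured for the VM. The flags posted in the cpuinfo attachment comment above do not include `pdpe1gb`.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Match only the plain 'flags' key, not 'vmx flags'.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_1g_pages(cpuinfo_text):
    """True if the CPU advertises 1 GiB page support (Linux flag name: pdpe1gb)."""
    return "pdpe1gb" in cpu_flags(cpuinfo_text)


def host_has_1g_pages(path="/proc/cpuinfo"):
    """Convenience wrapper that reads the host's cpuinfo (Linux x86 only)."""
    with open(path) as f:
        return has_1g_pages(f.read())
```

Calling `host_has_1g_pages()` on the host should return False on hardware like this NUC and True on most newer x86 machines.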
FEDORA-2025-addc7b1924 (edk2-20250812-18.fc43) has been submitted as an update to Fedora 43. https://bodhi.fedoraproject.org/updates/FEDORA-2025-addc7b1924
FEDORA-2025-addc7b1924 has been pushed to the Fedora 43 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2025-addc7b1924` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2025-addc7b1924 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
edk2-ovmf-20250812-18.fc43.noarch continues to work, fixes the bug.
Setting VERIFIED per comment 18
+4 FE in https://pagure.io/fedora-qa/blocker-review/issue/1956 , marking accepted FE.
FEDORA-2025-addc7b1924 (edk2-20250812-18.fc43) has been pushed to the Fedora 43 stable repository. If problem still persists, please make note of it in this bug report.