Something changed between kernel-6.13.0-0.rc0.20241126git7eef7e306d3c.10.fc42 and kernel-6.13.0-0.rc1.20241202gite70140ba0d2b.14.fc42 that made the latter and all subsequent kernels unbootable. Instead of a normal boot, I see a lot of "failed to validate module" messages in the terminal. An upstream kernel that I built from the same commit with the same .config works fine.

Reproducible: Always
Created attachment 2061419 [details] Terminal photo
Proposed as a Blocker and Freeze Exception for 42-beta by Fedora user mikhail using the blocker tracking app because: Some changes in the RHEL patchset made all my systems completely unbootable.
It boots fine on openQA (or else it wouldn't have passed gating, and all the Rawhide validation tests would fail). On my system, kernel-6.13.0-0.rc1.20241203gitcdd30ebb1b9f.16.fc42.x86_64 doesn't work for graphics, but does at least get me to a console (I haven't had time to look into why yet). I'm not seeing this BPF stuff.
Adam, please test the debug kernel.

# dnf install kernel-debug kernel-debug-modules-extra

This issue only affected the debug kernel. A non-debug kernel works as intended.
Mikhail, is this still happening? Anyhow, if it only affects the debug kernel, I don't think it can be a blocker, as no install uses that by default...
Created attachment 2075031 [details]
Terminal photo

(In reply to Adam Williamson from comment #5)
> Mikhail, is this still happening?

Yes, the latest builds (https://koji.fedoraproject.org/koji/buildinfo?buildID=2649629) still do not work. The messages in the terminal have changed a bit, I suspect due to a problem with the dwarves package: https://bugzilla.redhat.com/show_bug.cgi?id=2342785

> Anyhow, if it only affects the debug kernel, I don't think it can be a
> blocker, as no install uses that by default...

The debug kernel lets you see many problems that are usually hidden. That is why I use the debug kernel on a daily basis.
That's a good reason to use it for testing, but it doesn't mean bugs in it are a release blocker.
I'm hitting this too. These kernels do not boot, but their non-debug equivalents boot fine. What I get is a bunch of "failed to validate module" messages and then an apparent hang: no plymouth prompt to unlock the root volume, and the ESC key does nothing.

kernel-debug-6.14.0-0.rc3.29.fc42.x86_64
kernel-debug-6.13.3-200.fc41.x86_64
Created attachment 2077149 [details]
virsh console "dmesg"

Reproduced it in qemu/kvm. Other than having UEFI enabled (without Secure Boot), it's a stock VMM VM.
Fedora-Workstation-Live-Rawhide-20250219.n.0.x86_64.iso is using 6.14.0-0.rc3.29.fc43.x86_64, which is a non-debug kernel. This is probably why openQA hasn't caught this problem.
Just to be extra sure, looking in /run/rootfsbase/usr/lib/modules/6.14.0-0.rc3.29.fc43.x86_64/config I see:

# CONFIG_KASAN is not set
# CONFIG_BTRFS_ASSERT is not set

At least those two options are set on Fedora debug kernels. Yet another way to check is the kernel file size: non-debug kernels are 16-17M, debug kernels are 31-32M.

root@localhost-live:~# ls -lsh /run/initramfs/live/boot/x86_64/loader/linux
17M -rwxr-xr-x. 1 root root 17M Feb 19 06:44 /run/initramfs/live/boot/x86_64/loader/linux
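For anyone who wants to script the config check described above, here is a minimal Python sketch. It assumes that having both CONFIG_KASAN and CONFIG_BTRFS_ASSERT enabled is a good enough fingerprint for a Fedora debug kernel, which is only what this comment suggests and not an official definition. The function names are hypothetical.

```python
import re

# Options the comment above says are set on Fedora debug kernels.
# Treating them as a debug-kernel fingerprint is an assumption of this sketch.
DEBUG_MARKERS = ("CONFIG_KASAN", "CONFIG_BTRFS_ASSERT")


def enabled_options(config_text):
    """Return the set of CONFIG_* names set to y or m in a kernel config."""
    enabled = set()
    for line in config_text.splitlines():
        # Lines such as "# CONFIG_KASAN is not set" do not match and are skipped.
        match = re.match(r"^(CONFIG_\w+)=([ym])$", line.strip())
        if match:
            enabled.add(match.group(1))
    return enabled


def looks_like_debug_kernel(config_text):
    """True if every marker option is enabled in the given config text."""
    return set(DEBUG_MARKERS) <= enabled_options(config_text)
```

To use it, read the config file of the kernel in question (for example the /run/rootfsbase/usr/lib/modules/6.14.0-0.rc3.29.fc43.x86_64/config file mentioned above) and pass its contents to `looks_like_debug_kernel`.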
Fails in both UEFI and BIOS qemu/kvm.
> Fedora-Workstation-Live-Rawhide-20250219.n.0.x86_64.iso is using 6.14.0-0.rc3.29.fc43.x86_64, which is a non-debug kernel. This is probably why openQA hasn't caught this problem.

Well, yes, that's what all my comments above mean. It also means the bug isn't particularly critical; it just makes debugging kernel problems harder.
Anyway, it's a regression. A user can remove all non-debug kernels, and then after upgrading to the next Fedora release the system is broken.
-4 in https://pagure.io/fedora-qa/blocker-review/issue/1745 , marking rejected blocker. FE vote is still open.
Discussed on 2025-02-24 in a blocker review meeting [1]:

!agreed 2330681 - AcceptedBetaFE - We would like to fix debug kernels ASAP, and we don't ship them on any medium, so this should be a safe freeze exception to grant.

[1] https://meetbot.fedoraproject.org/blocker-review_matrix_fedoraproject-org/2025-02-24/f42-blocker-review.2025-02-24-17.01.log.html
*** Bug 2334643 has been marked as a duplicate of this bug. ***
Useful comment from the other bug, from Jason Montleon:

From serial I collected some output:

```
[ 9.670579] BPF: [145778] ENUM ee
[ 9.672260] BPF: size=4 vlen=53
[ 9.673775] BPF:
[ 9.675183] BPF: Invalid name
[ 9.676689] BPF:
[ 9.678155] failed to validate module [fuse] BTF: -22
[ 9.901438] BPF: [145778] ENUM ee
[ 9.903195] BPF: size=4 vlen=53
[ 9.904717] BPF:
[ 9.906061] BPF: Invalid name
[ 9.907546] BPF:
[ 9.908922] failed to validate module [fuse] BTF: -22
[ 9.994502] BPF: type_id=350 bits_offset=64
[ 9.996322] BPF:
[ 9.997616] BPF: Invalid name
[ 9.999123] BPF:
[ 10.000428] failed to validate module [scsi_dh_alua] BTF: -22
[ 10.065557] BPF: [145788] FUNC
[ 10.067110] BPF: type_id=199
[ 10.068530] BPF:
[ 10.069743] BPF: Invalid name
[ 10.071136] BPF:
[ 10.072428] failed to validate module [scsi_dh_emc] BTF: -22
[ 10.143174] BPF: type_id=18 bits_offset=296
[ 10.144713] BPF:
[ 10.145914] BPF: Invalid name
[ 10.147255] BPF:
[ 10.148445] failed to validate module [scsi_dh_rdac] BTF: -22
[ 10.242935] systemd[1]: systemd-modules-load.service: Main process exited, code=exited, status=1/FAILURE
[ 10.261194] systemd[1]: systemd-modules-load.service: Failed with result 'exit-code'.
[ 10.279406] systemd[1]: Failed to start systemd-modules-load.service - Load Kernel Modules.
[FAILED] Failed to start systemd-modules-load.service - Load Kernel Modules.
See 'systemctl status systemd-modules-load.service' for details.
```
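To see at a glance which modules fail validation in a log like the one above, a small Python sketch can count the "failed to validate module" lines per module. The regular expression is modeled only on the lines in this excerpt, so other kernels may format the message differently, and the function name is hypothetical.

```python
import re
from collections import Counter

# Pattern modeled on the lines in the excerpt above (an assumption, not a
# documented kernel message format).
FAILED_RE = re.compile(r"failed to validate module \[([^\]]+)\] BTF: (-?\d+)")


def failed_modules(log_text):
    """Count BTF validation failures per module name in a dmesg or journal dump."""
    counts = Counter()
    for name, _error_code in FAILED_RE.findall(log_text):
        counts[name] += 1
    return counts
```

Run against the excerpt above, this would report fuse twice and scsi_dh_alua, scsi_dh_emc and scsi_dh_rdac once each.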
What is the libbpf version? There were some bugs that were only fixed in 1.5.0, but were also later backported to 1.4.7.
Want to give https://koji.fedoraproject.org/koji/taskinfo?taskID=129948696 a try?
Created attachment 2079293 [details]
6.14.0-0.rc5.20250307git00a7d39898c8.47.fc43.x86_64+debug journal

This boots, so it is much better. I do still see two cases of "Invalid offset", but I can't see what module(s) might be causing them. I have uploaded the journal in case someone else can pick it out.

```
Mar 07 15:54:41 fedora kernel: BPF: type_id=3067 offset=0 size=1
Mar 07 15:54:41 fedora kernel: BPF:
Mar 07 15:54:41 fedora kernel: BPF: Invalid offset
Mar 07 15:54:41 fedora kernel: BPF:
```
Tested fixed in kernel-debug-6.14.0-0.rc6.49.fc42.x86_64, reported fixed in 6.14.0-0.rc5.2a520073e74f.47
Re-opening for F42 tracking.
FEDORA-2025-1b8a020e07 (kernel-6.14.0-0.rc6.49.fc42 and kernel-headers-6.14.0-0.rc6.49.fc42) has been submitted as an update to Fedora 42. https://bodhi.fedoraproject.org/updates/FEDORA-2025-1b8a020e07
FEDORA-2025-1b8a020e07 has been pushed to the Fedora 42 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2025-1b8a020e07` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2025-1b8a020e07 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
The latest update of the debug kernel boots normally.
FEDORA-2025-1b8a020e07 (kernel-6.14.0-0.rc6.49.fc42 and kernel-headers-6.14.0-0.rc6.49.fc42) has been pushed to the Fedora 42 stable repository. If problem still persists, please make note of it in this bug report.