The new GRUB update failed the CoreOS tests. None of the created images boot; they all fail while unpacking the initramfs.

```
[ 0.759308] Initramfs unpacking failed: ZSTD-compressed data is corrupt
```

Normally I wouldn't start by investigating GRUB when seeing an error message like this, but our tests are designed to test only the update in question, so GRUB was the only software that changed. With the previous version of GRUB, no tests fail.

You can download an image to experiment with at:
https://dustymabe.fedorapeople.org/fedora-coreos-43.20250219.dev.0-qemu.x86_64.qcow2.xz

Remember to decompress it first. I'll attach a full console log as well.

Reproducible: Always
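For anyone who prefers to script the decompression step, here is a minimal sketch using only the Python standard library. The helper name `decompress_xz` and the chunk size are illustrative assumptions, not part of any CoreOS tooling; running `xz -d` on the file does the same job.

```python
import lzma
import shutil
from pathlib import Path


def decompress_xz(src: Path, dst: Path, chunk_size: int = 1024 * 1024) -> int:
    """Stream-decompress an .xz file to dst and return the number of bytes written.

    Hypothetical helper for illustration; it streams in chunks so a
    multi-gigabyte disk image is never held in memory at once.
    """
    with lzma.open(src, "rb") as compressed, open(dst, "wb") as out:
        shutil.copyfileobj(compressed, out, chunk_size)
    return dst.stat().st_size
```

It could be called as `decompress_xz(Path("fedora-coreos-43.20250219.dev.0-qemu.x86_64.qcow2.xz"), Path("fedora-coreos-43.20250219.dev.0-qemu.x86_64.qcow2"))` after downloading the image.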
Created attachment 2077213: console.txt
Looks like it's failing to start systemd... could you check what the permissions on it are?

[ 1.200453] Run /init as init process
[ 1.200885] Failed to execute /init (error -26)
[ 1.201323] Run /sbin/init as init process
[ 1.201696] Run /etc/init as init process
[ 1.202098] Run /bin/init as init process
[ 1.202507] Starting init: /bin/init exists but couldn't execute it (error -26)
[ 1.203142] Run /bin/sh as init process
[ 1.203529] Starting init: /bin/sh exists but couldn't execute it (error -26)
[ 1.204123] Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
...
Hey Marta,

I initially focused on that too, but I think this is just a symptom of not being able to unpack the initramfs (look higher up in the log):

```
[ 0.759308] Initramfs unpacking failed: ZSTD-compressed data is corrupt
```

So it seems like it unpacks some of the initramfs, but not all of it? Also, if you run the test over and over, it fails in slightly different ways. For example, sometimes init runs but then fails when trying to mount sysroot.mount.

Can we focus on the initramfs unpacking problem first?
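To make the "look higher up in the log" step repeatable across runs that fail in different ways, a console log can be scanned for the earliest known failure message. This is only a sketch: the pattern list is an assumption built from the messages quoted in this bug, and the names `FAILURE_PATTERNS`, `find_failures`, and `earliest_failure` are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional

# Assumed signatures, copied from the kernel messages quoted in this bug.
# Extend the table for other failure modes (e.g. the sysroot.mount case).
FAILURE_PATTERNS = {
    "initramfs-unpack": re.compile(r"Initramfs unpacking failed: (?P<detail>.+)"),
    "init-exec": re.compile(r"Failed to execute /init \(error (?P<detail>-?\d+)\)"),
    "kernel-panic": re.compile(r"Kernel panic - not syncing: (?P<detail>.+)"),
}

# Matches a kernel log line such as "[ 0.759308] message".
TIMESTAMP = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s*(?P<msg>.*)$")


class Failure(NamedTuple):
    timestamp: float
    kind: str
    detail: str


def find_failures(log_text: str) -> List[Failure]:
    """Return every recognised failure line, ordered by kernel timestamp."""
    failures = []
    for line in log_text.splitlines():
        match = TIMESTAMP.match(line.strip())
        if not match:
            continue  # skip lines without a kernel timestamp
        for kind, pattern in FAILURE_PATTERNS.items():
            hit = pattern.search(match.group("msg"))
            if hit:
                failures.append(
                    Failure(float(match.group("ts")), kind, hit.group("detail").strip())
                )
                break
    return sorted(failures, key=lambda f: f.timestamp)


def earliest_failure(log_text: str) -> Optional[Failure]:
    """Return the first recognised failure, or None if the log has none."""
    failures = find_failures(log_text)
    return failures[0] if failures else None
```

Run against the attached console.txt, the earliest entry is the one worth chasing first; later entries such as the init errors and the panic may only be consequences of it.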
Ok, I didn't know that it fails at different times later on... have you tried rebuilding the initramfs? I downloaded your qcow and I'll try it later.
I haven't tried rebuilding the initramfs on the disk: the system won't boot, so rebuilding it in place isn't really an option (or at least it isn't easy to do).

I can confirm that this isn't an isolated build problem. It failed in CI and also on my local system. That's two different image builds (two different initramfs builds) in two different environments with the same result.

Also worth emphasizing here: the code that generates the initramfs hasn't changed, just GRUB.
Hey Dusty, There's nothing obvious in the newest GRUB that should cause this... they are all CVE fixes for OOB reads and writes (type changes), using safe math for adding, subtracting, etc. Leo was worried that one patch, which disables a bunch of filesystems during lockdown, was the culprit, but I also see your error when booting the image with SB disabled, while rpm installation with SB enabled works... How do we test this? Is there an easy / straightforward way of creating the images with a different version of GRUB?
You can build using `COSA` [1], overriding GRUB with your local development RPM build by placing the RPM files in `overrides/rpm/` [2]. Since we are building against `rawhide`, you'd want to run `cosa init --branch rawhide https://github.com/coreos/fedora-coreos-config`.

If that is too complicated, reach out to me in https://matrix.to/#/#coreos:fedoraproject.org, or I can also test out development builds for you.

[1] https://github.com/coreos/coreos-assembler/blob/main/docs/building-fcos.md#downloading-the-container
[2] https://github.com/coreos/coreos-assembler/blob/main/docs/working.md#using-overrides
Thank you for the instructions.

A new patch appeared upstream a couple of days ago, and it appears to fix this issue:
https://lists.gnu.org/archive/html/grub-devel/2025-02/msg00115.html

Nicolas built a scratch build for rawhide. It works for me: no kernel panic, and the machine boots. Please try it yourself, just so we are sure. If it's OK, it can probably land in rawhide tomorrow.
https://koji.fedoraproject.org/koji/taskinfo?taskID=129608711
The build does appear to pass all tests except the Secure Boot tests, which I assume is expected because it is a scratch build. ✔️
Yes, that's right. We noticed you tagged in 2.12-23, which means things are working for you now..? You will get a new build soon :)
Fixed in grub2-2.12-25.fc43