Bug 2181825

Summary: CONFIG_EFI_ZBOOT kernel unbootable by grub2
Product: [Fedora] Fedora
Reporter: Laszlo Ersek <lersek>
Component: grub2
Assignee: Robbie Harwood <rharwood>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: unspecified
Version: 36
CC: fmartine, jforbes, kraxel, lkundrak, nfrayer, pgnet.dev, pjones, rharwood
Target Milestone: ---
Target Release: ---
Hardware: aarch64
OS: Unspecified
Whiteboard:
Fixed In Version: grub2-2.06-62.fc36
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-03-31 02:42:31 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Laszlo Ersek 2023-03-26 10:47:23 UTC
*** Description of problem:

After a relatively long time, I've updated my aarch64 Fedora install
today. The highest installed kernel version jumped from
6.0.12-200.fc36.aarch64 to 6.2.7-100.fc36.aarch64. The grub2 version
number changed from 2.06-57.fc36.aarch64 to 2.06-61.fc36.aarch64. The
new grub version is unable to boot the new kernel version. The new grub
version is capable of booting the old/previous kernel version.

*** Version-Release number of selected component (if applicable):

kernel-6.2.7-100.fc36.aarch64
grub2-efi-aa64-2.06-61.fc36.aarch64

*** How reproducible:

Always.

*** Steps to Reproduce:
1. (dnf update)
2. (reboot)
3. select "Fedora Linux (6.2.7-100.fc36.aarch64) 36 (Server Edition)" in
   the grub2 menu

*** Actual results:

grub2 prints:

> error: ../../grub-core/loader/arm64/linux.c:59:invalid magic number.
> error: ../../grub-core/loader/arm64/linux.c:283:you need to load the
> kernel first.

*** Expected results:

The kernel should boot.

*** Additional info:

- This symptom is presumably the consequence of: (a) the Fedora 36
  kernel enabling CONFIG_EFI_ZBOOT in dist-git commit c24896d82884
  ("kernel-6.2.6-100", 2023-03-13), and (b) Fedora 36 grub2 missing
  upstream commit 69edb3120560 ("loader/arm64/linux: Remove magic number
  header field check", 2022-08-19).

- While browsing the Fedora kernel dist-git (and the %changelog in the
  spec file), I've found two related entries:

> * Wed Feb 15 2023 Fedora Kernel Team <kernel-team> [6.2.0-0.rc8.e1c04510f521.58]
> - aarch64: enable zboot (Gerd Hoffmann)

  and chronologically earlier:

> * Tue Dec 13 2022 Fedora Kernel Team <kernel-team> [6.2.0-0.rc0.764822972d64.1]
> - Turn off CONFIG_EFI_ZBOOT as it makes CKI choke (Justin M. Forbes)

- Regarding the patch that enabled EFI_ZBOOT for aarch64, I couldn't
  find an RHBZ, but searching the web leads me to:

  [OS-BUILD PATCH] aarch64: enable zboot
  https://www.spinics.net/linux/fedora/fedora-kernel/msg16860.html

  [OS-BUILD PATCHv2] aarch64: enable zboot
  https://www.spinics.net/linux/fedora/fedora-kernel/msg16862.html

  [OS-BUILD PATCHv3] aarch64: enable zboot
  https://www.spinics.net/linux/fedora/fedora-kernel/msg16875.html

  and then to
  <https://gitlab.com/cki-project/kernel-ark/-/merge_requests/2283>, and
  then to <https://github.com/systemd/systemd/issues/23788>.

  So it looks like the new kernel is supposed to work with BLS /
  sd-boot; it does not work with grub2 however, at the moment. I don't
  see downstream (F36) grub2 being covered / verified in the above
  discussions.

- The problem is relatively serious because it prevents a system-upgrade
  to Fedora 37 -- "dnf system-upgrade" requires the system to be fully
  up-to-date as a baseline.
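
A minimal sketch of the header difference that presumably trips the
"invalid magic number" check from the first bullet above. It is not
grub2's code. The offsets are assumptions based on the documented arm64
Image header ("ARM\x64" at byte offset 56) and the EFI zboot header
("MZ" at offset 0, "zimg" at offset 4); the function name and the
returned labels are hypothetical.

```python
# Assumed layout, not taken from this bug report:
#   plain arm64 Image: 4-byte magic "ARM\x64" at byte offset 56
#   EFI zboot image:   "MZ" at offset 0 and the tag "zimg" at offset 4
ARM64_MAGIC = b"ARM\x64"
ZBOOT_TAG = b"zimg"


def classify_kernel_header(header: bytes) -> str:
    """Guess the kernel image type from its first 64 bytes."""
    if len(header) < 64:
        return "too short"
    # A plain arm64 Image with the EFI stub also starts with "MZ",
    # so the arm64 magic has to be tested first.
    if header[56:60] == ARM64_MAGIC:
        return "arm64 Image"
    if header[0:2] == b"MZ" and header[4:8] == ZBOOT_TAG:
        return "EFI zboot"
    if header[0:2] == b"MZ":
        return "other PE/COFF"
    return "unknown"
```

Feeding it the first 64 bytes of a kernel file under /boot (read in
binary mode) would show which kind of image a given kernel package
ships. A loader that insists on the arm64 magic at offset 56 would
reject anything this sketch labels "EFI zboot".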

Comment 1 Gerd Hoffmann 2023-03-27 08:44:51 UTC
> kernel-6.2.7-100.fc36.aarch64
> grub2-efi-aa64-2.06-61.fc36.aarch64

>   So it looks like the new kernel is supposed to work with BLS /
>   sd-boot; it does not work with grub2 however, at the moment. I don't
>   see downstream (F36) grub2 being covered / verified in the above
>   discussions.

It works fine for me with grub2, although I've tested on F37 only,
which has right now:
 - grub2-efi-aa64-2.06-89.fc37.aarch64
 - kernel-core-6.2.7-200.fc37.aarch64

Given grub2 is a standalone binary, pulling the F37 version should work
(dnf --releasever=37 upgrade grub2-efi-aa64).

Looking at the rpm changelog there is no obvious candidate
between -61 and -89 which could have fixed this.

I'm not fully sure whether F36 and F37 use the same numbering
though. There is an older entry which possibly refers to zboot:

* Thu Sep 08 2022 Robbie Harwood <rharwood> - 2.06-57
- aa64: support pe/coff decompressor

Comment 2 Laszlo Ersek 2023-03-27 09:41:44 UTC
(In reply to Gerd Hoffmann from comment #1)

> Looking at the rpm changelog there is no obvious candidate
> between -61 and -89 which could have fixed this.
> 
> I'm not fully sure whether F36 and F37 use the same numbering
> though.

Comparing these dist-git branches: they diverge after commit 714559fb3d58 ("Handle ostree's non-writable /etc/kernel", 2022-08-17); the last common Release field is -52. From -53 onward, the same Release number means different things on the two branches.

> There is an older entry which possibly refers to zboot:
> 
> * Thu Sep 08 2022 Robbie Harwood <rharwood> - 2.06-57
> - aa64: support pe/coff decompressor

Right, this is specific to the f37 branch, and it corresponds to dist-git commit c50cc54b88d3 ("aa64: support pe/coff decompressor", 2022-09-08). And that patch cherry-picks precisely the upstream commit I mention above (69edb3120560, "loader/arm64/linux: Remove magic number header field check", 2022-08-19). This dist-git commit in fact leads us to bug 2125069.

I figure at the time of <https://bugzilla.redhat.com/show_bug.cgi?id=2125069#c2>, "rawhide" meant what was going to be branched as fedora 37, and the upstream patch had not been ported to f36.
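
To illustrate the point about Release numbers, here is a small sketch that splits the package strings quoted in this bug into name, version, release and arch, and treats two builds as comparable by Release number only when their dist tags match. The helper names are hypothetical, and "same dist tag means same branch" is a simplifying assumption for this example, not a packaging rule.

```python
def parse_nvra(nvra: str):
    """Split 'name-version-release.arch' into its four parts."""
    rest, arch = nvra.rsplit(".", 1)
    name, version, release = rest.rsplit("-", 2)
    return name, version, release, arch


def dist_tag(nvra: str) -> str:
    """Return the dist tag (e.g. 'fc36') from the release field, or ''."""
    release = parse_nvra(nvra)[2]
    return release.split(".", 1)[1] if "." in release else ""


def releases_comparable(a: str, b: str) -> bool:
    # Assumption: builds from different dist-git branches carry different
    # dist tags, and only builds from the same branch share a changelog.
    return dist_tag(a) == dist_tag(b)
```

With the strings from this report, grub2-efi-aa64-2.06-61.fc36.aarch64 and grub2-efi-aa64-2.06-89.fc37.aarch64 carry different dist tags, so the "-57" entry in the f37 changelog says nothing about what the f36 "-57" build contains.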

Comment 3 Robbie Harwood 2023-03-27 17:50:26 UTC
It would have been nice to know that the kernel change was going to be backported before it was done, especially because there's only a month and change left in the fc36 lifetime.

Comment 4 Fedora Update System 2023-03-27 17:57:10 UTC
FEDORA-2023-f88489c118 has been submitted as an update to Fedora 36. https://bodhi.fedoraproject.org/updates/FEDORA-2023-f88489c118

Comment 5 Fedora Update System 2023-03-28 04:08:56 UTC
FEDORA-2023-f88489c118 has been pushed to the Fedora 36 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-f88489c118`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-f88489c118

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 6 Laszlo Ersek 2023-03-28 10:05:00 UTC
Thank you Robbie for the quick fix; the 6.2.7-100.fc36.aarch64 kernel can be booted fine by grub2 2.06-62.fc36.

Comment 7 Fedora Update System 2023-03-31 02:42:31 UTC
FEDORA-2023-f88489c118 has been pushed to the Fedora 36 stable repository.
If the problem still persists, please make note of it in this bug report.