Bug 2167404 - Mt. Snow Ampere Altra fails to boot rawhide installer
Summary: Mt. Snow Ampere Altra fails to boot rawhide installer
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: aarch64
OS: Linux
unspecified
unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: ARMTracker
 
Reported: 2023-02-06 14:40 UTC by Jakub Čajka
Modified: 2023-03-06 19:47 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed:
Type: Bug
Embargoed:


Attachments
Console log of the failed boot (163.66 KB, text/plain)
2023-02-06 14:40 UTC, Jakub Čajka

Description Jakub Čajka 2023-02-06 14:40:20 UTC
Created attachment 1942540 [details]
Console log of the failed boot

1. Please describe the problem:
While PXE booting the rawhide installer, the system (Ampere Altra) freezes. Because the console log contains a kernel backtrace, I am starting with kernel as the component.

2. What is the Version-Release number of the kernel:
6.2.0-0.rc6.20230203git66a87fff1a87.47.fc38.aarch64

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :
Yes, both the Fedora 36 and Fedora 37 installers boot into anaconda successfully.

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:
Yes, always.
PXE boot the latest rawhide installer on a UEFI aarch64 system (possibly any such system; Mt. Snow in my case).

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:
N/A

6. Are you running any modules that are not shipped directly with Fedora's kernel?:
No

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

Comment 1 Jeremy Linton 2023-03-06 19:42:03 UTC
So, given this:

[    8.785307] ------------[ cut here ]------------
[    8.788686] NetLabel:  domain hash size = 128
[    8.793286] WARNING: CPU: 2 PID: 430 at drivers/firmware/efi/runtime-wrappers.c:113 efi_call_virt_check_flags+0x48/0xb0
[    8.797629] NetLabel:  protocols = UNLABELED CIPSOv4 CALIPSO
[    8.808396] Modules linked in:
[    8.814055] NetLabel:  unlabeled traffic allowed by default
[    8.817084] CPU: 2 PID: 430 Comm: kworker/u160:2 Tainted: G          I       -------  ---  6.2.0-0.rc6.20230203git66a87fff1a87.47.fc38.aarch64 #1
[    8.822646] mctp: management component transport protocol core
[    8.835665] Hardware name: GIGABYTE R152-P31-00/MP32-AR1-00, BIOS F31h (SCP: 2.10.20220531) 07/27/2022
[    8.835666] Workqueue: efi_rts_wq efi_call_rts
[    8.841485] NET: Registered PF_MCTP protocol family
[    8.850775] 
[    8.850776] pstate: 000000c9 (nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    8.855273] pci 0004:02:00.0: vgaarb: setting as boot VGA device
[    8.860069] pc : efi_call_virt_check_flags+0x48/0xb0
[    8.861548] pci 0004:02:00.0: vgaarb: bridge control possible
[    8.868495] lr : efi_call_rts+0x3b0/0x4c0
[    8.874488] pci 0004:02:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[    8.879438] sp : ffff800008333d20
[    8.885172] vgaarb: loaded
[    8.889166] x29: ffff800008333d20 x28: 0000000000000000 x27: 0000000000000000
[    8.910618] x26: 0000000000000000 x25: ffffb93e70aa8a30 x24: ffff80000832bd88
[    8.917740] x23: ffff80000832bd40 x22: ffffb93e6f5b2850 x21: 00000000000000c0
[    8.924863] x20: ffffb93e6f5b2850 x19: 0000000000000000 x18: 0000000000000008
[    8.931985] x17: 3030303530303030 x16: ffff80000a3ac000 x15: 0000020000000000
[    8.939107] x14: 0000000000000000 x13: 0000000000000010 x12: 0101010101010101
[    8.946230] x11: 7f7f7f7f7f7f7f7f x10: fefefefefeff7076 x9 : ffffb93e6ea2f920
[    8.953352] x8 : 00000000f7e40810 x7 : ffff80000a3abe80 x6 : ffff80000a3abf20
[    8.960474] x5 : ffff80000a3abe78 x4 : 00000000f7e71408 x3 : 00000000f7eb45d8
[    8.967596] x2 : 0000000001000000 x1 : ffffb93e6f5b2850 x0 : 00000000000000c0
[    8.974719] Call trace:
[    8.977152]  efi_call_virt_check_flags+0x48/0xb0
[    8.981757]  efi_call_rts+0x3b0/0x4c0
[    8.985407]  process_one_work+0x1e8/0x480
[    8.989405]  worker_thread+0x74/0x410
[    8.993054]  kthread+0xe8/0xf4
[    8.996096]  ret_from_fork+0x10/0x20
[    8.999659] ---[ end trace 0000000000000000 ]---
[    9.004263] Disabling lock debugging due to kernel taint
[    9.009560] efi: [Firmware Bug]: IRQ flags corrupted (0x00000000=>0x000000c0) by EFI set_variable
[    9.018458] ------------[ cut here ]------------

It's probably caused by the recent SetVirtualAddressMap changes. I would make sure you have the latest firmware, and then check whether this commit is in the boot kernel:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=190233164cd77115f8dea718cbac561f557092c6
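
For reference, the two flag values in the "IRQ flags corrupted" message in the log above can be decoded by hand. The following is a minimal Python sketch, assuming the arm64 PSTATE interrupt-mask bits (D=0x200, A=0x100, I=0x80, F=0x40). The names `DAIF_BITS`, `masked`, and `decode` are made up for illustration and are not part of any kernel tooling.

```python
import re

# arm64 PSTATE interrupt-mask bits (DAIF). A set bit means that
# exception class is masked.
DAIF_BITS = {0x200: "D", 0x100: "A", 0x080: "I", 0x040: "F"}

# Matches the message printed by efi_call_virt_check_flags():
#   "IRQ flags corrupted (0xBEFORE=>0xAFTER) by EFI <call>"
PATTERN = re.compile(
    r"IRQ flags corrupted \((0x[0-9a-fA-F]+)=>(0x[0-9a-fA-F]+)\) by EFI (\w+)"
)

def masked(flags):
    """Return the names of the DAIF bits that are set in flags."""
    return [name for bit, name in DAIF_BITS.items() if flags & bit]

def decode(line):
    """Decode one log line; return None if it is not the firmware-bug message."""
    m = PATTERN.search(line)
    if m is None:
        return None
    before, after = int(m.group(1), 16), int(m.group(2), 16)
    return {"call": m.group(3), "before": masked(before), "after": masked(after)}

line = ("[    9.009560] efi: [Firmware Bug]: IRQ flags corrupted "
        "(0x00000000=>0x000000c0) by EFI set_variable")
print(decode(line))
# {'call': 'set_variable', 'before': [], 'after': ['I', 'F']}
```

Under that assumption, 0xc0 means the set_variable runtime call returned with IRQs and FIQs masked although nothing was masked on entry, which agrees with the "daIF" shown on the pstate line of the warning.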

Comment 2 Jeremy Linton 2023-03-06 19:42:47 UTC
The other thing to check is that the console hasn't moved elsewhere (e.g. to a virtual serial port), since the system appears to be mostly booting.

Comment 3 Jeremy Linton 2023-03-06 19:47:35 UTC
Yeah, so I suspect you need 6.2-rc8+ for that tweak. I'm a bit surprised that your rawhide image has such an old kernel, though; you might look for a newer compose. The versions on my rawhide machine have all moved to fc39.
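
To compare a kernel version string (for example the output of `uname -r`) against the 6.2-rc8 threshold suggested above, something like the following Python sketch could be used. It assumes Fedora's convention of marking pre-release kernels with a release field that starts with `0.rcN.`. The function names `parse_fedora_kernel` and `at_least_rc` are hypothetical, and whether rc8 is really the first kernel carrying the fix is the suspicion stated above, not something this check verifies.

```python
import re

def parse_fedora_kernel(nvr):
    """Return (major, minor, patch, rc); rc is None for a final release.

    Assumes the Fedora pre-release form "X.Y.Z-0.rcN.<...>".
    """
    m = re.match(r"(\d+)\.(\d+)\.(\d+)-(?:0\.rc(\d+)\.)?", nvr)
    if m is None:
        raise ValueError(f"unrecognised kernel version: {nvr!r}")
    major, minor, patch = (int(m.group(i)) for i in (1, 2, 3))
    rc = int(m.group(4)) if m.group(4) else None
    return major, minor, patch, rc

def at_least_rc(nvr, major, minor, rc):
    """True if nvr is major.minor-rc<rc> or newer."""
    vmaj, vmin, _patch, vrc = parse_fedora_kernel(nvr)
    if (vmaj, vmin) != (major, minor):
        return (vmaj, vmin) > (major, minor)
    # Same major.minor: a final release (no rc tag) is newer than any rc.
    return vrc is None or vrc >= rc

# The kernel from this report is rc6, so it is older than 6.2-rc8.
reported = "6.2.0-0.rc6.20230203git66a87fff1a87.47.fc38.aarch64"
print(at_least_rc(reported, 6, 2, 8))  # False
```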

