Bug 2290441

Summary: UEFI PXE boot fails with "error: ../../grub-core/net/net.c:1801:timeout reading"
Product: [Fedora] Fedora
Component: edk2
Version: 40
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Reporter: Yasmin de Souza <ydesouza>
Assignee: Paolo Bonzini <pbonzini>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: berrange, crobinso, jlebon, kraxel, pbonzini, travier, virt-maint
Fixed In Version: edk2-20240524-3.fc40 edk2-20240524-2.fc39
Last Closed: 2024-06-11 01:50:10 UTC
Type: Bug

Description Yasmin de Souza 2024-06-04 19:55:54 UTC
Description of problem:

We are facing an issue running the FCOS pipeline across all streams. The failure happens during the test ISO phase: the FCOS PXE 4K UEFI tests started to fail after edk2-ovmf was updated to 20240524-1.fc40.

Version-Release number of selected component (if applicable):
20240524-1.fc40

LOGS:

[2024-06-03T22:53:15.279Z] Running test: pxe-offline-install.4k.uefi
[2024-06-03T23:03:21.890Z] FAIL: pxe-offline-install.4k.uefi (10m0.005s)
[2024-06-03T23:03:21.890Z]     timed out after 10m0s
[2024-06-03T23:03:21.890Z] Running test: pxe-online-install.bios
[2024-06-03T23:04:43.214Z] PASS: pxe-online-install.bios (1m29.289s)
[2024-06-03T23:04:43.214Z] Running test: pxe-online-install.4k.uefi
[2024-06-03T23:14:49.808Z] FAIL: pxe-online-install.4k.uefi (10m0.001s)
[2024-06-03T23:14:49.808Z]     timed out after 10m0s
[2024-06-03T23:14:49.808Z] Error: harness: test suite failed
[2024-06-03T23:14:49.808Z] 2024-06-03T23:14:43Z cli: harness: test suite failed
[2024-06-03T23:14:49.808Z] failed to execute cmd-kola: exit status 1

Loading kernel
error: ../../grub-core/net/net.c:1801:timeout reading
`/fedora-coreos-40.20240603.dev.0-live-kernel-x86_64'.

Comment 1 Jonathan Lebon 2024-06-04 20:06:56 UTC
Re-targeted against Fedora 40.

For more context, the Fedora CoreOS CI runs PXE UEFI installation tests and we've found that the recent EDK2 update (edk2-ovmf-20240524-1.fc40) is causing boot failures in the VMs, apparently timing out trying to read the kernel. Full logs:

```
BdsDxe: failed to load Boot0001 "UEFI Misc Device" from PciRoot(0x0)/Pci(0x4,0x0): Not Found

>>Start PXE over IPv4.
  Station IP address is 192.168.76.9

  Server IP address is 192.168.76.2
  NBP filename is /boot/grub2/grubx64.efi
  NBP filesize is 3972416 Bytes
 Downloading NBP file...

  NBP file downloaded successfully.
BdsDxe: loading Boot0002 "UEFI PXEv4 (MAC:525400123456)" from PciRoot(0x0)/Pci(0x3,0x0)/MAC(525400123456,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)
BdsDxe: starting Boot0002 "UEFI PXEv4 (MAC:525400123456)" from PciRoot(0x0)/Pci(0x3,0x0)/MAC(525400123456,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)
GRUB version 2.06

Use the ▲ and ▼ keys to select which entry is highlighted.
Press enter to boot the selected OS, `e' to edit the commands
before booting or `c' for a command-line.

*CoreOS (BIOS/UEFI)

The highlighted entry will be executed automatically in 1s.
The highlighted entry will be executed automatically in 0s.
Booting `CoreOS (BIOS/UEFI)'

Loading kernel
error: ../../grub-core/net/net.c:1801:timeout reading
`/fedora-coreos-40.20240604.dev.0-live-kernel-x86_64'.
Loading initrd
error: ../../grub-core/loader/i386/efi/linux.c:258:you need to load the kernel
first.

Press any key to continue...
```

Compare to logs with edk2-ovmf-20240214-7.fc40:

```
BdsDxe: failed to load Boot0001 "UEFI Misc Device" from PciRoot(0x0)/Pci(0x4,0x0): Not Found

>>Start PXE over IPv4.
  Station IP address is 192.168.76.9

  Server IP address is 192.168.76.2
  NBP filename is /boot/grub2/grubx64.efi
  NBP filesize is 3972416 Bytes
 Downloading NBP file...

  NBP file downloaded successfully.
BdsDxe: loading Boot0002 "UEFI PXEv4 (MAC:525400123456)" from PciRoot(0x0)/Pci(0x3,0x0)/MAC(525400123456,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)
BdsDxe: starting Boot0002 "UEFI PXEv4 (MAC:525400123456)" from PciRoot(0x0)/Pci(0x3,0x0)/MAC(525400123456,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)
GRUB version 2.06

Use the ▲ and ▼ keys to select which entry is highlighted.
Press enter to boot the selected OS, `e' to edit the commands
before booting or `c' for a command-line.

*CoreOS (BIOS/UEFI)

The highlighted entry will be executed automatically in 1s.
The highlighted entry will be executed automatically in 0s.
Booting `CoreOS (BIOS/UEFI)'

Loading kernel
Loading initrd
...
```

GRUB binaries are from grub2-2.06-123.fc40.noarch

We can provide more details if you'd like to reproduce our setup locally.
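
For anyone comparing console captures from the two firmware versions, the GRUB failures can be pulled out mechanically. This is only a minimal sketch: the helper name `grub_errors` is hypothetical, and it assumes the capture marks carriage returns either as real CR bytes or as a literal `^M` prefix.

```shell
# grub_errors: read a console capture on stdin and print GRUB "error:" lines.
# Carriage returns, whether real CR bytes or a literal "^M" prefix,
# are stripped first so the match works on both forms.
grub_errors() {
    tr -d '\r' | sed 's/^\^M//' | grep '^error: '
}
```

A capture from a good boot prints nothing and the function returns non-zero, so it can also serve as a pass/fail check, for example `grub_errors < console.txt` (the file name is a placeholder).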

Comment 2 Gerd Hoffmann 2024-06-05 07:42:12 UTC
Probably the same issue as bug 2290388.

So loading grub.efi works.  grub.efi apparently also manages to load grub.cfg.  Fetching the kernel fails though.
How do you load the kernel?  tftp?  http?

Comment 3 Gerd Hoffmann 2024-06-05 09:10:49 UTC
Hmm, doesn't reproduce here.  I do see a significant slowdown though, time needed to load an f40 kernel via tftp goes up from ~20 to more than 40 seconds.

Comment 4 Gerd Hoffmann 2024-06-05 09:20:54 UTC
Noteworthy detail: There are a number of files grub tries to fetch, but the set of files is different on each boot (independent from the firmware version):

# journalctl -b0 -u tftp.service | grep RRQ | sed -e 's/.*RRQ//' | sort | uniq -c 
     20  from 192.168.105.158 filename /arch-x86_64/grub.cfg
      2  from 192.168.105.158 filename /arch-x86_64/grub.cfg-01-52-54-00-12-34-56
      2  from 192.168.105.158 filename /arch-x86_64/grub.cfg-C
      2  from 192.168.105.158 filename /arch-x86_64/grub.cfg-C0A
      1  from 192.168.105.158 filename /arch-x86_64/grub.cfg-C0A8
      1  from 192.168.105.158 filename /arch-x86_64/grub.cfg-C0A86
      2  from 192.168.105.158 filename /arch-x86_64/grub.cfg-C0A869
      2  from 192.168.105.158 filename /arch-x86_64/grub.cfg-C0A8699
      1  from 192.168.105.158 filename /arch-x86_64/grub.cfg-C0A8699E
     20  from 192.168.105.158 filename /arch-x86_64/grubx64.efi
      2  from 192.168.105.158 filename /EFI/fedora/x86_64-efi/command.lst
      4  from 192.168.105.158 filename /EFI/fedora/x86_64-efi/fs.lst
      1  from 192.168.105.158 filename /EFI/fedora/x86_64-efi/terminal.lst
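
To compare which files were requested on different boots, the same log lines can be reduced to a sorted list of unique filenames. A minimal sketch, assuming the tftp server logs each read request as `RRQ from <ip> filename <path>` (which matches the output above); `rrq_files` is a hypothetical helper name.

```shell
# rrq_files: read tftp server log lines on stdin and print each requested
# filename once, sorted. Lines without a read request (RRQ) are ignored.
rrq_files() {
    sed -n 's/.*RRQ from [^ ]* filename \([^ ]*\).*/\1/p' | LC_ALL=C sort -u
}
```

Something like `journalctl -b0 -u tftp.service | rrq_files > boot1.txt`, repeated after the next VM boot into a second file, gives two lists that `comm -3` can diff. The output file names here are placeholders.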

Comment 5 Gerd Hoffmann 2024-06-05 10:33:49 UTC
(In reply to Gerd Hoffmann from comment #3)
> Hmm, doesn't reproduce here.  I do see a significant slowdown though, time
> needed to load an f40 kernel via tftp goes up from ~20 to more than 40
> seconds.

Bisect lands at https://github.com/tianocore/edk2/commit/1904a64
Checking the git log finds https://github.com/tianocore/edk2/commit/ced13b9
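
For reference, a bisect like this can be automated with `git bisect run`. This is a rough sketch, not necessarily how the bisect above was done: `bisect_between` is a hypothetical helper, and it assumes a clean local clone plus a check command that exits 0 on a good build and non-zero on a bad (slow) one.

```shell
# bisect_between <repo> <good-ref> <bad-ref> <check-cmd...>
# Runs an automated bisect and prints the hash of the first bad commit.
# The check command is run from the top of the repository at each step.
bisect_between() {
    (
        repo=$1 good=$2 bad=$3
        shift 3
        cd "$repo" || exit 1
        git bisect start "$bad" "$good" >/dev/null || exit 1
        git bisect run "$@" >/dev/null 2>&1
        # Once the run converges, refs/bisect/bad names the first bad commit.
        git rev-parse refs/bisect/bad
        git bisect reset >/dev/null 2>&1
    )
}
```

Usage would look like `bisect_between ~/src/edk2 edk2-stable202402 edk2-stable202405 ./check-tftp-speed.sh`. The clone path and the script are placeholders, and the two tags are only assumed to bracket the regression.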

Comment 7 Fedora Update System 2024-06-05 12:17:12 UTC
FEDORA-2024-3446df5831 (edk2-20240524-3.fc40) has been submitted as an update to Fedora 40.
https://bodhi.fedoraproject.org/updates/FEDORA-2024-3446df5831

Comment 8 Fedora Update System 2024-06-05 13:30:30 UTC
FEDORA-2024-773ea76c63 (edk2-20240524-2.fc39) has been submitted as an update to Fedora 39.
https://bodhi.fedoraproject.org/updates/FEDORA-2024-773ea76c63

Comment 9 Fedora Update System 2024-06-06 02:12:54 UTC
FEDORA-2024-3446df5831 has been pushed to the Fedora 40 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-3446df5831`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-3446df5831

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 10 Fedora Update System 2024-06-06 03:09:18 UTC
FEDORA-2024-773ea76c63 has been pushed to the Fedora 39 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-773ea76c63`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-773ea76c63

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 11 Fedora Update System 2024-06-11 01:50:10 UTC
FEDORA-2024-3446df5831 (edk2-20240524-3.fc40) has been pushed to the Fedora 40 stable repository.
If the problem still persists, please make a note of it in this bug report.

Comment 12 Fedora Update System 2024-06-21 01:15:29 UTC
FEDORA-2024-773ea76c63 (edk2-20240524-2.fc39) has been pushed to the Fedora 39 stable repository.
If the problem still persists, please make a note of it in this bug report.
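
To confirm that a host already carries a fixed build, the installed version-release can be compared against the "Fixed In Version" value. A minimal sketch: `ver_ge` is a hypothetical helper, and GNU `sort -V` is used only as an approximation of RPM version ordering, not rpm's actual comparison algorithm.

```shell
# ver_ge A B: succeed when version-release string A sorts at or after B
# under GNU sort's version ordering (an approximation of rpm's rules).
ver_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}
```

On Fedora 40 this could be used as `ver_ge "$(rpm -q --qf '%{VERSION}-%{RELEASE}' edk2-ovmf)" 20240524-3.fc40 && echo fixed`.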